Glossary

Representation

An internal encoding of input data that downstream layers or tasks use instead of raw features.

What It Is

Representations are tensors produced inside the model: hidden states, latent codes, or pooled vectors used for classification, generation, or retrieval. They sit between raw modality inputs and task heads. Modules at different depths may read a representation and write an updated one, so the same input can have a different representation at each layer.

Why It Matters

Using the term precisely helps you compare transfer learning, contrastive training, and generative latent spaces without conflating them with tokenizer outputs or public embedding products.

Simple Example

After a transformer block processes token embeddings, the hidden state at each position is a representation. A sentence classifier might pool those states into a single vector; a generative model passes them to the next block and, after the final block, to the output head.
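The example above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real transformer: `toy_block` is a hypothetical stand-in that mixes positions by adding the mean state, and `mean_pool` is one assumed pooling choice among several a classifier might use.

```python
def mean_pool(states):
    """Average the per-position states into one fixed-size vector."""
    dim = len(states[0])
    return [sum(s[d] for s in states) / len(states) for d in range(dim)]


def toy_block(states):
    """Hypothetical stand-in for a transformer block (an assumption for
    illustration): each position's new state is its old state plus the
    mean over all positions."""
    context = mean_pool(states)
    return [[s[d] + context[d] for d in range(len(s))] for s in states]


# Three token embeddings of dimension 2 (made-up values).
token_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# One representation per position: what a generative model would pass
# on to the next block.
hidden_states = toy_block(token_embeddings)

# One representation for the whole sequence: what a sentence classifier
# might feed to its task head.
pooled = mean_pool(hidden_states)
```

Both `hidden_states` and `pooled` are representations; they differ only in whether they keep one vector per position or one vector per sequence.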

Common Confusions

Representation versus embedding: an embedding often means a lookup from discrete IDs to vectors at the input boundary, while representation is the broader term for internal encodings at any depth. The output of an embedding table is not necessarily the final representation used for downstream tasks. Representation is also not modality: modality names the data channel, while representation names the encoded form inside the network.
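The distinction between an embedding lookup and representations at depth can be sketched as follows. Everything here is assumed for illustration: the `embedding_table` values are made up, and `toy_layer` is a hypothetical layer, not any particular architecture.

```python
# Hypothetical embedding table: discrete token ID -> vector (made-up values).
embedding_table = {0: [0.1, 0.2], 1: [0.3, -0.1], 2: [0.0, 0.5]}


def embed(token_ids):
    """Embedding in the narrow sense: a lookup at the input boundary."""
    return [list(embedding_table[i]) for i in token_ids]


def toy_layer(states):
    """Hypothetical layer (an assumption): halve each state and add the
    mean over positions, so every position sees some context."""
    dim = len(states[0])
    mean = [sum(s[d] for s in states) / len(states) for d in range(dim)]
    return [[0.5 * s[d] + mean[d] for d in range(dim)] for s in states]


def encode(token_ids, num_layers=2):
    """Return the representation at every depth, starting with the
    embedding output at depth 0."""
    states = embed(token_ids)
    per_depth = [states]
    for _ in range(num_layers):
        states = toy_layer(states)
        per_depth.append(states)
    return per_depth


token_ids = [0, 2, 1]
per_depth = encode(token_ids)
embedding_output = per_depth[0]       # depth 0: the table lookup
final_representation = per_depth[-1]  # what a task head would consume
```

In this sketch the embedding output is only the first entry in `per_depth`; a downstream task would typically read a later entry.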
