Architecture

The blueprint that defines how a model's layers and modules connect and how data flows through them during inference.

What It Is

An architecture specifies layer types, connectivity, and the order of computation. Transformers, U-Net diffusion stacks, and mixture-of-experts layouts are architectures. Many model checkpoints can instantiate the same architecture at different scales.
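One way to picture the split between an architecture and the checkpoints that instantiate it is the minimal sketch below. The names `ArchitectureSpec` and `Checkpoint` are hypothetical, invented for illustration, and the layer counts and widths are made-up values rather than the sizes of any real model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchitectureSpec:
    """Structure only: which module types exist and in what order."""
    name: str
    block_pattern: tuple  # module types inside one repeated block


@dataclass(frozen=True)
class Checkpoint:
    """One instance of an architecture at a concrete scale."""
    arch: ArchitectureSpec
    n_layers: int
    d_model: int


decoder_only = ArchitectureSpec("decoder-only transformer", ("attention", "ffn"))

# Hypothetical sizes, chosen only to show two scales of one architecture.
small = Checkpoint(decoder_only, n_layers=12, d_model=768)
large = Checkpoint(decoder_only, n_layers=48, d_model=1600)

same_architecture = small.arch == large.arch  # both point at one blueprint
same_model = small == large                   # the scales differ
```

Here `same_architecture` is true while `same_model` is false: scale belongs to the checkpoint, not to the blueprint.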

Why It Matters

Knowing the architecture tells you which modules to expect, how memory scales with sequence length, and which family pages in this atlas apply before you open a specific checkpoint.
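As a rough illustration of how memory depends on sequence length, the sketch below assumes a transformer that keeps a key/value cache and, in the naive case, materializes a full attention score matrix. The function names and the configuration values are hypothetical, and real implementations differ (for example, fused attention kernels can avoid storing the full score matrix).

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes for cached keys and values of one sequence (the factor 2 is K plus V)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def attention_score_bytes(n_heads, seq_len, bytes_per_elem=2):
    """Bytes for one layer's seq_len x seq_len score matrices, one per head."""
    return n_heads * seq_len * seq_len * bytes_per_elem


# Hypothetical configuration, not taken from any specific model.
cfg = dict(n_layers=4, n_kv_heads=8, head_dim=64)

short_cache = kv_cache_bytes(seq_len=1024, **cfg)
long_cache = kv_cache_bytes(seq_len=2048, **cfg)

short_scores = attention_score_bytes(n_heads=8, seq_len=1024)
long_scores = attention_score_bytes(n_heads=8, seq_len=2048)
```

Under these assumptions, doubling the sequence length doubles the cache (linear growth) and quadruples the naive score matrix (quadratic growth).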

Simple Example

A decoder-only transformer architecture stacks token embeddings, repeated attention-plus-FFN blocks, and a final language-model head. GPT-2, Llama, and many chat models share that shape even when widths and depths differ.
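The stacking described above can be sketched as an ordered list of module names. This is a structural illustration only: `decoder_only_layout` and `shape_of` are hypothetical helpers, and the sketch leaves out normalization layers, residual connections, and positional encodings.

```python
def decoder_only_layout(n_layers):
    """Return module names in the order a decoder-only transformer applies them."""
    modules = ["token_embedding"]
    for i in range(n_layers):
        modules.append(f"block_{i}.attention")
        modules.append(f"block_{i}.ffn")
    modules.append("lm_head")
    return modules


def shape_of(layout):
    """Drop block indices, keeping each kind of module in order of first appearance."""
    kinds = []
    for name in layout:
        kind = name.split(".")[-1]
        if kind not in kinds:
            kinds.append(kind)
    return kinds


shallow = decoder_only_layout(2)
deep = decoder_only_layout(6)

same_shape = shape_of(shallow) == shape_of(deep)
```

The two layouts differ in length but reduce to the same sequence of module kinds, which is the sense in which models of different depth share one shape.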

Common Confusions

Architecture is broader than a single module: attention is a module inside many architectures. It is also narrower than "model": you can train many models on one architecture. A model family name often implies an architecture but may bundle training data and tooling too.
