Glossary

Temperature

A positive scalar that divides the logits before softmax: values below 1 make the next-token distribution sharper, and values above 1 make it flatter.

What It Is

Temperature is a positive scalar T used at inference (and sometimes in analysis) as softmax(z / T), where z is the vector of logits. It does not change model weights; it only rescales the logits immediately before normalization, and T = 1 leaves the distribution unchanged.
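
Written out per token, the expression above looks as follows. The notation is an assumption added for illustration: z_i is the logit of token i, p_i is its resulting probability, and the sum runs over the whole vocabulary.

```latex
p_i = \frac{\exp(z_i / T)}{\sum_{j} \exp(z_j / T)}, \qquad T > 0
```

As T grows, every z_j / T approaches zero and the p_i approach a uniform distribution; as T shrinks toward zero, the largest logit dominates the sum.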

Why It Matters

Creative-writing applications often raise temperature to get more diverse outputs; factual or code assistants lower it to get more repeatable, near-deterministic answers. Because temperature acts on logits before softmax, it directly controls how sharply probability concentrates on the top tokens.

Simple Example

With logits [2, 1, 0], temperature 0.5 doubles every gap between logits (dividing by 0.5 is the same as multiplying by 2), so softmax favors index 0 even more than it does at T = 1. Temperature 2.0 halves the gaps, so the three probabilities move closer together.
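
The example can be checked numerically with a minimal Python sketch. The function name softmax_with_temperature is hypothetical and chosen only for illustration; it is not taken from any particular library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities.

    Hypothetical helper for illustration; assumes temperature > 0.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtracting the maximum does not change the result but avoids overflow.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At T = 0.5 the probabilities come out to roughly 0.867, 0.117, 0.016; at T = 1.0 roughly 0.665, 0.245, 0.090; and at T = 2.0 roughly 0.506, 0.307, 0.186. Index 0 stays the most likely token in every case; only the margin changes.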

Common Confusions

Temperature is not the same as top-p or top-k filtering. Those methods restrict the candidate set to the most likely tokens and renormalize over what remains, whereas temperature rescales every logit and removes nothing. Setting temperature to zero is handled as a limit toward argmax in many libraries, not as literal division by zero.
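
One way a sampler might special-case zero temperature is sketched below. This is an assumption about a typical design, not the behavior of any specific library, and sample_token is a hypothetical name.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Pick a token index from logits (hypothetical helper).

    temperature == 0 is treated as the limit T -> 0, i.e. argmax,
    so no division by zero ever happens.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Greedy choice: index of the largest logit (first one on ties).
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    # Unnormalized weights are enough; random.choices normalizes internally.
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

Calling sample_token([2.0, 1.0, 0.0], 0) always returns index 0, while any positive temperature can return any of the three indices with the corresponding softmax probability.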
