Temperature

A positive scaling factor applied to logits before softmax to make the next-token distribution sharper or flatter.

What It Is

Temperature is a positive scalar T used at inference (and sometimes in analysis) to compute softmax(z / T), where z is the vector of logits. It does not change model weights; it only rescales logits immediately before normalization.
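Written out per token, the softmax(z / T) expression above becomes the following. The index notation (i, j) and the symbol p_i are introduced here for illustration only.

```latex
p_i(T) = \frac{\exp(z_i / T)}{\sum_{j} \exp(z_j / T)}, \qquad T > 0
% T = 1 recovers the ordinary softmax.
% As T \to 0^{+}, the mass concentrates on \arg\max_i z_i (assuming a unique maximum).
% As T \to \infty, p_i(T) approaches the uniform distribution over all tokens.
```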

Why It Matters

Creative-writing applications often raise temperature for more diverse outputs; factual or code assistants lower it for more repeatable answers, although sampling stays random for any T above zero. Because temperature acts on logits before softmax, it directly controls how sharply probabilities concentrate on the top tokens.

Simple Example

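With logits [2, 1, 0] and T = 1, softmax gives roughly [0.67, 0.24, 0.09]. Temperature 0.5 divides by a smaller T, which doubles every gap between logits, so softmax favors index 0 even more (about 0.87). Temperature 2.0 halves the gaps, so the three probabilities move closer together (about [0.51, 0.31, 0.19]).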
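The example can be reproduced with a few lines of plain Python. This is a minimal sketch, not any particular library's implementation; the function name softmax_with_temperature is an assumption made for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities.

    Hypothetical helper for illustration; not taken from a specific library.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtracting the max leaves the result unchanged but avoids overflow.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

Running it for several values of T shows the probability of index 0 rising as T shrinks and falling as T grows, while the ranking of the tokens never changes.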

Common Confusions

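Temperature is not the same as top-p or top-k filtering. Temperature rescales every logit and removes no token, whereas top-p and top-k truncate the candidate set and renormalize over what remains; the two kinds of control can be combined. Setting temperature to zero is handled in many libraries as the limiting case, greedy argmax decoding, not as a literal division by zero.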
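Both points can be illustrated in one small sampler. This is a sketch under stated assumptions: sample_token and its parameters are hypothetical names, not a real library API, and real decoders may order or implement these steps differently. Filtering the scaled logits and then normalizing is equivalent to truncating the probabilities and renormalizing.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, rng=random):
    """Pick a token index from logits (hypothetical helper for illustration)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")
    if temperature == 0:
        # Limit case: greedy argmax instead of dividing by zero.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature rescales every logit; no candidate is removed here.
    scaled = [z / temperature for z in logits]
    candidates = list(range(len(scaled)))
    if top_k is not None:
        # Top-k is a separate step: keep the k largest, drop the rest.
        candidates = sorted(candidates, key=lambda i: scaled[i], reverse=True)[:top_k]
    m = max(scaled[i] for i in candidates)
    weights = [math.exp(scaled[i] - m) for i in candidates]
    # random.choices normalizes the weights over the surviving candidates.
    return rng.choices(candidates, weights=weights, k=1)[0]

print(sample_token([2.0, 1.0, 0.0], temperature=0))             # always index 0
print(sample_token([2.0, 1.0, 0.0], temperature=2.0, top_k=2))  # index 0 or 1
```

With temperature 0 the result is always the highest-logit index; with top_k=2 the lowest-logit token can never be drawn, no matter how high the temperature is set.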
