Glossary
Overfitting
When a model fits training noise so closely that held-out or production data perform worse—the classic gap between memorization and useful generalization.
What It Is
Overfitting is poor generalization caused by excessive fit to the training distribution: the model minimizes training loss by memorizing examples instead of learning stable features. High capacity, noisy labels, and too many training steps relative to data size are common drivers.
Simple Example
A classifier reaches 99% training accuracy but 60% on a validation set because it memorized background textures in training photos. Adding augmentation, early stopping, or weight decay widens the train–validation gap in the right direction.
Common Confusions
Overfitting is not the same as underfitting or pure distribution shift: shift can hurt test performance even without memorization. Low training loss alone is not proof of overfitting—you need a comparison set. Memorizing public benchmarks is a form of overfitting to evaluation, not just training rows.