Published June 2, 2026 · 4 min read

What is overfitting?

Definition

Overfitting is when an AI model learns its training data too well, memorizing quirks and noise instead of general rules, so it performs well on the examples it was trained on but poorly on new ones.^[2]

At a glance

Great scores on training data plus weak scores on new data is the classic warning sign.^[1]
Caused by models that are too complex or trained too long on too little (or noisy) data.^[4]
The risk is real-world: a model that looks accurate during development can fail on actual customers.^[3]
Detected by comparing performance on practice data versus fresh, held-back data.
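
The comparison in the last point can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production check: the toy data, the 20% label noise, and the helper names (`make_data`, `memorizer_predict`, `accuracy`) are all hypothetical. The "model" is a nearest-neighbour lookup, chosen because it memorizes every training example, noise included.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # inject label noise
        data.append((x, label))
    return data

def memorizer_predict(train, x):
    """1-nearest-neighbour: return the label of the closest training example."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(train, examples):
    """Fraction of `examples` whose label the memorizer predicts correctly."""
    hits = sum(memorizer_predict(train, x) == y for x, y in examples)
    return hits / len(examples)

rng = random.Random(0)
train = make_data(200, rng)      # practice data the model learns from
held_back = make_data(200, rng)  # fresh data the model never sees while learning

train_acc = accuracy(train, train)
held_back_acc = accuracy(train, held_back)
gap = train_acc - held_back_acc

print(f"training accuracy:  {train_acc:.2f}")
print(f"held-back accuracy: {held_back_acc:.2f}")
print(f"gap:                {gap:.2f}")
```

Because the lookup finds each training example's own noisy label, it scores perfectly on the practice data by construction. The held-back score is the honest one, and the size of the gap between the two is the warning sign.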

Why it matters to your business

An overfit model can dazzle in a demo, then make bad calls on real customers, fraud cases, or forecasts it has never seen. Because it memorized noise instead of true patterns, its accuracy collapses outside the lab.^[1] Always ask a vendor how the model scored on fresh, unseen data, not just training data.

How teams guard against it

Engineers hold back some data the model never trains on, then measure accuracy on it. They also simplify the model, gather more varied data, and stop training before memorization sets in. If training accuracy is high but test accuracy is low, that gap is the tell.^[3]
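
Stopping training before memorization sets in is often automated with a rule called early stopping. The sketch below shows one common form of that rule, assuming the team records the model's loss on held-back data after each training round (epoch). The function name `early_stop`, the `patience` parameter, and the loss values are hypothetical and chosen only for illustration.

```python
def early_stop(val_losses, patience=2):
    """Return (best_epoch, stopped_at) for a sequence of held-back losses.

    Training stops once the held-back loss has failed to improve for
    `patience` consecutive epochs. `stopped_at` is None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = -1
    stopped_at = None
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Still improving on unseen data: remember this checkpoint.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: likely memorizing now.
            stopped_at = epoch
            break
    return best_epoch, stopped_at

# Hypothetical held-back losses: they improve at first, then worsen as the
# model starts to memorize its training data.
losses = [0.90, 0.70, 0.60, 0.58, 0.61, 0.65, 0.70, 0.80]
best_epoch, stopped_at = early_stop(losses, patience=2)
print(best_epoch, stopped_at)  # prints "3 5"
```

Here the loss on held-back data bottoms out at epoch 3, so training halts two epochs later and the team keeps the epoch 3 version of the model. A larger `patience` tolerates more ups and downs before stopping, at the cost of extra training time.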

Bottom line

Overfitting means an AI aced the practice test by memorizing the answers, so judge any model by how it does on new data it has never seen.

References

1. What is Overfitting? - Overfitting in Machine Learning Explained. Amazon Web Services. aws.amazon.com
2. What is Overfitting? IBM. www.ibm.com
3. Overfitting. Google for Developers. developers.google.com
4. Overfitting. Wikipedia. en.wikipedia.org
