What are scaling laws?

Published June 1, 2026 · 5 min read

Definition

Scaling laws are empirical relationships showing that an AI model gets predictably better as you increase three things: its size, its training data, and the computing power used to train it.

At a glance

Three levers: model size, training data, and compute. Turn all three up in balance and performance reliably improves^[1].
Performance follows a power law: early spend buys big gains, then the curve flattens into diminishing returns^[4].
Because the trend is predictable, labs can forecast a model’s quality before paying to build it^[3].
Doubling spend does not double quality.
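
To make the diminishing-returns point concrete, here is a minimal sketch of a power-law loss curve. The form loss = A * compute^(-ALPHA) is the general shape that scaling-law studies report. The constants `A` and `ALPHA` and the helper names are made up for illustration and are not fitted to any real model.

```python
# Illustrative power law: loss falls as compute rises, but ever more slowly.
# A and ALPHA are hypothetical constants chosen only for this demonstration.
A = 10.0
ALPHA = 0.05


def loss(compute: float) -> float:
    """Predicted loss for a given compute budget (arbitrary units)."""
    return A * compute ** (-ALPHA)


def gain_from_doubling(compute: float) -> float:
    """Absolute loss reduction obtained by doubling the budget."""
    return loss(compute) - loss(2 * compute)


if __name__ == "__main__":
    for budget in (1e3, 1e6, 1e9):
        print(
            f"budget={budget:.0e}  loss={loss(budget):.3f}  "
            f"gain from doubling={gain_from_doubling(budget):.4f}"
        )
```

Under this assumed curve, every doubling trims the loss by the same small fraction, so the absolute gain from each doubling keeps shrinking as the budget grows.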

How it works

Increasing size, data, and compute together raises performance in a steady, measurable way. The trend holds across a huge range of model sizes, so the results of a large training run can be estimated in advance from smaller ones.
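
The forecasting step can be sketched as a straight-line fit in log-log space, since a power law y = a * x^b becomes log y = log a + b * log x. This is only a minimal illustration: the function names are hypothetical, and the pilot-run numbers are synthetic values generated from an assumed curve, not real measurements. Real measurements would be noisy, and extrapolations from them less exact.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b, done as a line in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx                        # slope = power-law exponent
    a = math.exp(mean_y - b * mean_x)    # intercept = log of the prefactor
    return a, b


def predict(a, b, x):
    """Evaluate the fitted power law at x."""
    return a * x ** b


# Synthetic (compute, loss) pairs standing in for small pilot runs.
# They are generated from an assumed curve, not taken from any real model.
pilot_compute = [1e15, 1e16, 1e17, 1e18]
pilot_loss = [8.0 * c ** -0.05 for c in pilot_compute]

a, b = fit_power_law(pilot_compute, pilot_loss)
forecast = predict(a, b, 1e21)  # extrapolate 1000x past the largest pilot run

if __name__ == "__main__":
    print(f"fitted exponent b={b:.3f}, forecast loss at 1e21 FLOPs: {forecast:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the assumed exponent. With real data the same procedure gives an estimate with error bars rather than an exact answer.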

Why bigger is not always better

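After a point, each extra dollar buys a smaller gain than the last. DeepMind’s 2022 Chinchilla study showed that size alone is not the answer: a 70B-parameter model trained on more data beat a 280B-parameter one on the same compute budget^[2]. The rule of thumb from that study is about 20 tokens of training data per parameter, where a token is a word or a fragment of a word.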
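
As a rough worked example, the rule of thumb can be turned into a budget split. The sketch below assumes the common approximation that training costs about 6 floating-point operations per parameter per token, which is not stated in this article, and takes the ratio as about 20 tokens per parameter, the unit in which the rule is normally stated. The function names are hypothetical.

```python
import math

FLOPS_PER_PARAM_TOKEN = 6   # assumption: training FLOPs ~= 6 * params * tokens
TOKENS_PER_PARAM = 20       # rule-of-thumb ratio of training tokens to parameters


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training cost under the 6 * N * D assumption."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def compute_optimal_split(budget_flops: float) -> tuple[float, float]:
    """Split a FLOP budget into (parameters, tokens) at the 20:1 ratio.

    Solves budget = 6 * N * D with D = 20 * N, giving N = sqrt(budget / 120).
    """
    n_params = math.sqrt(budget_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Budget of a 70B-parameter model trained at 20 tokens per parameter.
    budget = training_flops(70e9, 20 * 70e9)
    params, tokens = compute_optimal_split(budget)
    print(f"budget={budget:.2e} FLOPs -> {params:.2e} params, {tokens:.2e} tokens")
```

Under these assumptions, spending the same budget on a model four times larger would leave only a quarter as many training tokens, which is the kind of imbalance the Chinchilla comparison argued against.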

Bottom line

Don’t ask “how big can we go?” Ask “what is the cheapest model, with the best data, that does the job?”

References

[1] Scaling Laws for Neural Language Models. Jared Kaplan, Sam McCandlish. OpenAI, arxiv.org
[2] An empirical analysis of compute-optimal large language model training. Jordan Hoffmann. Google DeepMind, deepmind.google
[3] Neural scaling law. Wikipedia, en.wikipedia.org
[4] LLM Scaling Laws Explained: Will Bigger AI Models Always Win. BuildFastWithAI, www.buildfastwithai.com
