Sapiens
Technicals

What are scaling laws?

Published June 1, 2026 · 5 min read

Figure: Scaling laws. AI skill (vertical axis) against spend on size, data and compute (horizontal axis). The curve is steep at first ("big gains, cheap"), then flattens into a plateau of diminishing returns: near the top, each step up costs far more for far less. More size, data and compute keep helping, but the payoff per dollar shrinks as you climb.

Definition

Scaling laws are empirical rules describing how an AI model gets predictably better as you increase three things: its size (the number of parameters), its training data, and the computing power used to train it.

At a glance

  • Three levers: model size, training data, and compute. Turn all three up in balance and skill reliably improves[1].
  • It follows a power law: early spend buys big gains, then the curve flattens into diminishing returns[4].
  • Because it is predictable, labs can forecast a model’s quality before paying to build it[3].
  • Doubling spend does not double quality.
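The diminishing returns in the list above can be sketched numerically. This is a minimal illustration, assuming quality is measured as a loss (lower is better) that follows L(C) = scale * C^(-exponent). The function names and both constants are hypothetical, chosen only to show the shape of the curve; they are not measured values from the cited papers.

```python
def power_law_loss(compute: float, scale: float = 10.0, exponent: float = 0.05) -> float:
    """Illustrative loss curve L(C) = scale * C**(-exponent).

    The constants are made up for illustration, not fitted to any real model.
    """
    if compute <= 0:
        raise ValueError("compute must be positive")
    return scale * compute ** (-exponent)


def gain_from_doubling(compute: float, exponent: float = 0.05) -> float:
    """Absolute loss reduction obtained by doubling compute from a starting point."""
    before = power_law_loss(compute, exponent=exponent)
    after = power_law_loss(2 * compute, exponent=exponent)
    return before - after


# Each doubling costs as much as everything spent so far,
# yet the absolute improvement keeps shrinking.
for c in (1e0, 1e3, 1e6, 1e9):
    print(f"compute={c:.0e}  loss={power_law_loss(c):.3f}  "
          f"gain from doubling={gain_from_doubling(c):.4f}")
```

Under a power law, every doubling multiplies the loss by the same factor (here 2^(-0.05), about 0.97), so the relative improvement is constant while the price of each doubling grows.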

How it works

Increasing size, data, and compute together raises performance in a steady, measurable way. The trend holds across a huge range of model sizes, so results from small, cheap training runs can be used to estimate a larger model's performance in advance.
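One way to picture forecasting in advance is to fit a straight line in log-log space to a few small runs and extend it. The sketch below assumes a pure power law with no irreducible-loss floor, which real fits often include, and it uses synthetic data generated from a known curve rather than real measurements. The helper names `fit_power_law` and `predict` are hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b), done as a line fit in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def predict(a, b, x):
    """Evaluate the fitted curve at a new point."""
    return a * x ** (-b)


# Synthetic "small run" results from a known curve; NOT real measurements.
small_compute = [1e15, 1e16, 1e17, 1e18]
small_loss = [8.0 * c ** (-0.05) for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)
forecast = predict(a, b, 1e21)  # extrapolate 1000x beyond the largest run
print(f"fitted exponent b={b:.4f}, forecast loss at 1e21 FLOPs={forecast:.4f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating exponent. With real, noisy runs the forecast carries uncertainty that grows the further you extrapolate.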

Why bigger is not always better

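After a point, each extra dollar buys a smaller gain than the last, and how you spend it matters as much as how much you spend. DeepMind's 2022 Chinchilla study showed this: a 70B-parameter model trained on about four times more data beat the 280B-parameter Gopher model on the same compute budget[2]. The rule of thumb: about 20 tokens (words or word pieces) of training data per parameter.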
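The rule of thumb of about 20 tokens of data per parameter (a token is a word or word piece) can be turned into a back-of-the-envelope budget split. This sketch assumes the widely used approximation that training costs roughly 6 FLOPs per parameter per token. The function name and constants are illustrative assumptions, not the exact fitting procedure from the Chinchilla study.

```python
import math

FLOPS_PER_PARAM_TOKEN = 6  # assumption: training FLOPs ~ 6 * parameters * tokens
TOKENS_PER_PARAM = 20      # rough rule of thumb, not an exact constant


def compute_optimal_split(flop_budget: float) -> tuple[float, float]:
    """Split a training FLOP budget into (parameters, tokens).

    Solves C = 6 * N * D with D = 20 * N, giving N = sqrt(C / 120).
    """
    if flop_budget <= 0:
        raise ValueError("flop_budget must be positive")
    params = math.sqrt(flop_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    tokens = TOKENS_PER_PARAM * params
    return params, tokens


params, tokens = compute_optimal_split(5.88e23)
print(f"parameters ~ {params:.3g}, tokens ~ {tokens:.3g}")
```

With a budget of 5.88e23 FLOPs, the sketch returns about 70 billion parameters and 1.4 trillion tokens, in line with Chinchilla's reported configuration. Spending the same budget on a 280B-parameter model would leave room for only about a quarter as many tokens.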

Bottom line

Don’t ask “how big can we go?” Ask “what is the cheapest model, with the best data, that does the job?”

References

  1. Scaling Laws for Neural Language Models. Jared Kaplan, Sam McCandlish, et al. OpenAI. arxiv.org
  2. An empirical analysis of compute-optimal large language model training. Jordan Hoffmann et al. Google DeepMind. deepmind.google
  3. Neural scaling law. Wikipedia. en.wikipedia.org
  4. LLM Scaling Laws Explained - Will Bigger AI Models Always Win. BuildFastWithAI. www.buildfastwithai.com
