Definition

A skill an AI model lacks when small but suddenly performs well once it crosses a certain size — in a way the smaller versions could not predict.

At a glance

Defining trait: unpredictability. Performance stays near-random, then jumps sharply once the model is big enough.
Common examples: multi-step arithmetic, step-by-step reasoning, and learning a task from a few prompt examples.
More data, parameters, and compute are what tend to unlock these behaviors.
Some “emergence” may be a measurement illusion, not a real leap.

How it works

Train ever-larger versions of one model. On some tasks the small and medium ones do no better than chance, then the largest suddenly succeeds. A 2022 Google paper led by Jason Wei named this pattern emergence^[1], with typical examples like arithmetic, following instructions, and few-shot learning^[3].

The mirage debate

A 2023 Stanford study argued many jumps are an artifact of all-or-nothing scoring that penalizes smaller models^[2]. Under smoother metrics, the leaps often became gradual and predictable^[4]. So some shifts are real; others are just how progress is measured.

Why it matters

A newer, larger model may unlock skills its predecessor lacked. Since these gains are hard to forecast, test each model on your actual use case rather than assuming what it can or cannot do.

Bottom line

Emergent capabilities are real but unpredictable — test each model on your own tasks instead of guessing.

What are emergent capabilities?

At a glance

How it works

The mirage debate

Why it matters

Bottom line

References