Definition

An AI model’s ability to read and reason over a huge amount of text at once, like a full contract or a year of emails, without losing track of earlier parts.

At a glance

The context window is the AI’s working memory, measured in tokens; about 1 million tokens holds roughly 750,000 words, or 2,500 to 3,000 pages.
Window sizes vary by model and version: GPT-class models near 128,000 tokens, Claude up to 1 million, Gemini up to about 2 million.
A bigger window lets the AI analyze whole documents at once, with more coherent answers and fewer made-up facts.
Bigger is not always better: models can get “lost in the middle,” nailing the start and end but missing details in the center.

How it works

Picture the AI as a reader with a fixed-size desk. Everything it sees at once (your question, pasted documents, and its own replies) must fit on that desk^[1]. Long context means the desk is wide enough to lay out a whole 300-page contract and reason across it. Text is counted in tokens, which are word fragments rather than whole words, so 1 million tokens holds about 750,000 words^[2].
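The word-to-token rule of thumb above can be turned into a quick fit check. The sketch below is an estimate only: it assumes the 0.75 words-per-token ratio holds for your text, and the names `estimate_tokens`, `fits_in_window`, and `reserved_for_reply` are illustrative, not part of any model's API. A real tokenizer will give different counts, especially for code or non-English text.

```python
def estimate_tokens(text: str) -> int:
    """Estimate tokens from a word count, assuming ~0.75 words per token."""
    words = len(text.split())
    # words / 0.75 == words * 4 / 3; round up using integer arithmetic.
    return (words * 4 + 2) // 3


def fits_in_window(text: str, window_tokens: int, reserved_for_reply: int = 0) -> bool:
    """Check whether the text, plus room reserved for the reply, fits the window."""
    return estimate_tokens(text) + reserved_for_reply <= window_tokens


# A stand-in for a 300-page contract at roughly 300 words per page.
contract = "clause " * 90_000
print(estimate_tokens(contract))  # 120000
print(fits_in_window(contract, window_tokens=128_000, reserved_for_reply=4_000))
```

Reserving tokens for the reply matters because the question, the documents, and the model's own answer all share the same window.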

Why it matters

A long context window lets you ask one question against a full document set, summarize a long report, compare clauses, or search a whole knowledge base in a single pass. Common uses include reviewing legal agreements, analyzing financial filings, answering questions from long manuals, and digesting meeting transcripts^[4].

The catch

Even when a document fits, the AI does not weigh every part equally. The “lost in the middle” effect follows a U-shaped curve: accuracy stays high for facts at the start and end of the input but can drop by more than 30 percent for facts in the middle^[3]. More context also costs more per query, so give the model the most relevant material, not everything.
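One way to act on both points is to keep only the most relevant chunks that fit a token budget, then arrange them so the strongest material sits at the start and end of the prompt. The sketch below is a minimal illustration, not a method taken from the cited sources: the relevance scores, per-chunk token counts, and the function names `select_within_budget` and `order_for_edges` are all assumptions, and in practice the scores would come from whatever search or ranking step you already use.

```python
from typing import List, Tuple

# Each chunk is (relevance_score, token_count, text); all values are hypothetical.
Chunk = Tuple[float, int, str]


def select_within_budget(chunks: List[Chunk], budget_tokens: int) -> List[str]:
    """Keep the highest-scoring chunks whose combined size fits the budget."""
    chosen: List[str] = []
    used = 0
    for _score, tokens, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        if used + tokens <= budget_tokens:
            chosen.append(text)
            used += tokens
    return chosen  # most relevant first


def order_for_edges(ranked: List[str]) -> List[str]:
    """Alternate chunks between the front and the back of the prompt,
    so the least relevant ones end up in the middle."""
    front: List[str] = []
    back: List[str] = []
    for i, text in enumerate(ranked):
        (front if i % 2 == 0 else back).append(text)
    return front + back[::-1]


chunks = [(0.9, 100, "A"), (0.2, 100, "E"), (0.8, 100, "B"),
          (0.5, 100, "C"), (0.4, 100, "D")]
ranked = select_within_budget(chunks, budget_tokens=400)
print(order_for_edges(ranked))  # ['A', 'C', 'D', 'B']
```

Here the two most relevant chunks, A and B, land at the start and end, and the lowest-scoring chunk, E, is dropped because it does not fit the budget.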

Bottom line

Long context lets an AI reason across a whole document, a real advantage, but keep key facts near the start or end and verify the details.

References