
What is a large language model?

Published June 1, 2026 · 4 min read

[Figure: A large language model just guesses the next word: score every candidate, write down the likeliest, repeat. Text so far: “The capital of France is…”. Candidates: Paris (most likely), Lyon, Berlin, cheese.]

Definition

A large language model is software trained on huge amounts of text to predict the next word, which lets it generate human-like writing, answers, and code.

At a glance

  • It does one thing: guess the next word, over and over. Everything it “knows” is a side effect of doing that well across trillions of words[4].
  • It is a prediction engine, not a fact database. Confident, fluent, wrong answers (hallucinations) are inherent to how it works, not a bug that will be patched away.
  • Scale made it useful: billions of parameters trained on internet-scale text[3]. But bigger is not always better for your job.
  • You rent a hosted model and pay per “token” (about 3/4 of a word) for text in and out. You almost never train one yourself.

How it works

Given “The capital of France is”, the model scores every candidate next word, writes down the likeliest (“Paris”), and then repeats the process on the extended text[4]. To do this well across internet-scale text, it has to absorb grammar, facts, styles, and code[1]. The fluency of ChatGPT or Claude is that single trick done extremely well[2].
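The loop described above can be sketched in a few lines of Python. This is a toy illustration, not how any real model is built: the score table `TOY_SCORES` and the helper names are made up for the example, and a real LLM computes its scores with a neural network over a vocabulary of many thousands of tokens. It also always takes the top candidate, whereas hosted models often sample from the probabilities instead.

```python
import math

# Hypothetical toy "model": maps the text so far to raw scores for a few
# candidate next words. The numbers are invented for illustration only.
TOY_SCORES = {
    "The capital of France is": {"Paris": 6.0, "Lyon": 2.5, "Berlin": 1.0, "cheese": -1.0},
    "The capital of France is Paris": {".": 5.0, ",": 2.0, "and": 0.5},
}

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {word: math.exp(score - peak) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

def next_word(text):
    """Return the likeliest next word, or None if the toy table has no entry."""
    scores = TOY_SCORES.get(text)
    if scores is None:
        return None
    probs = softmax(scores)
    return max(probs, key=probs.get)

def generate(text, max_words=5):
    """Score candidates, write down the likeliest, repeat."""
    for _ in range(max_words):
        word = next_word(text)
        if word is None:
            break
        # Attach punctuation directly; separate ordinary words with a space.
        text = text + word if word in {".", ","} else text + " " + word
    return text

print(generate("The capital of France is"))
```

Nothing in the loop checks whether “Paris” is true. It is simply the highest-scoring continuation in the table, which is the point the next section builds on.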

Why it sounds certain when wrong

It picks the most plausible-sounding words, with no internal sense of true or false, so it states fabrications in the same confident tone as facts. The fix is how you use it: feed it your trusted documents at question time (retrieval) and keep a human reviewing anything high-stakes.
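As a rough sketch of what “feed it your trusted documents at question time” can look like, the Python below picks the most relevant document and places it in the prompt ahead of the question. Everything here is an assumption made for illustration: the sample documents, the function names, and the prompt wording are hypothetical, and simple word overlap stands in for the more capable search methods real retrieval systems typically use. The call to a hosted model is left out entirely.

```python
import re

# Hypothetical trusted documents; in practice these come from your own systems.
DOCUMENTS = [
    "Refund policy: customers may return items within 30 days of delivery.",
    "Shipping policy: standard orders ship within 2 business days.",
    "Support hours: the help desk is open Monday to Friday, 9am to 5pm.",
]

def words(text):
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents):
    """Put the retrieved text in front of the question for the model to use."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("How many days do I have to return an item for a refund?", DOCUMENTS))
```

Grounding the prompt this way narrows what the model has to guess, but it does not remove the need for human review of high-stakes answers.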

What it means for buying

You are renting a general prediction engine billed per token. At scale, the choice of model size and the use of caching can change the bill substantially. Training your own model from scratch costs tens of millions of dollars and needs research teams[4]; nearly every business should instead use a hosted model and compete through its data and safeguards[3].
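To make per-token billing concrete, here is a back-of-the-envelope estimate in Python. It uses the rule of thumb from earlier in this article (one token is about 3/4 of a word). The prices, request volume, and function names are placeholders chosen for easy arithmetic, not any provider's real rates, and the separate input and output prices are an assumption about how a bill might be structured.

```python
def estimate_tokens(word_count):
    """Rough token count, assuming 1 token is about 3/4 of a word."""
    return round(word_count / 0.75)

def monthly_cost(requests, words_in, words_out,
                 price_in_per_million, price_out_per_million):
    """Estimate a monthly bill from request volume and per-token prices."""
    tokens_in = estimate_tokens(words_in) * requests
    tokens_out = estimate_tokens(words_out) * requests
    total = tokens_in * price_in_per_million + tokens_out * price_out_per_million
    return total / 1_000_000

# Placeholder scenario: 100,000 requests a month, 750 words in and 150 words
# out per request, at made-up prices of 1.00 and 4.00 per million tokens.
cost = monthly_cost(100_000, 750, 150, 1.00, 4.00)
print(cost)  # 180.0
```

Rerunning the function with a different price per million tokens, or with fewer words sent per request, shows how quickly those choices move the total.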

Bottom line

An LLM is a next-word predictor that scaled into a brilliant, fast, confidently fallible assistant: rent one, ground it in your data, and put guardrails around it.

References

  1. What Are Large Language Models (LLMs)? IBM www.ibm.com
  2. Transformers, the tech behind LLMs (Deep Learning Chapter 5) — Grant Sanderson. 3Blue1Brown www.3blue1brown.com
  3. Reflections on Foundation Models. Stanford Center for Research on Foundation Models (CRFM) crfm.stanford.edu
  4. Language Models are Few-Shot Learners (GPT-3) — Tom B. Brown, Benjamin Mann, Nick Ryder, et al. arXiv arxiv.org
  5. King - Man + Woman = Queen: The Marvelous Mathematics of Computational Linguistics. MIT Technology Review www.technologyreview.com
