What is an AI hallucination?

Published June 1, 2026 · 5 min read

Definition

An AI hallucination is when an AI confidently gives a fluent answer that is actually false or made up.

At a glance

It is built into how language models work, not a bug a future update will fix.
A made-up answer looks identical to a correct one — same tone, often with fake citations, dates, or numbers.
Rates spike on hard, specific questions: in Stanford testing, general-purpose models hallucinated on 58-82% of legal queries.
You can shrink the rate, but never reach zero — plan for residual error.

Why it happens

A model does not look up facts. It predicts the next plausible-sounding word, so when it hits a gap it still produces a smooth, confident answer with no sense of “I don’t know.” OpenAI researchers showed this is baked in: models are graded like test-takers who score better by guessing than by admitting uncertainty, so they learn to bluff^[1]^[2].
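The incentive described above can be checked with a little arithmetic. The sketch below is an illustration only, not the researchers' code: the function names, the `wrong_penalty` parameter, and the zero score for abstaining are all assumptions made here. It compares the expected score of answering against the score for saying "I don't know".

```python
# Illustrative only: names and scoring values are assumptions, not from the cited paper.
ABSTAIN_SCORE = 0.0  # assumed: "I don't know" earns nothing and costs nothing


def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score for answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float = 0.0) -> bool:
    """True when answering scores better, on average, than abstaining."""
    return expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE


# With right-or-wrong grading (no penalty), even a 10% guess beats abstaining.
print(should_answer(0.10))                     # True
# If a wrong answer costs one point, the same guess is no longer worth it.
print(should_answer(0.10, wrong_penalty=1.0))  # False
```

Under this toy scoring, any nonzero chance of being right makes guessing the better strategy unless wrong answers carry a cost, which is the test-taker behavior the paragraph describes.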

What it costs

Even purpose-built legal AI tools got answers wrong 17-34% of the time^[3]^[5]. In Mata v. Avianca (2023), two lawyers were sanctioned for filing a brief that cited cases ChatGPT had invented^[4]. Match the use case to the stakes: drafting and brainstorming are low-risk, while customer answers, legal or medical claims, and numbers that feed decisions need a human to check the output first.
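One way to make "match the use case to the stakes" operational is a simple lookup that decides whether output needs review before it is used. This is a minimal sketch: the use-case labels and the `needs_human_review` function are hypothetical names chosen here, grouped the way the paragraph above groups them.

```python
from enum import Enum


class Stakes(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical labels; the grouping follows the paragraph above.
USE_CASE_STAKES = {
    "drafting": Stakes.LOW,
    "brainstorming": Stakes.LOW,
    "customer_answer": Stakes.HIGH,
    "legal_claim": Stakes.HIGH,
    "medical_claim": Stakes.HIGH,
    "decision_numbers": Stakes.HIGH,
}


def needs_human_review(use_case: str) -> bool:
    """Return True when a person should check the output before it is used."""
    # Assumption: anything unclassified is treated as high-stakes.
    return USE_CASE_STAKES.get(use_case, Stakes.HIGH) is Stakes.HIGH


print(needs_human_review("brainstorming"))  # False
print(needs_human_review("legal_claim"))    # True
```

Defaulting unknown use cases to high-stakes is a design choice in this sketch: a new workflow gets reviewed until someone deliberately classifies it as low-risk.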

How to manage it

Ground the model in your own documents (retrieval), narrow the task, ask for clickable sources, and run regular evals. Above all, keep a person in the loop wherever an error is expensive. Treat any “zero hallucination” promise as a red flag.
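A regular eval can be as small as a list of questions with known answers and a script that reports how often the model misses. The sketch below assumes a hypothetical `EvalCase` record and a deliberately crude grading rule (the reference answer must appear in the model's answer); the sample questions and answers are invented for illustration, and real evals usually rely on human or more careful automated grading.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    expected: str  # reference answer written by a person
    answer: str    # what the model returned


def is_supported(case: EvalCase) -> bool:
    """Crude check (assumption): the reference answer appears in the model's answer."""
    return case.expected.lower() in case.answer.lower()


def error_rate(cases: list[EvalCase]) -> float:
    """Fraction of cases where the model's answer misses the reference."""
    if not cases:
        raise ValueError("need at least one eval case")
    wrong = sum(1 for case in cases if not is_supported(case))
    return wrong / len(cases)


# Invented sample data, for illustration only.
cases = [
    EvalCase("What is our refund window?", "30 days", "Refunds are accepted within 30 days."),
    EvalCase("What is our refund window?", "30 days", "Refunds are accepted within 90 days."),
]
print(f"error rate: {error_rate(cases):.0%}")  # error rate: 50%
```

Running the same set on a schedule, and after every prompt or model change, shows whether the rate is moving in the right direction.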

Bottom line

Hallucinations are structural, not a defect waiting to be patched — lower the rate with grounding and tight scope, and keep a human on anything consequential.

References

1. Why Language Models Hallucinate. Adam Tauman Kalai, Ofir Nachum, Santosh Vempala, Edwin Zhang. OpenAI / arXiv. arxiv.org
2. Why language models hallucinate. OpenAI. openai.com
3. Hallucinating Law: Legal Mistakes with Large Language Models are Pervasive. Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho. Stanford Law School / RegLab. law.stanford.edu
4. Mata v. Avianca, Inc. Wikipedia. en.wikipedia.org
5. Stanford Study Finds High Percentage of Errors Using Large Language Models in Legal Contexts. Foley & Lardner LLP. www.foley.com
