What is an AI Hallucination?
An AI hallucination occurs when a language model produces confident, fluent output that is factually wrong, fabricated, or unsupported by its source material. Hallucinations are not glitches or bugs; they are a structural property of how generative models work, which means they can be managed but not eliminated.
Why hallucinations happen
Language models are trained to predict the most plausible next token given the preceding context. They are optimized to produce fluent, contextually appropriate text, not to be calibrated for truth. When a model lacks the specific knowledge needed to answer correctly, it has no reliable way to recognize that gap; it generates the most plausible-looking continuation, which often reads as a confident assertion and is factually wrong.
Three structural drivers:
- No epistemic boundary — the model doesn't have a reliable signal for "I don't know this." Plausibility and accuracy are different things, and the model is trained on plausibility.
- Compression of training data — the model compresses billions of training documents into a fixed number of parameters. Specific facts may come back exactly, partially blurred, or reconstructed incorrectly.
- Sampling randomness — even when the model "knows" the right answer, temperature and top-k sampling decisions can pull it toward plausible-but-wrong tokens (see the sketch after this list)
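
A minimal sketch of that last driver, using an invented next-token distribution (the tokens and probabilities are illustrative, not taken from any real model): greedy decoding at temperature 0 always returns the most probable token, while higher temperatures flatten the distribution and occasionally land on a plausible-but-wrong neighbor.

```python
import math
import random

# Hypothetical next-token distribution for the prompt
# "The first person to walk on the Moon was ..."
# (probabilities are illustrative, not from any real model)
next_token_probs = {
    "Armstrong": 0.62,  # correct
    "Aldrin": 0.21,     # plausible but wrong
    "Gagarin": 0.12,    # plausible but wrong
    "Glenn": 0.05,      # plausible but wrong
}

def sample_next_token(probs, temperature):
    """Greedy decoding at temperature 0; otherwise temperature-scaled sampling."""
    if temperature == 0:
        return max(probs, key=probs.get)
    # Scale each probability by 1/temperature in log space, then renormalize.
    scaled = {tok: math.exp(math.log(p) / temperature) for tok, p in probs.items()}
    total = sum(scaled.values())
    tokens = list(scaled)
    weights = [scaled[tok] / total for tok in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

print(sample_next_token(next_token_probs, temperature=0))    # always "Armstrong"
print(sample_next_token(next_token_probs, temperature=1.5))  # sometimes a wrong name
```

The same knob reappears below: sampling at temperature 0 is one of the standard mitigations for accuracy-critical tasks.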
Hallucination categories
- Factual — wrong dates, fabricated quotes, made-up citations to real authors, invented statistics
- Source — real-sounding URLs that don't exist, real claims attributed to the wrong people, fabricated bibliography entries
- Logical / mathematical — confident wrong arithmetic, incorrect proofs, misapplied formulas
- Persona / capability — the model claims abilities it doesn't have ("I just searched the web" when it didn't, "I sent the email" when no tool was called)
- Code / API — fabricated function signatures, hallucinated library imports, made-up CLI flags
Why hallucinations matter for security
Hallucinations are not just an accuracy problem — they're a security and trust problem:
- Operational decisions made on hallucinated context — a deployment that summarizes incident reports or recommends actions will eventually recommend something the source material does not support
- Hallucinated tool calls — agents may invoke tools with fabricated arguments, executing actions the user never requested (see the validation sketch after this list)
- Compliance and audit risk — regulated industries (healthcare, legal, finance) carry direct liability when AI outputs are stated as fact and turn out wrong
- Hallucinated security findings — AI security tools that hallucinate vulnerabilities or miss real ones are net-negative
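
One way to contain that tool-call risk is to validate every model-proposed call against an explicit allow-list and argument schema before executing anything. A minimal validation sketch, with tool names, schemas, and the example calls all invented for illustration:

```python
# Pre-execution guard for model-proposed tool calls.
# Tool names, schemas, and the example calls are hypothetical.
ALLOWED_TOOLS = {
    "send_email":    {"required": {"to", "subject", "body"}, "allowed": {"to", "subject", "body", "cc"}},
    "lookup_ticket": {"required": {"ticket_id"}, "allowed": {"ticket_id"}},
}

def validate_tool_call(name, args):
    """Reject calls to unregistered tools and calls with missing or fabricated arguments."""
    spec = ALLOWED_TOOLS.get(name)
    if spec is None:
        return False, f"unknown tool: {name}"
    missing = spec["required"] - args.keys()
    extra = args.keys() - spec["allowed"]
    if missing:
        return False, f"missing required arguments: {sorted(missing)}"
    if extra:
        return False, f"unexpected arguments: {sorted(extra)}"
    return True, "ok"

# A hallucinated call to a tool that was never registered is rejected, not executed.
print(validate_tool_call("delete_user", {"user_id": 42}))
# (False, 'unknown tool: delete_user')
print(validate_tool_call("send_email", {"to": "ops@example.com"}))
# (False, "missing required arguments: ['body', 'subject']")
```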
Reducing hallucinations
You cannot eliminate them. You can reduce them:
- Retrieval-Augmented Generation — ground answers in retrieved source material and instruct the model to cite its sources (see the prompt-grounding sketch after this list)
- Tool calls for facts that need to be exact — use a calculator, database, or search call rather than generation for numbers and lookups (see the calculation sketch after this list)
- Lower temperature — for accuracy-critical tasks, sample at temperature 0
- Output validation — runtime guardrails that compare generated claims to the grounding sources (see the claim-check sketch after this list)
- Calibration training — newer models are trained to express uncertainty and to say "I don't know" when the knowledge isn't there, rather than confabulate
- Human review for high-stakes outputs — accept that for some use cases, generated content needs a human check before action
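
A minimal prompt-grounding sketch for the RAG point above. The retrieval step is stubbed out with hard-coded passages, and the instruction wording is just one reasonable choice, not a canonical template:

```python
# Grounding a prompt in retrieved passages. The retrieval step is stubbed
# out with hard-coded text; a real deployment would query a search index
# or vector store here.
def build_grounded_prompt(question, passages):
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the sources below, and cite them as [n]. "
        "If the sources do not contain the answer, say you don't know.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )

passages = [
    "The incident began at 02:14 UTC when the primary database failed over.",
    "Service was fully restored at 03:40 UTC after the replica was promoted.",
]
print(build_grounded_prompt("When was service restored?", passages))
```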
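
A minimal calculation sketch for the exact-facts point: compute the figure deterministically outside the model and hand it the result, instead of letting it generate the number. The ARR figures and prompt are invented for illustration:

```python
# Compute the exact figure outside the model and pass it in, rather than
# asking the model to produce the number. The figures are invented.
def percent_change(old, new):
    return (new - old) / old * 100

old_arr, new_arr = 1_250_000, 1_487_500
change = percent_change(old_arr, new_arr)  # 19.0, computed deterministically

prompt = (
    f"ARR grew from ${old_arr:,} to ${new_arr:,}, a change of {change:.1f}%. "
    "Write one sentence summarizing this for the board."
)
print(prompt)
```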
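
And a deliberately naive claim-check sketch for the output-validation point: flag generated sentences that share little vocabulary with the grounding sources. Production guardrails typically use an entailment model rather than word overlap, and the 0.5 threshold here is arbitrary:

```python
import re

# Naive grounding check: flag generated sentences that share little
# vocabulary with the source material. Real guardrails typically use an
# entailment/NLI model instead of word overlap.
def unsupported_sentences(answer, sources, threshold=0.5):
    source_words = set(re.findall(r"\w+", " ".join(sources).lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = set(re.findall(r"\w+", sentence.lower()))
        if not words:
            continue
        overlap = len(words & source_words) / len(words)
        if overlap < threshold:
            flagged.append(sentence)
    return flagged

sources = ["Service was fully restored at 03:40 UTC after the replica was promoted."]
answer = "Service was restored at 03:40 UTC. The outage was caused by a DDoS attack."
print(unsupported_sentences(answer, sources))  # flags the unsupported DDoS claim
```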