Ask most people what ChatGPT is and you will hear some version of “it’s an AI that knows things.” It is worth slowing down on that sentence, because almost every word in it is misleading — and the misunderstanding is the root of both the over-trust and the disappointment that follow. The thing in the box does not know things. It is not a mind, and it is not a database. It is something simpler, stranger, and far more useful once you see it clearly: a function that predicts.
A model is a function, not a mind
Strip away the mystique and a model is a mathematical function: you give it an input and it returns an output, typically the one it scores as most likely. The pattern it follows was learned from a vast amount of data rather than written by a programmer. Concretely, a trained model is little more than a very large list of numbers (the “weights”) plus a small program that runs them. You could, in principle, put a model on a memory stick. Training is the slow, expensive process of tuning those numbers; using the model (“inference”) is just running the tuned function, which is far cheaper and faster per request. Nothing in there is reasoning the way a person does. It is pattern, captured as arithmetic.
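To make "a list of numbers plus a small program" concrete, here is a minimal sketch in Python. It is a toy with three invented ideas, not how any production model is built: the function names `predict` and `train_step`, the example numbers, and the simple error-correction rule are all assumptions chosen for illustration.

```python
# A toy "model": a list of numbers (weights) plus a small program that runs them.
# All names and numbers here are hypothetical; real models hold billions of weights.

def predict(weights, bias, features):
    """Inference: run the tuned function on an input. It is only arithmetic."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_step(weights, bias, features, target, learning_rate=0.1):
    """Training: nudge every number slightly so the error on one example shrinks."""
    error = predict(weights, bias, features) - target
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


# Start with untrained numbers, then tune them on a single made-up example.
weights, bias = [0.0, 0.0], 0.0
example_input, example_target = [1.0, 2.0], 1.0
before = predict(weights, bias, example_input)
weights, bias = train_step(weights, bias, example_input, example_target)
after = predict(weights, bias, example_input)
```

After one step the prediction moves from 0.0 to roughly 0.6, closer to the target of 1.0. Training is that nudge repeated an enormous number of times; inference is a single call to `predict`.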
How a language model works, in one idea
A large language model does exactly one thing: given some text, it predicts the next word (more precisely, the next “token”, which is a word or fragment). It assigns a probability to every possible next token and picks one, usually by weighted chance rather than by always taking the top candidate, then repeats, feeding its own output back in. That is the whole engine. The closest everyday comparison is the autocomplete on your phone, but scaled up enormously and trained on a large fraction of the written internet, so that “the most likely next word” becomes coherent paragraphs, working code, and plausible argument. The caveat matters as much as the comparison: it is choosing what a good answer looks like, statistically, not deciding what is true.
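The predict, pick, repeat loop can be sketched in a few lines of Python. This is a deliberately tiny stand-in: the score table `NEXT_TOKEN_SCORES` is hand-written and hypothetical, and it looks only at the previous word, whereas a real language model computes its scores from the whole preceding text using its weights. The loop itself has the same shape.

```python
import math
import random

# Hypothetical scores for which token tends to follow which.
# A real model computes these from its weights; here they are invented by hand.
NEXT_TOKEN_SCORES = {
    "the": {"cat": 2.0, "dog": 1.5, "mat": 0.5},
    "cat": {"sat": 2.5, "ran": 1.0},
    "dog": {"sat": 1.0, "ran": 2.0},
    "sat": {"on": 3.0},
    "ran": {"<end>": 2.0},
    "on": {"the": 3.0},
    "mat": {"<end>": 2.0},
}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def generate(prompt, max_tokens=8, rng=None):
    """Repeatedly predict a next token, sample one, and feed it back in."""
    if rng is None:
        rng = random.Random()
    tokens = prompt.split()
    for _ in range(max_tokens):
        probs = softmax(NEXT_TOKEN_SCORES[tokens[-1]])
        # Weighted random draw: likelier tokens win more often, but not always.
        choice = rng.choices(list(probs), weights=list(probs.values()))[0]
        if choice == "<end>":
            break
        tokens.append(choice)
    return " ".join(tokens)
```

Calling `generate("the")` twice may produce different sentences, because every step is a weighted draw rather than a lookup. Nothing in the loop checks whether the resulting sentence is true.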
Why it is confident, fluent, and sometimes wrong
Once you see the model as a next-word predictor, its most talked-about flaw stops being mysterious. A useful image, from the writer Ted Chiang, is that a language model is like a blurry JPEG of the web: it has compressed an enormous amount of text into a fixed set of numbers, and when you ask it something it reconstructs a plausible answer from that compression rather than looking up an exact record. Usually the reconstruction is close enough to be right. Sometimes it is plausible and wrong — a confident, well-phrased invention. That is what “hallucination” is: not a bug bolted onto an otherwise-truthful system, but the same machinery that makes it fluent, working exactly as designed. The model has no separate place where it stores “facts it is sure about”; fluency and fabrication come out of the same process.
| What people assume | What it actually is |
|---|---|
| It knows facts | It predicts likely text; knowledge is lossy and reconstructed, not stored |
| It thinks / understands | It matches patterns statistically; no grounded understanding |
| It’s deterministic: same answer every time | It usually samples from probabilities, so the same prompt can give different output |
| It can do the maths | It predicts what an answer looks like — give it a tool to actually compute |
| It’s a database you can trust | It’s a generator you must verify |
The moment a team stops asking “is the model right?” and starts asking “can we check what it gave us?”, the whole conversation gets healthier. You are not buying a source of truth. You are buying a very capable generator, and the value is in the system you build around it to catch the times it is confidently wrong.
Why this matters for how you use it
This is not pedantry; it changes what you do. Because the model predicts rather than knows, three habits follow. Give it the context it lacks — it has no live access to your data unless you provide it. Give it tools for the things it is bad at — a calculator for arithmetic, a search for current facts, a database for records — rather than trusting it to fake them. And verify anything that has to be true before it reaches a customer or a decision. None of this is exotic; it is just what you do once you have the right picture of the thing. An AI model is not smart in the way the word implies. It is a remarkably useful predictor — and knowing that is the first and most valuable piece of AI literacy there is.
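As one illustration of the "give it tools" and "verify" habits, the sketch below pairs a small calculator tool with a check on a claimed answer. Everything here is an assumption made for the example: the names `calculator` and `verify_claim` are hypothetical, the claimed answers stand in for text a model might produce, and a real system would wire the tool into the model through whatever tool-calling mechanism its platform provides.

```python
import ast
import operator

# Arithmetic operators the tool is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Tool: actually compute a simple expression built from numbers and + - * /."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


def verify_claim(expression, claimed_answer, tolerance=1e-9):
    """Check a claimed result against the tool before it reaches anyone."""
    return abs(calculator(expression) - claimed_answer) <= tolerance


# Hypothetical model outputs: one correct, one plausible but wrong.
correct_claim_ok = verify_claim("17 * 23", 391)
wrong_claim_ok = verify_claim("17 * 23", 381)
```

Here `correct_claim_ok` is `True` and `wrong_claim_ok` is `False`. The wrong answer looks just as confident as the right one; only the check tells them apart.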
Frequently asked
- What is an AI model, in plain terms?
  - A mathematical function that turns an input into a most-likely output, where the pattern was learned from data rather than hand-written. A trained model is essentially a large list of numbers (its “weights”) plus a small program that runs them. A large language model is one kind, trained to predict the next word of text.
- Does an AI model actually understand or know things?
  - No. It predicts likely text by matching patterns statistically; it has no grounded understanding and no reliable internal store of facts. Its knowledge is lossy and reconstructed, which is why it can be fluent and confident yet wrong.
- Why do AI models “hallucinate”?
  - Because generating plausible text and inventing plausible-but-false text come from the same mechanism: predicting what a good answer looks like, not retrieving a verified one. Hallucination is the machinery working as designed, not a separate defect, which is why it can’t simply be switched off; you manage it with retrieval, tools and verification.
- If it isn’t “smart”, why is it so useful?
  - Because an enormous amount of useful work is well-served by a fast, fluent predictor (drafting, summarising, extracting, translating, coding), provided you give it the right context and tools and verify what must be true. The skill is building the system around the model, not expecting the model to be an oracle.