A team we know upgraded to the largest, most capable model available, expecting their AI assistant to get noticeably better. It got slightly worse. The reason had nothing to do with the model: they were feeding it a sprawl of loosely related documents and hoping it would find the needle. A smaller model with the right three paragraphs in front of it would very likely have beaten the giant. This is the most useful and least understood lever in applied AI: the answer is mostly decided not by how clever the model is, but by what you put in front of it.
A model only knows what’s in front of it
A model has two sources of knowledge: what it absorbed in training (frozen, general, and with no awareness of your business) and what you put in its “context window” — its working memory for this particular request. It has no live access to your documents, your database, or today’s news unless you supply it, in that window, at the moment you ask. So there’s a useful distinction hiding under the word “prompt.” The prompt is what you ask. The context is everything the model can see while answering — your instructions, examples, the facts you’ve retrieved, the tools you’ve made available. Most people tune the first and ignore the second, which is backwards: the context is the bigger lever.
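To make the prompt/context distinction concrete, here is a minimal sketch of how one request's window might be put together. The `Context` class, its field names, and the refund-policy text are all hypothetical, invented for illustration; no real model API is called. The point is only that the question is one small part of what the model sees.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Everything the model can see for one request (hypothetical structure)."""
    instructions: str
    examples: list[str] = field(default_factory=list)
    retrieved_facts: list[str] = field(default_factory=list)

    def render(self, prompt: str) -> str:
        """Join the context and the prompt into the text sent to the model."""
        parts = [f"Instructions:\n{self.instructions}"]
        if self.examples:
            parts.append("Examples:\n" + "\n".join(f"- {e}" for e in self.examples))
        if self.retrieved_facts:
            parts.append("Facts:\n" + "\n".join(f"- {f}" for f in self.retrieved_facts))
        # The prompt (what you ask) comes last and is the smallest part.
        parts.append(f"Question:\n{prompt}")
        return "\n\n".join(parts)


# Illustrative data only: the policy sentence below is made up.
ctx = Context(
    instructions="Answer using only the facts provided. Say so if they are insufficient.",
    retrieved_facts=["Refunds are accepted within 30 days of delivery."],
)
window_text = ctx.render("Can I return an order after six weeks?")
print(window_text)
```

Changing the question alone touches one line of that text; changing the instructions or the facts changes most of it, which is why the context tends to be the bigger lever.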
Why “bigger” isn’t the fix
The two reflexes when an answer disappoints are “use a bigger model” and “use a bigger context window so I can paste in more.” Both miss the point. If the relevant fact isn’t in front of the model, no amount of model size invents it reliably — it fills the gap with something plausible instead. And a bigger window is not a free win: models attend unevenly to long inputs (the well-documented “lost in the middle” effect), so burying the key paragraph in fifty pages of context can make the answer worse, not better. More context is not better context. The skill is getting the right material in front of the model — current, relevant, trustworthy — not getting more of it in.
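As a rough illustration of "right material, not more material," the sketch below keeps only the few passages that overlap most with the question instead of pasting everything in. The word-overlap score is a deliberately crude stand-in (an assumption for this example, not a recommended retrieval method; real systems typically use search indexes or embeddings), and `select_passages` and the sample passages are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for a real relevance model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_passages(question: str, passages: list[str], max_passages: int = 3) -> list[str]:
    """Keep only the passages that share the most words with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(p)), p) for p in passages]
    # Drop passages with no overlap at all, then keep the best few.
    relevant = [(score, p) for score, p in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in relevant[:max_passages]]


# Illustrative data only.
passages = [
    "Office plants are watered on Fridays.",
    "The refund window is 30 days from delivery.",
    "Our logo was redesigned in spring.",
    "Refund requests after the window need manager approval.",
]
chosen = select_passages("What is the refund window?", passages, max_passages=2)
print(chosen)
```

Here two of the four passages reach the model and the unrelated ones never enter the window, so there is less for the key paragraph to get lost in.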
When the answer is wrong, what to actually reach for
Because the model is mostly a function of its context, the fix for a bad answer is usually upstream of the model. A quick map of what each lever actually changes:
| When the answer is… | Reach for… | Because it fixes… |
|---|---|---|
| Misunderstanding the task | A clearer prompt + an example or two | Ambiguity about what you actually want |
| Missing your facts / out of date | Retrieval (RAG): put the right documents in context | The model can’t know what you didn’t show it |
| Wrong on maths / live data / actions | A tool (calculator, search, database) | Things the model shouldn’t be guessing at all |
| Consistently off in style or format | Fine-tuning (occasionally) — or just better examples | A persistent behaviour, not a one-off gap |
| Still wrong with all of the above | A more capable model | A genuine reasoning ceiling — the last resort, not the first |
When someone says “the AI isn’t good enough,” the first question is almost never “which model?” It’s “what did you put in front of it?” Nine times out of ten the fix is better context, not a bigger brain — and the tenth time you’ll know, because you’ll have ruled the other nine out.
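The table above can also be read as a small lookup with one rule attached: the model upgrade only comes up once every context-side lever has been ruled out. The sketch below encodes that reading. The `Symptom` names and the `recommend` function are hypothetical labels chosen for this example, and real diagnosis is rarely this tidy.

```python
from enum import Enum


class Symptom(Enum):
    MISUNDERSTOOD_TASK = "misunderstanding the task"
    MISSING_FACTS = "missing your facts or out of date"
    NEEDS_TOOL = "wrong on maths, live data or actions"
    STYLE_DRIFT = "consistently off in style or format"


# One lever per symptom, taken from the rows of the table.
LEVERS = {
    Symptom.MISUNDERSTOOD_TASK: "a clearer prompt plus an example or two",
    Symptom.MISSING_FACTS: "retrieval (RAG): put the right documents in context",
    Symptom.NEEDS_TOOL: "a tool (calculator, search, database)",
    Symptom.STYLE_DRIFT: "better examples, or occasionally fine-tuning",
}
LAST_RESORT = "a more capable model"


def recommend(symptom: Symptom, ruled_out: frozenset = frozenset()) -> str:
    """Suggest the lever for a symptom; fall back to a bigger model only
    when every context-side lever has already been ruled out."""
    if ruled_out >= set(LEVERS):
        return LAST_RESORT
    return LEVERS[symptom]


print(recommend(Symptom.MISSING_FACTS))
print(recommend(Symptom.NEEDS_TOOL, ruled_out=frozenset(Symptom)))
```

The first call points at retrieval; the second reaches the model upgrade only because all four other levers were passed in as already tried.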
So what
This reframes where to spend your effort and your money. Chasing the largest model is the expensive, low-yield move; investing in context — the retrieval that finds the right material, the instructions that frame the task, the discipline that keeps it all current — is where the real gains are. Done seriously, this becomes its own discipline: context engineering, the practice of owning and curating what the model sees. But the foundational idea is simple enough to carry into any AI conversation: before you ask for a bigger model, ask what you put in front of the one you have.
Frequently asked
- Does a bigger AI model always give better answers?
- No. A model only knows what it absorbed in training plus what you put in its context window. If the relevant facts aren’t in front of it, a bigger model just invents something plausible. Often a smaller model with the right context beats a larger one fed poorly — the context is usually the bigger lever.
- What’s the difference between a prompt and context?
- The prompt is what you ask. The context is everything the model can see while answering — your instructions, examples, retrieved facts and available tools. Most people tune the prompt and ignore the context, but the context is the larger lever on answer quality.
- Is a bigger context window better?
- Not by itself. Models attend unevenly to long inputs (the “lost in the middle” effect), so padding the window with loosely related material can make answers worse. The goal is the right context (current, relevant, trustworthy), not more context.
- My AI gives wrong answers — should I switch to a better model?
- Usually not first. Most wrong answers are context problems: clarify the prompt and add examples; use retrieval (RAG) to supply your facts; give it a tool for maths, live data or actions. Reach for a more capable model only once you’ve ruled those out — it’s the last lever, not the first.