Field Guide · Part 2 · 8 min read

Not Everything Is an LLM: The Model Family

An LLM is one species in a much larger family tree. Most enterprise AI isn’t even generative. A plain map of the model landscape — and why the useful question is rarely “which LLM?” but “what kind of model do I actually need?”

Built from

  • Field experience
  • Independent research
  • Data-backed
  • Reviewed with field experience

Last substantively reviewed · 2026-06-27

When a single product — ChatGPT — becomes the public face of an entire field, a strange thing happens: people start using one word for everything. Every model becomes “an AI”, and every AI becomes “an LLM”. It is an understandable shorthand and a genuinely costly one, because it hides most of the landscape. An LLM is one species in a much larger family tree, and a great deal of the AI that quietly runs the modern economy is not a language model at all — and is not even generative. Here is the map.

An LLM is the innermost ring — one of many model types.

One mental model: input, output, and which job

The clearest way to hold the whole landscape is a set of nested rings and one question. The rings: artificial intelligence is the whole field; inside it sits machine learning (systems that learn from data); inside that, deep learning (the neural-network kind); and inside that, generative and foundation models, of which LLMs are the innermost, smallest ring. The question that then sorts almost everything has two parts. What goes in and what comes out: text, an image, audio, a vector, a number? And are you generating something new, or predicting or deciding something about what already exists? Most of the confusion in AI conversations dissolves once those two things are named.

The language family (and it is a family)

Even within language, “LLM” isn’t one thing. There are large language models (the broad, capable, expensive ones behind chat assistants); small language models (compact enough to run on a phone or a laptop, trading some breadth for privacy, speed and cost); and reasoning models, which are trained to spend extra computation “thinking” before they answer. That extra effort helps on genuinely hard, multi-step problems and is wasted, sometimes worse than wasted, on simple ones. The same trained model can also appear as a raw “base” version that merely continues text, or as an “instruct/chat” version fine-tuned to follow requests. ChatGPT is built on the latter: a base model taught to behave like an assistant, not a different creature.

Generative, beyond text

Generation isn’t limited to words. Vision-language models take an image (plus text) and describe or reason about it — usefully thought of as a language model with an eye bolted on. Image-generation models (the “diffusion” family behind the well-known art tools) start from noise and refine it into a picture — a completely different machine from next-word prediction. Speech splits in two: text-to-speech (synthesising a voice) and speech-to-text (transcribing one). These are the categories people now half-recognise — but notice they are already four different kinds of model, not four flavours of LLM.

The retrieval backbone: embeddings (a different kind of model)

Here is the category most people have never heard of, and one that much of enterprise generative AI quietly depends on. An embedding model turns a piece of text (or an image) into a vector: a list of numbers that act like coordinates for meaning, so that items with similar meanings sit close together. That is what makes semantic search and “chat with your documents” work: you find the relevant material by proximity in meaning, then hand it to an LLM. The key point is that an embedding model does not output text at all; it outputs coordinates. It does not write; it measures. (A companion model, a “reranker”, then re-orders the shortlist for relevance.) The LLM gets the credit, but the answer is mostly decided before it runs, by whether retrieval surfaced the right context.
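To make "proximity in meaning" concrete, here is a minimal sketch of the retrieval step in Python. The three-number vectors are hand-written stand-ins, not the output of any real embedding model, and the names `DOCS`, `cosine_similarity` and `retrieve` are illustrative assumptions. Cosine similarity is one common way to measure closeness; real systems may use other measures.

```python
import math

# Toy, hand-written 3-dimensional "embeddings" (hypothetical values).
# A real embedding model would produce these vectors, typically with
# hundreds of dimensions or more.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "holiday schedule": [0.1, 0.9, 0.1],
    "returns and exchanges": [0.8, 0.2, 0.1],
}

def cosine_similarity(a, b):
    """Closeness of two vectors by angle: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, docs, top_k=2):
    """Return the names of the top_k documents closest to the query."""
    scored = [(cosine_similarity(query_vector, vec), name)
              for name, vec in docs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical vector for a query such as "how do I get my money back?"
query = [0.85, 0.15, 0.05]
shortlist = retrieve(query, DOCS)
print(shortlist)
```

Nothing in this step produces text. The shortlist is simply the nearest neighbours of the query vector, and it is this shortlist that would then be handed to a language model.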

The forgotten majority: classical, predictive ML

Then there is the largest category of all, which predates the current boom by decades and still does the most work: classical predictive machine learning. It is not generative — it decides. A credit-risk score, a churn prediction, a demand forecast, a fraud flag, a product recommendation, a defect spotted on a production line: these are regressions, classifiers, gradient-boosted trees, recommendation and forecasting models. For ordinary tabular data — the rows and columns most businesses actually run on — these models remain the reliable default, and they are cheaper, faster, and far easier to audit than an LLM. Reaching for a language model here is like using a film studio to take a passport photo.
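One reason such models are easy to audit can be shown in a few lines. The sketch below is a hand-rolled logistic score in Python with made-up weights. In practice the weights would be learned from historical data, and the feature names (`months_inactive`, `support_tickets`, `tenure_years`) are purely hypothetical.

```python
import math

# Hypothetical weights for a churn-risk score. In a real system these
# would be learned from data, for example by a logistic regression.
WEIGHTS = {"months_inactive": 0.8, "support_tickets": 0.4, "tenure_years": -0.5}
BIAS = -1.0

def contributions(customer):
    """Per-feature contribution to the raw score (weight * value)."""
    return {name: WEIGHTS[name] * customer[name] for name in WEIGHTS}

def churn_probability(customer):
    """Logistic function applied to the bias plus the summed contributions."""
    raw = BIAS + sum(contributions(customer).values())
    return 1.0 / (1.0 + math.exp(-raw))

# An illustrative customer record (invented values).
customer = {"months_inactive": 3, "support_tickets": 2, "tenure_years": 4}
parts = contributions(customer)
probability = churn_probability(customer)

print(parts)                  # which feature pushed the score up or down
print(round(probability, 3))
```

The same input always yields the same score, and `parts` shows exactly how much each feature contributed. That is the kind of traceability that is hard to get from a language model.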

If you need to… | The right kind of model | Not an LLM because…
Draft, summarise, extract from text | Large / small language model | (this is the LLM’s job)
Search by meaning / “chat with docs” | Embedding model (+ reranker) | It outputs vectors, not text
Describe or read an image | Vision-language model | It takes pixels in, not just text
Create an image | Diffusion / image model | It denoises pixels, not predicts words
Transcribe or synthesise speech | Speech-to-text / text-to-speech | Audio in or out, not text generation
Score risk, forecast, recommend, detect fraud | Classical / predictive ML | It predicts a number or class; cheaper, auditable, deterministic
Start from the input, the output, and the job. The model type follows — and it is often not a language model.
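The same sorting can be written as a small lookup. This is only an illustrative Python sketch: the labels for inputs and outputs ("text", "image", "audio", "vector", "number", "class", "tabular") and the function name `model_kind` are assumptions made for the example, and real problems will not always fall so neatly into one row.

```python
def model_kind(input_type, output_type, job):
    """Map (what goes in, what comes out, generate-or-predict) to a model kind.

    The rules mirror the rows of the table, checked from most to least specific.
    """
    if job == "predict" and output_type in ("number", "class"):
        return "classical / predictive ML"
    if output_type == "vector":
        return "embedding model (+ reranker)"
    if "audio" in (input_type, output_type):
        return "speech-to-text / text-to-speech"
    if output_type == "image":
        return "diffusion / image model"
    if input_type == "image":
        return "vision-language model"
    if input_type == "text" and output_type == "text":
        return "large / small language model"
    return "unclear: restate the input, the output and the job"

print(model_kind("tabular", "number", "predict"))
print(model_kind("text", "vector", "predict"))
print(model_kind("text", "text", "generate"))
```

Only the last of the three calls lands on a language model, and only after every other row has been ruled out.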
Most deployed enterprise AI is predictive, not generative.
The first question is almost never “which model?” It’s “what kind of model?” Input, output, generate-or-predict. Get that right and half the architecture decides itself; get it wrong and no amount of prompt-tuning will save you, because you picked the wrong species for the job.
Sanjeev Purohit, from our delivery work

So what

Two things follow from the map. First, most serious AI systems are not one model but several, each doing the job it is suited to — an embedding model to find context, a language model to write, a classifier to route, perhaps a forecast underneath. A “foundation-model strategy” is not “pick an LLM.” Second, and more useful for a leader: when someone proposes an AI project, the sharpest early question is not which provider or which LLM, but what kind of model the problem actually calls for. Often the honest answer is “not a language model at all” — and recognising that is what separates AI that works from AI that merely demos.
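As a rough picture of "several models, each doing its own job", here is a Python sketch of how the pieces might be wired together. Every function is a deliberately crude stand-in, and all the names are hypothetical: keyword matching in place of a routing classifier, word overlap in place of an embedding model, and a string template in place of a language model.

```python
def route(question):
    """Stand-in for a classifier: decide which path a request takes."""
    forecast_words = ("forecast", "predict", "next quarter")
    if any(word in question.lower() for word in forecast_words):
        return "forecast"
    return "documents"

def retrieve(question, documents):
    """Stand-in for embedding retrieval: pick the document sharing most words."""
    words = set(question.lower().split())
    return max(documents, key=lambda doc: len(words & set(doc.lower().split())))

def write_answer(question, context):
    """Stand-in for a language model: a real system would call an LLM here."""
    return f"Answer to '{question}' using context: {context}"

def handle(question, documents):
    """Route first; only document questions reach retrieval and generation."""
    if route(question) == "forecast":
        return "Routed to the forecasting model"
    return write_answer(question, retrieve(question, documents))

documents = [
    "The refund window is 30 days.",
    "Offices close on public holidays.",
]
print(handle("What is the refund window?", documents))
print(handle("Forecast demand for next quarter", documents))
```

The stand-ins matter less than the shape: the language model is one component among several, and some requests never reach it at all.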

Frequently asked

Is all AI an LLM?
No. An LLM is one type of model — the innermost ring of AI ⊃ machine learning ⊃ deep learning ⊃ generative/foundation models ⊃ LLMs. Whole categories aren’t language models: embedding models, image and speech models, and the classical predictive ML (scoring, forecasting, recommendation, fraud detection) that still runs most enterprises and isn’t even generative.
What are the main types of AI models?
Grouped by what they do: language models (large, small, and reasoning variants); generative-beyond-text (vision-language, image/diffusion, speech); embedding models that turn meaning into vectors for search and retrieval; and classical/predictive ML (regression, classification, gradient-boosted trees, recommendation, forecasting, anomaly detection, computer vision). The useful way to choose is by input, output, and whether you’re generating or predicting.
How is an embedding model different from an LLM?
An embedding model outputs a vector (coordinates for meaning), not text. It is used to find relevant content by similarity, which makes it the backbone of semantic search and retrieval, and it is usually small, fast and cheap. An LLM generates text. Most “chat with your documents” systems use both: an embedding model to retrieve, an LLM to answer.
When should I NOT use an LLM?
When the job is a prediction or decision over structured data — risk scoring, forecasting, recommendation, fraud detection, classifying records. Classical predictive models are cheaper, faster, deterministic and auditable for these, and usually more accurate. Reach for an LLM when the job is genuinely about generating or understanding language or other unstructured content.

About the author

Priyanka Pandey

Founder & Editorial Lead

Priyanka Pandey founded Ivaaya and leads its editorial voice, translating real delivery experience into practical thinking on AI-native engineering, decision-making and technology leadership. Her work focuses on helping senior leaders make sense of the changes reshaping software delivery without adding to the noise.

Reviewed and challenged by

Sanjeev Purohit

Principal, Decision Architecture

Sanjeev works across enterprise architecture, product strategy and AI-native delivery. The ideas in this article have been challenged against real programmes, production systems and organisational decision-making before publication.
