The architecture diagram had eleven agents on it — a planner, a researcher, a critic, a coder, a tester, and more, each with a confident label and an arrow. It looked like an org chart for software. In production it behaved like one too: messages lost between roles, a small error in the planner amplifying through everything downstream, and a debugging session that meant reading eleven transcripts to find where a single handoff had gone wrong. The same task, six months later, ran on one well-equipped agent with good tools. It was faster, cheaper, and you could actually tell what it had done.
The default should be one
Start with a single agent and add more only when you can name what coordination buys you. More agents do not add intelligence; they add communication, and communication is where multi-agent systems fail. A capable agent with a good set of tools handles most tasks, and has a decisive operational advantage: one reasoning trace to read, one identity to govern, one place for things to go wrong. The burden of proof is on multi-agent, not on the simple case.
This runs against the fashion. Multi-agent “swarms” and elaborate role hierarchies are the visible trend of 2026, and analysts report a surge of enterprise interest. But interest is not justification, and the elaborate diagram is often a cost dressed as sophistication.
What multi-agent actually buys — and what it costs
Coordination is worth paying for in two situations, and not many others.
- Genuine parallelism: independent subtasks that can run at the same time — searching ten sources at once, evaluating several options concurrently — where wall-clock time matters and the subtasks do not depend on each other.
- Genuine specialisation: distinct skills or contexts that do not fit one agent — a separate context window for a large isolated subtask, or tool sets and permissions so different that combining them would over-privilege a single actor.
Anthropic’s own work on a multi-agent research system is candid about the trade-off: parallel sub-agents helped on broad, breadth-first search, but they cost several times the tokens of a single-agent chat and introduced coordination complexity that needed real engineering to control. The lesson is not “multi-agent is good” or “bad”; it is that coordination is a cost you take on deliberately, for parallelism or isolation you actually need.
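To make the parallelism case concrete, here is a minimal sketch of fanning independent subtasks out concurrently and comparing wall-clock time with running them one at a time. Everything in it is an assumption for illustration: `search_source` is a hypothetical stand-in for a sub-agent or tool call, and the sleep simulates its latency.

```python
import asyncio
import time


async def search_source(source: str, delay_s: float = 0.05) -> dict:
    """Hypothetical stand-in for one sub-agent or tool call; the sleep simulates latency."""
    await asyncio.sleep(delay_s)
    return {"source": source, "summary": f"findings from {source}"}


async def fan_out(sources: list[str]) -> list[dict]:
    # The subtasks share no state and do not depend on each other,
    # so they can all be in flight at once. gather preserves input order.
    return await asyncio.gather(*(search_source(s) for s in sources))


async def one_by_one(sources: list[str]) -> list[dict]:
    # Same work, executed sequentially, for comparison.
    return [await search_source(s) for s in sources]


def timed(runner, sources: list[str]) -> tuple[list[dict], float]:
    """Run an async runner to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(runner(sources))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    sources = [f"source-{i}" for i in range(10)]
    _, parallel_s = timed(fan_out, sources)
    _, sequential_s = timed(one_by_one, sources)
    print(f"parallel: {parallel_s:.2f}s, sequential: {sequential_s:.2f}s")
```

Note what the sketch does not show: fanning out shortens elapsed time, but the total work (and, with real models, the total token spend) is at least as large as before. That is the trade described above.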
| Use one agent when… | Consider many when… |
|---|---|
| The work is sequential and tightly coupled | Subtasks are independent and can run in parallel |
| One context and tool set covers the task | Skills, contexts or permissions are genuinely distinct |
| You need to debug and govern it easily | The wall-clock time saved by running in parallel outweighs the coordination cost |
| Cost and predictability matter most | Isolating a large subtask protects the main context |
Why multi-agent fails: it is the contracts, not the agents
When multi-agent systems go wrong, the cause is rarely that an individual agent reasoned badly. It is the seams between them — the same lesson we have learned everywhere else in agentic delivery, that the integration boundary is where things break.
- Ambiguous handoffs: one agent’s output is another’s input, and if that contract is loose, meaning is lost in translation between roles.
- Compounding errors: a small mistake early in a chain is amplified by every agent downstream that trusts it.
- Lost context: each handoff is a chance to drop the why, leaving later agents to act confidently on a partial picture.
- Opaque debugging: more agents means more transcripts to reconstruct, and no single trace of what happened.
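A rough way to see why errors compound is to treat each handoff as having some chance of passing the work on intact. The model below is an illustrative assumption, not a measurement: it supposes a strictly linear chain, independent handoffs, and a made-up per-handoff reliability.

```python
def chain_success_probability(per_handoff_reliability: float, handoffs: int) -> float:
    """Probability that every handoff in a linear chain passes work on intact.

    Assumes handoffs fail independently, which is a simplification.
    """
    if not 0.0 <= per_handoff_reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if handoffs < 0:
        raise ValueError("handoffs must be non-negative")
    return per_handoff_reliability ** handoffs


if __name__ == "__main__":
    # Eleven agents in a straight line means ten handoffs.
    # 0.95 is an assumed figure chosen only to show the shape of the curve.
    for n in (0, 1, 5, 10):
        print(n, round(chain_success_probability(0.95, n), 3))
```

Under those assumptions, a chain of ten handoffs that are each 95% reliable comes out at roughly 60% end to end. Real systems will not follow this model exactly, but the direction holds: reliability at each seam multiplies.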
So if you do go multi-agent, the engineering effort goes into the coordination layer: explicit, versioned contracts between agents (the event-contract discipline applies directly), shared and governed context, and observability across the whole system rather than per agent. Build the seams first.
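As one possible shape for such a contract, the sketch below validates a handoff message before the receiving agent acts on it. The field names, the version string and the `ContractError` type are all hypothetical; the point is only that the version, the rationale (the "why") and a shared trace identifier are required, not optional.

```python
from dataclasses import dataclass, field

CONTRACT_VERSION = "1.0"  # hypothetical version of this handoff schema

REQUIRED_FIELDS = ("schema_version", "trace_id", "sender", "receiver", "task", "rationale")


class ContractError(ValueError):
    """Raised when a handoff does not satisfy the agreed contract."""


@dataclass(frozen=True)
class Handoff:
    schema_version: str
    trace_id: str   # one id shared by every handoff in a run, for system-wide tracing
    sender: str
    receiver: str
    task: str       # what the receiver is being asked to do
    rationale: str  # why, so context is not dropped at the seam
    payload: dict = field(default_factory=dict)


def validate_handoff(message: dict) -> Handoff:
    """Reject loose or outdated messages instead of letting a later agent guess."""
    missing = [key for key in REQUIRED_FIELDS if not message.get(key)]
    if missing:
        raise ContractError(f"handoff missing required fields: {missing}")
    if message["schema_version"] != CONTRACT_VERSION:
        raise ContractError(
            f"unsupported schema_version {message['schema_version']!r}; "
            f"expected {CONTRACT_VERSION!r}"
        )
    unknown = set(message) - set(REQUIRED_FIELDS) - {"payload"}
    if unknown:
        raise ContractError(f"handoff has unknown fields: {sorted(unknown)}")
    if not isinstance(message.get("payload", {}), dict):
        raise ContractError("payload must be a dict")
    return Handoff(**message)
```

Failing loudly at the boundary is the design choice here: a rejected handoff shows up in one place, whereas a silently accepted partial one shows up several agents later.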
Reach for a second agent when you can name the parallelism or the isolation it buys you — not because the diagram looks more capable with more boxes on it. Every agent you add is another contract to get right, and the contracts are where these systems actually fail.
How to decide
Make multi-agent earn its place. Start from one and add agents only against a specific, named benefit.
- Can a single agent with the right tools do this? If yes, stop there.
- Are there independent subtasks that genuinely run in parallel, where wall-clock time matters? That is a reason for many.
- Are there skills, contexts or permissions so distinct that one agent should not hold them all? That is a reason for many.
- If you add agents, can you specify and version the contracts between them, and observe the whole system? If not, you are not ready for multi-agent.
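The four questions above can be written down as a small decision function. This is only a sketch of the checklist, with hypothetical field names and return values; the real judgment sits in answering each question honestly, not in the code.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    # Each field mirrors one of the checklist questions; names are illustrative.
    single_agent_sufficient: bool
    independent_parallel_subtasks: bool
    wall_clock_matters: bool
    distinct_skills_or_permissions: bool
    contracts_versioned: bool
    system_observable: bool


def recommend_architecture(profile: TaskProfile) -> str:
    """Walk the checklist in order and return a recommendation string."""
    # 1. If one agent with the right tools can do it, stop there.
    if profile.single_agent_sufficient:
        return "single agent"

    # 2 and 3. Multi-agent needs a named benefit: parallelism or isolation.
    parallelism = profile.independent_parallel_subtasks and profile.wall_clock_matters
    isolation = profile.distinct_skills_or_permissions
    if not (parallelism or isolation):
        return "single agent (no named benefit; improve the tools instead)"

    # 4. Without versioned contracts and whole-system observability, do not proceed.
    if not (profile.contracts_versioned and profile.system_observable):
        return "not ready for multi-agent"

    return "multi-agent"
```

The order of the checks matters: the single-agent question comes first, and readiness is only considered once there is a named reason to add agents at all.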
The number of agents is an architecture decision, not a capability score. The best agentic systems tend to have the fewest moving parts that solve the problem, because every part you add is one more thing to coordinate, govern and debug. When in doubt, one good agent.
Frequently asked
- Is multi-agent better than a single agent?
  - Usually not. More agents add coordination, not intelligence, and coordination is where multi-agent systems fail. A single capable agent with good tools handles most tasks and is far easier to debug, govern and cost. Multi-agent earns its keep only for genuinely parallel or genuinely specialised work.
- When should you use a multi-agent architecture?
  - In two cases: genuine parallelism (independent subtasks that can run at once, where wall-clock time matters) and genuine specialisation (skills, contexts or permissions so distinct they should not sit in one agent). For sequential, tightly coupled work, a single agent is usually faster, cheaper and more reliable.
- Why do multi-agent systems fail?
  - Because of the contracts between agents, not the agents themselves. The common failures are ambiguous handoffs, errors that compound down a chain, lost context at each handoff, and opaque debugging across many transcripts. If you go multi-agent, the engineering effort belongs in the coordination layer: explicit versioned contracts, shared governed context, and system-wide observability.
- Does multi-agent cost more?
  - Yes. Anthropic’s own multi-agent research work reported multi-agent setups consuming several times the tokens of a single-agent chat, and the engineering cost of coordination comes on top of that. Pay that premium only when parallelism or isolation genuinely justifies it.