One Agent or Many — and How Much Leash

The moment a team gets one agent working, two temptations appear. The first is to add more agents (a planner, a researcher, a critic, a whole org chart of them) on the theory that more must be smarter. The second is to take the human out of the loop and let the agent run unsupervised. Both are dials you can turn, and both have a sensible default that is more modest than the hype suggests. Getting these two settings right is most of what separates an agent that helps from one that quietly causes trouble.

Two dials: how many agents, and how much leash.

Default to one

Start with a single agent and add more only when you can name what the extra ones buy you. More agents do not add intelligence; they add communication — and communication between agents is where these systems break. A single capable agent with a good set of tools handles most tasks, and it has a decisive practical advantage: one line of reasoning to follow, one identity to govern, one place for things to go wrong. When you fan a task out across many agents, you also have to coordinate, secure and debug all of them. The burden of proof sits with “many,” not with “one.”

When many earns its keep

There are two situations where coordination is genuinely worth the cost. One is real parallelism: independent sub-tasks that can run at the same time, like searching ten sources at once, where wall-clock time matters and the pieces don’t depend on each other. The other is real specialisation: skills, contexts or permissions so different that one agent shouldn’t hold them all. Outside those, multi-agent usually adds failure modes without adding value, and the price is steep. A multi-agent setup can burn on the order of fifteen times the tokens of a single chat. Small errors compound as they pass between agents. And the whole thing is harder to debug, because no single trace tells you what happened. (The engineering detail of when coordination helps lives in our piece on one agent or many; the rule of thumb here is simply to reach for many only when you can name the parallelism or the isolation it buys.)
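To make the parallelism case concrete, here is a minimal sketch of fanning out ten independent searches at once. It is an illustration under stated assumptions, not a recipe: the source names and the search function are hypothetical stand-ins, and a short sleep simulates the wait on a real tool or sub-agent.

```python
import asyncio
import time

# Hypothetical source names; a real system would call search tools or sub-agents.
SOURCES = [f"source-{i}" for i in range(1, 11)]


async def search(source: str, query: str) -> str:
    """Stand-in for one independent sub-task (the sleep simulates I/O wait)."""
    await asyncio.sleep(0.05)
    return f"{source}: results for {query!r}"


async def fan_out(query: str) -> list[str]:
    """Run every search concurrently; no search needs another's output."""
    return await asyncio.gather(*(search(s, query) for s in SOURCES))


async def one_by_one(query: str) -> list[str]:
    """The same work done sequentially, for comparison."""
    return [await search(s, query) for s in SOURCES]


if __name__ == "__main__":
    for runner in (one_by_one, fan_out):
        start = time.perf_counter()
        results = asyncio.run(runner("agent costs"))
        elapsed = time.perf_counter() - start
        print(f"{runner.__name__}: {len(results)} results in {elapsed:.2f}s")
```

The fan-out only pays off because the pieces are independent. If one search needed the output of another, the work would fall back to running in sequence and the coordination would buy nothing.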

How much leash: autonomy is a dial

The second dial is autonomy — how much the agent does without checking with you — and the key idea is that it is independent of how capable the agent is. A very capable agent can still be kept on a short leash, asking permission before each consequential step. A useful way to picture the range is the way we already think about self-driving cars, from fully hands-on to fully autonomous, with several stages in between:

Setting	What the agent does	The human’s role
Operator	Nothing without you — you drive	You make every decision
Collaborator	Suggests and drafts; you act	You decide and execute
Consultant	Proposes a plan and waits	You approve before it acts
Approver	Acts, but pauses on the big moves	You sign off on consequential / irreversible steps
Observer	Runs on its own	You monitor and can step in

Autonomy is a setting you choose per use, not a property of the model. Most valuable agents sit in the middle, not at “Observer”.
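One way to treat the dial as a setting rather than a model property is to write it down per use. The sketch below is an assumption about how that could look, not a standard: the five names come from the table above, while the use-case names and the agent_may_execute rule are hypothetical and simply encode one reading of each row.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """The five settings from the table, ordered from least to most autonomy."""
    OPERATOR = 1
    COLLABORATOR = 2
    CONSULTANT = 3
    APPROVER = 4
    OBSERVER = 5


def agent_may_execute(setting: Autonomy, consequential: bool,
                      approved: bool = False) -> bool:
    """Return True if the agent itself may carry out the step."""
    if setting <= Autonomy.COLLABORATOR:
        return False                          # the human decides and executes
    if setting is Autonomy.CONSULTANT:
        return approved                       # it waits for approval of its plan
    if setting is Autonomy.APPROVER:
        return approved or not consequential  # it pauses only on the big moves
    return True                               # Observer: runs on its own


# Hypothetical per-use settings: chosen per task, not per model.
SETTINGS = {
    "draft_reply": Autonomy.COLLABORATOR,
    "issue_refund": Autonomy.APPROVER,
    "tag_tickets": Autonomy.OBSERVER,
}

if __name__ == "__main__":
    for use, setting in SETTINGS.items():
        allowed = agent_may_execute(setting, consequential=True)
        print(f"{use}: {setting.name}, may act unapproved on a big move: {allowed}")
```

The same model can sit behind all three uses. Only the setting changes.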

The mechanism that keeps the dial where you want it is the human-in-the-loop checkpoint: the agent pauses before anything consequential or hard to reverse — spending money, changing a record, sending a message — and a person approves, edits, or rejects. The discipline is to put those checkpoints on exactly the actions you couldn’t comfortably undo, and to resist the pull to remove them just because the agent has been reliable lately.
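A checkpoint like this can be sketched in a few lines. What follows is a minimal illustration, assuming a hypothetical list of hard-to-reverse action names and a reviewer callback that returns the action to run (possibly edited) or None to reject it. A real system would put a person behind that callback.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical names for the actions you couldn't comfortably undo.
HARD_TO_REVERSE = {"spend_money", "change_record", "send_message"}


@dataclass
class Action:
    name: str
    payload: str


# A reviewer approves (returns the action), edits (returns a changed one),
# or rejects (returns None).
Reviewer = Callable[[Action], Optional[Action]]


def run_with_checkpoint(action: Action,
                        execute: Callable[[Action], str],
                        review: Reviewer) -> str:
    """Execute the action, pausing for a human first if it is hard to reverse."""
    if action.name in HARD_TO_REVERSE:
        decision = review(action)  # the agent stops here until a person answers
        if decision is None:
            return f"rejected: {action.name}"
        action = decision          # approved as-is, or the edited version
    return execute(action)


if __name__ == "__main__":
    def execute(a: Action) -> str:
        return f"done: {a.name} ({a.payload})"

    def soften(a: Action) -> Action:
        # Stand-in for a human editing the draft before approving it.
        return Action(a.name, a.payload.replace("URGENT", "When you can"))

    print(run_with_checkpoint(Action("search_docs", "refund policy"), execute, soften))
    print(run_with_checkpoint(Action("send_message", "URGENT: pay now"), execute, soften))
```

Note that the gate keys on the action, not on the agent's track record. However reliable the agent has been, a hard-to-reverse action still stops at the reviewer.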

Autonomy is a setting you choose — Operator to Observer.

Where chatbot, copilot and agent sit

This dial also explains the familiar product words. A chatbot simply converses — it answers, it doesn’t act. A copilot or assistant helps a human who stays firmly in charge: every output passes through a person before it becomes an action — low autonomy by design. A true agent is goal-directed: it plans, uses tools, runs in a loop, and takes action on its own, pausing for a human only at the checkpoints you set. “Agentic” is therefore a spectrum, not a badge — and a lot of what is sold as an “agent” is really a copilot, which is often exactly what you want. (Telling the genuine article from the relabelled one is its own skill — see agent washing.)

The questions to ask aren’t “how many agents?” and “how autonomous?” — they’re “what’s the fewest agents that do this?” and “what’s the least it can do on its own and still be useful?” Start low on both dials and turn them up only against a reason you can name.
— Sanjeev Purohit, from our delivery work

So what

Both dials have the same governing principle, and it is the through-line of this whole field: use the least that does the job. The fewest agents, the least autonomy, the humans kept where the consequences are. That is not timidity; it is how you get a system you can debug, afford, and defend. The impressive-looking choice — a swarm of fully autonomous agents — is usually the expensive, fragile one. The boring choice is usually the right one.

Frequently asked

Is a multi-agent system better than a single agent?

Usually not. More agents add coordination, not intelligence, and coordination is where these systems fail. A single capable agent with good tools handles most tasks and is far easier to debug, govern and afford. Multi-agent only earns its keep for genuinely parallel or genuinely specialised work, and it can cost roughly fifteen times the tokens of a single chat.

What does “agent autonomy” mean?

How much an agent does without checking with a human: a dial from “Operator” (you decide everything) through to “Observer” (it runs on its own), with stages like Approver (it acts but pauses on consequential moves) in between. Autonomy is independent of capability: a very capable agent can still be kept on a short leash.

What’s the difference between a chatbot, a copilot and an agent?

A chatbot converses but doesn’t act. A copilot/assistant helps a human who stays in charge: every output passes through a person before it becomes an action. A true agent is goal-directed: it plans, uses tools and acts in a loop on its own, pausing for a human only at set checkpoints. “Agentic” is a spectrum, and much that’s sold as an “agent” is really a copilot.

How much autonomy should we give an AI agent?

As little as does the job. Set the dial deliberately and keep human-in-the-loop checkpoints on anything consequential or hard to reverse: spending, changing records, sending messages. Resist removing those checkpoints just because the agent has been reliable lately; the cost of a confident mistake on an irreversible action is what they exist to catch.