The debate about AI and work is stuck at the wrong altitude. “Will agents take our jobs?” is the question everyone asks, and it is framed in the wrong unit. Automation has never operated on jobs. It operates on tasks, and it always has. The mistake is not new; what is new is how fast the task layer is being emptied, and how badly we are reading what that does to everything above it.
You can feel the distinction before you can name it. You hand an agent a piece of work and it comes back done — the endpoint written, the tests drafted, the migration scripted. And yet the part that actually mattered is untouched and now sits more heavily on you: deciding whether this should exist at all, whether it is right under the constraints you are answerable for, whether it belongs in the system you own. The task is complete. The work is not. That gap is the whole subject.
Three layers, not one
It helps to separate three things we usually collapse into the word “work”. A task is a unit of work: specifiable, bounded, with a definable done-state. It can be handed off and verified — which is precisely why agents are good at it. A role is the stance you occupy while orchestrating tasks: decomposer, reviewer, arbiter, owner. Roles carry judgement and accountability, they are not fully specifiable, and a person holds several of them at once. A job — a position — is the organisational wrapper: a named, accountable bundle of roles that a company hires for and holds responsible.
The dynamic that matters is directional. Automation eats upward from the bottom. Tasks go first; as they go, the role mix inside a job shifts — less doing, more deciding, reviewing, orchestrating; and only then, lagging behind, do jobs themselves get redrawn. Almost all the confusion in the current discourse comes from watching the task layer empty and concluding the job is disappearing, when what is really happening is that the job is being re-composed around the roles the tasks left behind.
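One way to make the three layers concrete is a small data model. The sketch below is illustrative only: the class names, the `delegable` flag, the `review_ratio` figure and the example job are all assumptions invented for this example, not part of any published framework. It shows the direction of travel described above: delegating tasks shrinks the task hours in a job and grows the role hours, while the job itself remains.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    """A bounded unit of work with a definable done-state."""
    name: str
    hours: float
    delegable: bool  # hypothetical flag: an agent can do it and a human can verify it


@dataclass(frozen=True)
class Role:
    """A stance held while orchestrating tasks (reviewer, arbiter, owner)."""
    name: str
    hours: float  # time spent on judgement rather than execution


@dataclass
class Job:
    """The organisational wrapper: a named, accountable bundle of roles."""
    title: str
    tasks: list[Task] = field(default_factory=list)
    roles: list[Role] = field(default_factory=list)

    def task_hours(self) -> float:
        return sum(t.hours for t in self.tasks)

    def role_hours(self) -> float:
        return sum(r.hours for r in self.roles)


def delegate(job: Job, review_ratio: float = 0.25) -> Job:
    """Return the job as it looks after delegable tasks go to agents.

    Assumption for illustration: each delegated hour creates
    `review_ratio` hours of review work. The number is made up.
    """
    kept = [t for t in job.tasks if not t.delegable]
    delegated_hours = sum(t.hours for t in job.tasks if t.delegable)
    review = Role("reviewer of delegated work", delegated_hours * review_ratio)
    return Job(job.title, kept, job.roles + [review])


before = Job(
    "Backend engineer",
    tasks=[
        Task("write endpoint", 8.0, True),
        Task("draft tests", 4.0, True),
        Task("join incident call", 2.0, False),
    ],
    roles=[Role("owner", 6.0)],
)
after = delegate(before)

print(before.task_hours(), before.role_hours())  # 14.0 6.0
print(after.task_hours(), after.role_hours())    # 2.0 9.0
```

The job title is unchanged in `after`; only its composition has moved, which is the re-composition the paragraph above describes.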
The task layer is where automation actually bites
This is not a metaphor; it is the settled finding of two decades of labour economics. The task-based model (Autor, Levy and Murnane, then Acemoglu and Restrepo) shows that technology replaces labour in specific tasks, shifting the task content of production against workers, rather than eliminating whole occupations directly. Acemoglu and Restrepo estimate that between 50 and 70 per cent of the change in the US wage structure over the past four decades traces to task displacement, not job elimination. The framework is also symmetric: new tasks in which humans hold the advantage reinstate labour. Work recomposes; it does not simply vanish.
Applied to generative AI, the lens holds. The canonical study (Eloundou and colleagues, “GPTs are GPTs”) decomposes roughly a thousand occupations into some nineteen thousand tasks and scores each against model capability: around 80 per cent of US workers have at least 10 per cent of their tasks exposed to LLMs, and the share of tasks materially affected rises to roughly half once you add the tooling and agents built on top of the models. Read the unit carefully: the technology acts on tasks, and its reach is amplified by the systems wrapped around it. (Stated honestly, this measures exposure and potential speed-up, not automation already realised; it is a map of where the water is rising, not a tide line.)
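The two headline figures use different denominators, which is easy to miss: one counts workers, the other counts tasks. A toy calculation makes the difference visible. The occupations, head-counts and exposure flags below are invented for illustration and are not data from the study; only the two definitions (share of workers with at least 10 per cent of tasks exposed, and share of all tasks exposed) follow the text. Tasks are counted unweighted here, which is a simplifying assumption.

```python
# Toy data (invented): occupation -> (number of workers, per-task exposure flags).
occupations = {
    "analyst": (40, [True, True, False, False, False]),
    "developer": (40, [True] + [False] * 9),
    "plumber": (20, [False] * 10),
}

THRESHOLD = 0.10  # "at least 10 per cent of tasks exposed"


def exposed_share(flags):
    """Fraction of an occupation's tasks flagged as exposed."""
    return sum(flags) / len(flags)


total_workers = sum(n for n, _ in occupations.values())
workers_over = sum(
    n for n, flags in occupations.values() if exposed_share(flags) >= THRESHOLD
)
all_flags = [f for _, flags in occupations.values() for f in flags]

worker_share = workers_over / total_workers  # unit: workers
task_share = exposed_share(all_flags)        # unit: tasks

print(worker_share)  # 0.8
print(task_share)    # 0.12
```

In this toy world 80 per cent of workers clear the threshold while only 12 per cent of tasks are exposed. A high worker-level figure is compatible with a modest task-level one, which is why the unit matters.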
The work did not vanish. It moved up.
When tasks delegate, the work does not evaporate — it reconstitutes one layer up, as roles. The clearest evidence is in delivery data: the 2025 DORA research finds that time saved generating code is reallocated to verifying it, a “verification tax”, with reviewer cognitive load rising sharply as authoring speeds up. It also finds that AI behaves as an amplifier — magnifying an organisation’s existing strengths and weaknesses — and that the returns come from the surrounding system of platforms and practices, not the tool itself. In other words: automate the task and you do not remove the work, you promote it. The human moves from doing the task to holding the roles around it.
The roles we keep
So what are those roles? They are not new inventions — they are what is left of engineering once the typing is delegated, and most of them already have names in our practice:
- Decomposer — turning an outcome into tasks an agent can actually execute. This is the act we call Intent Translation, and it is the most senior thing in the building.
- Context designer — assembling the knowledge, constraints and examples an agent needs to act well rather than plausibly. Context engineering.
- Evaluation designer — defining what “correct” means before the work runs, so acceptance is a check and not a vibe. Evals as the spec.
- Reviewer and accountable acceptor — deciding what may enter the system, and owning that it did. The human acceptance loop.
- Arbiter — making the trade-offs no specification can pre-resolve. Decision architecture.
- Orchestrator — sequencing many streams of delegated work toward a single outcome. Delivery architecture.
- Accountable owner — holding the outcome rather than the output. The whole point of measuring governance against value.
Set side by side, the list is quietly significant: our framework is, in effect, already a taxonomy of the roles humans keep as tasks get delegated. The discipline is to recognise that you are now playing these roles rather than doing tasks — and to see the leverage in it. One person can hold the reviewer’s or orchestrator’s role across far more delegated work than they could ever have typed themselves. Fewer task-hours, more role-hours, each carrying more weight.
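The evaluation-designer and accountable-acceptor roles from the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `Check`, `Acceptance`, `accept`, the sample checks and the name `j.doe` are hypothetical and invented here, and a real eval suite would be far richer. What it tries to show is the ordering: “correct” is written down before the agent's output exists, and acceptance always records a named human.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Check:
    """One acceptance criterion, defined before the work runs."""
    name: str
    passes: Callable[[str], bool]


@dataclass(frozen=True)
class Acceptance:
    accepted: bool
    failed: tuple[str, ...]
    accountable: str  # the named human who owns the decision


def accept(output: str, checks: list[Check], accountable: str) -> Acceptance:
    """Run every check; accept only if all pass, and record who accepted."""
    if not accountable.strip():
        raise ValueError("acceptance needs a named accountable human")
    failed = tuple(c.name for c in checks if not c.passes(output))
    return Acceptance(accepted=not failed, failed=failed, accountable=accountable)


# The spec is written first (toy checks for a generated SQL migration).
spec = [
    Check("has rollback", lambda sql: "-- down:" in sql.lower()),
    Check("no table drop", lambda sql: "DROP TABLE" not in sql.upper()),
]

# Only then does the agent's output arrive.
agent_output = (
    "ALTER TABLE users ADD COLUMN plan TEXT; "
    "-- down: ALTER TABLE users DROP COLUMN plan;"
)
result = accept(agent_output, spec, accountable="j.doe")
print(result.accepted, result.failed)  # True ()
```

The checks cover only what could be specified in advance. The `accountable` field stands for everything that could not: whether the change should exist at all remains the acceptor's call.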
Vibe coding is a role confusion
This is what gives “vibe coding” a sharper definition than the usual hand-wringing. The moment an agent writes the code, your role silently changes — from author to reviewer-and-owner. Vibe coding is what happens when you keep acting like the author after that switch: skim, trust, ship, with the author’s easy confidence but none of the editor’s scrutiny. The danger is not that you delegated; it is that you took on the reviewer’s accountability while still doing the author’s once-over.
The academic literature gropes toward the same place under the label “responsibility gaps”: work that a machine authored and for which no human has actually claimed accountability. The remedy is not more code-review theatre. It is role clarity: knowing which role you are in, and doing that role’s real work. A reviewer who reviews like an author is the most expensive failure mode in agentic engineering, because it ships unowned decisions at machine speed.
The line keeps moving
None of this is static. The boundary between task and role is not fixed — it moves as agents improve. Writing unit tests once demanded judgement; much of it is now a delegable task. So the discipline is not drawing the line once and resting on it; it is redrawing it continually, and being honest about where it currently sits. A framework that names the human roles must therefore name them as a snapshot, not a fortress.
A caution against the comfortable version of this story. It is tempting to say judgement is the safe human preserve that AI merely assists — but the strongest economic reading runs the other way. Autor argues that AI can widen who gets to exercise expert judgement, pushing high-stakes decision-making down to more people, and he treats decisions themselves as tasks. So “judgement” is not a fixed human reserve either. The distinction that actually survives is narrower and harder: AI can help more people make a judgement; it cannot be the one answerable for it. Who may judge is widening. Who is accountable is not.
The accountable core
That gives the model its floor. Strip away every task an agent can execute and every role an agent can assist, and what remains is the accountable core: the irreducible answerability for the outcome. Delegation has a floor, and the floor is accountability. You can delegate the work; you cannot delegate being the one who is answerable for it — which is why “the AI did it” has never once been an acceptable answer to a customer, a regulator, or an incident review. This is where seniority now concentrates: not in producing the work, but in being the named human under it.
The rung we are sawing off
A real risk follows from all this, and it is the one I find most pressing as a leader. People earned the judgement that lets them hold the accountable core by doing the tasks — the very tasks now most delegable. Automate the bottom rung and you may quietly remove the ladder that produced the seniors in the first place. We could optimise the task layer so thoroughly that we stop manufacturing the people capable of owning anything above it.
I want to be honest about the evidence, because it is suggestive rather than settled. A large field study found early-career workers in the most AI-exposed occupations seeing notable relative declines in employment while experienced workers in the same fields held steady; a controlled trial of a coding assistant found that juniors gained the most raw output. But causation is genuinely contested — other economists attribute the junior dip to the sharpest interest-rate tightening in decades, not to AI — and crucially, no study has yet measured whether AI-assisted juniors build judgement faster or slower. So treat it as a risk to manage deliberately, not a law to cite. The mechanism is plausible and the stakes are high; that is enough reason to act, and not enough to overclaim.
What this changes for how you build and lead
For engineers, the core skill is shifting from task execution to role fluency: the ability to move cleanly between decomposer, context designer, reviewer, arbiter and owner — and, above all, to know which role you are in at any given moment. Most bad agentic work is a role error: doing the author’s job when you are the reviewer, or reaching for the reviewer’s checklist on something you should have decomposed differently in the first place. Naming the role you are in is half the discipline.
For leaders, the instruction is to stop measuring and staffing as if the task layer were still the work. Redesign jobs around the roles people now actually hold. Measure the accountable core — outcomes owned — rather than output shipped; it is the same argument as counting cognitive load removed rather than features delivered. And engineer, on purpose, the apprenticeship that the old task-ladder used to provide for free, because it will not provide it any longer.
The through-line is simple to state and hard to live: the unit of human value is moving from task to role. Agents taking the tasks is not the threat — that is just the task layer doing what task layers have always done under automation. The threat is letting task-abundance hollow out the roles until no one is genuinely answerable for the outcome. The work that remains is smaller in volume and larger in consequence. Build, and lead, for that.