AI Economics · 7 min read · Updated 2026-06-18

The AI Delivery P&L: Why Generation Got Cheap and Your Cost Base Did Not

Boards modelling AI savings on developer headcount are budgeting the cheapest part of the problem. When generation costs nothing, the bill moves downstream — to acceptance, integration, review and ownership — and the evidence says it gets bigger.

By Priyanka Pandey · Founder & Editorial Lead

Reviewed and challenged by Sanjeev Purohit · Principal, Decision Architecture

Built from

Independent research
Data-backed
Original framework
Reviewed with field experience

Last substantively reviewed · 2026-06-18

In brief

The savings are real but local; the slowdown nobody budgeted for is downstream.
The Acceptance Gap now has a P&L line.
Governance turns acceptance into a predictable fixed cost.

Best for

Boards and CTOs budgeting AI’s delivery economics

Generation got cheap. That much is settled. A senior engineer can now conjure a plausible service, a migration script or a week of boilerplate in an afternoon, and the marginal cost of the next thousand lines is trending towards zero. The board hears this and reaches for the obvious lever: developer headcount is the largest line in the engineering budget, generation is the bulk of what developers do, therefore AI savings should track headcount. That syllogism is wrong, and it is wrong in a way that costs real money.

The error is treating software cost as if it lives where the typing happens. It does not. Once code is near-free to produce, the cost does not vanish — it migrates downstream, to the work that turns a generated artefact into something an organisation can actually accept, integrate, review and own. That migration is the central fact of AI delivery economics, and almost nobody is modelling it on the P&L.

The savings are real — and they are local

Start with where the savings genuinely land. McKinsey's State of AI 2025 (analyst) reports that software engineering and IT functions cite the most concrete AI-driven cost reductions — tied specifically to code generation, automated testing and incident resolution. So the generation dividend is real, and it shows up exactly where you'd expect: at the keyboard. The problem is that the same study finds only about 39% of organisations report any enterprise-level EBIT impact from AI at all, and most of those attribute less than 5% of EBIT to it. Roughly 6% qualify as 'AI high performers'. Adoption, in other words, is racing ahead of anything visible on the bottom line.

The MIT NANDA report The GenAI Divide: State of AI in Business 2025 (analyst; primary source) puts the same gap in sharper relief: despite an estimated $30-40bn of enterprise generative-AI spend, roughly 95% of organisations see no measurable P&L return, and only about 5% of integrated pilots extract significant value. Crucially, NANDA names the binding constraint. It is not talent, infrastructure or model quality. It is the lack of learning, integration and contextual adaptation — which is to say, the downstream work, not the generation.

The slowdown nobody budgeted for

If generation is faster, why isn't delivery? Two findings should unsettle anyone forecasting linear savings. METR's randomised controlled trial (academic) took 16 experienced open-source developers across 246 real repository issues and found that allowing early-2025 AI tools made them 19% slower — a measured slowdown, not a speedup. The preprint is on arXiv as 2507.09089 for anyone who wants the methodology. The number that matters most for our purposes is the perception gap: developers predicted AI would speed them up by 24% beforehand, and even after living through a 19% slowdown still believed it had sped them up by 20%. That is not a rounding error. It is a systematic over-estimation of velocity by the very people the savings model trusts to report it.
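To make the perception gap concrete, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it reads the METR percentages quoted above as changes in task completion time, and the 100-hour baseline and the function names are hypothetical, not taken from the study.

```python
# METR figures as quoted above, expressed as the fractional change in
# task completion time (negative = faster, positive = slower).
FORECAST_CHANGE = -0.24   # what developers predicted beforehand
PERCEIVED_CHANGE = -0.20  # what developers believed afterwards
MEASURED_CHANGE = 0.19    # what the trial measured


def hours_with_ai(baseline_hours: float, change: float) -> float:
    """Scale a baseline estimate by a fractional change in completion time."""
    return round(baseline_hours * (1.0 + change), 1)


def perception_gap_points(perceived: float, measured: float) -> float:
    """Distance between believed and measured change, in percentage points."""
    return round((measured - perceived) * 100, 1)


if __name__ == "__main__":
    baseline = 100.0  # hypothetical: hours of work priced without AI
    print("hours if self-report is trusted:", hours_with_ai(baseline, PERCEIVED_CHANGE))
    print("hours implied by measurement:   ", hours_with_ai(baseline, MEASURED_CHANGE))
    print("perception gap (points):        ", perception_gap_points(PERCEIVED_CHANGE, MEASURED_CHANGE))
```

A savings model fed the self-reported figure would budget fewer hours than the baseline, while the measured figure implies more. The distance between the two is the forecasting error a P&L inherits when it relies on reported velocity.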

Google's DORA 2024 Accelerate State of DevOps report (analyst) corroborates the team-level picture: a 25% increase in AI adoption was associated with an estimated 1.5% decrease in delivery throughput and a 7.2% decrease in delivery stability — and this against an adoption base where roughly 75.9% of respondents already rely on AI for at least part of their work. Individual productivity, felt at the keyboard, simply did not translate into team-level delivery gains. The friction reappeared somewhere else.

Where? GitClear's AI Copilot Code Quality research (press; vendor-published, so read it as directional) analysed over 211 million lines and found that in 2024, copy-pasted or cloned code overtook refactored 'moved' code for the first time, with code-clone frequency rising roughly eightfold. That is the downstream cost made visible. Cheap generation produces more code, more duplication and more divergence — and every clone is a future review, a future bug surface, a future maintenance liability that lands on a team months after the generation 'saving' was booked.

The bill for cheap generation does not disappear. It is rebilled downstream — to review, integration and operational ownership — and it arrives after the savings have been celebrated.

The Acceptance Gap, now with a P&L

This is the Acceptance Gap rendered in money. The distance between code that exists and code an organisation will accept into production — verified, integrated, owned — is where the cost reaccumulates. Gartner (analyst) predicts at least 30% of generative-AI projects will be abandoned after proof of concept by end-2025, citing poor data quality, inadequate risk controls, escalating costs and unclear business value, and notes plainly that GenAI costs 'aren't as predictable as other technologies'. Unpredictable downstream cost is precisely what a headcount-based savings model fails to capture.

Governance turns acceptance into a fixed cost

And the downstream bill is increasingly non-optional, because regulators and standards bodies are codifying it. NIST released the Generative AI Profile (NIST AI 600-1) on 26 July 2024 (governance) as a cross-sectoral companion to the AI RMF 1.0, defining 12 GenAI-specific risk categories and over 200 suggested actions across the framework's govern, map, measure and manage functions. The profile is voluntary, but it formalises operational ownership as an explicit expectation. The EU AI Act (governance) entered into force on 1 August 2024, with high-risk obligations (risk management, data governance, technical documentation, human oversight and post-market monitoring) plus deployer duties applying from 2 August 2026. And the UK Government's AI Playbook (GDS, 10 February 2025; governance) sets ten principles including 'meaningful human control at key decision points' and a mandated AI Systems Inventory.

Read those three together and the conclusion is structural: review, human oversight and operational accountability are no longer discretionary engineering hygiene. They are becoming legal and regulatory fixtures. You cannot generate your way past them, and you cannot price them at zero.

What to actually budget

Stage	AI effect on cost	Budget implication
Generation / writing code	Falls sharply	The cheapest part — and the part being budgeted
Acceptance & review	Rises — more change to verify	The new bottleneck
Integration & ownership	Rises — more surface to operate	Often the biggest line

Budgeting AI on developer headcount budgets the cheapest part — the bill moves downstream, and grows.

The correct unit of account is not lines generated or developers replaced. It is cost-to-acceptance per change: the fully-loaded expense of taking a generated artefact through verification, integration, review and into ownership under your governance obligations. Model that, and the AI business case stops looking like a headcount cut and starts looking like a reallocation — fewer hours producing, far more hours accepting. The organisations in NANDA's 5% are the ones that funded the downstream.
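One way to turn cost-to-acceptance per change into a measurable line item is sketched below in Python. It is a minimal illustration under stated assumptions, not a prescribed model: the stage breakdown mirrors the stages named above, while the class name, the function names, the hourly rate and every hour figure are hypothetical placeholders to be replaced with your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEffort:
    """Hours spent on one change, by stage. All example values are hypothetical."""
    generation: float
    verification: float
    integration: float
    review: float
    ownership: float  # e.g. amortised maintenance and on-call hours

    def downstream_hours(self) -> float:
        """Everything after the code has been produced."""
        return self.verification + self.integration + self.review + self.ownership

    def total_hours(self) -> float:
        return self.generation + self.downstream_hours()


def cost_to_acceptance(effort: ChangeEffort, loaded_hourly_rate: float) -> float:
    """Fully-loaded cost of taking one change through to ownership."""
    return effort.total_hours() * loaded_hourly_rate


def downstream_share(effort: ChangeEffort) -> float:
    """Fraction of total effort spent after generation."""
    total = effort.total_hours()
    return effort.downstream_hours() / total if total else 0.0


if __name__ == "__main__":
    rate = 100.0  # hypothetical loaded cost per hour
    # Placeholder figures chosen only to show the shape of the comparison.
    before_ai = ChangeEffort(generation=8, verification=2, integration=2, review=2, ownership=2)
    with_ai = ChangeEffort(generation=1, verification=5, integration=4, review=4, ownership=4)
    for label, effort in (("before AI", before_ai), ("with AI", with_ai)):
        print(label, cost_to_acceptance(effort, rate), round(downstream_share(effort), 2))
```

The point of the sketch is the shape of the ledger rather than the numbers: generation is one field among five, so a saving booked against it says nothing about the total until the other four are measured.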

If your AI savings case is anchored to developer headcount, you are budgeting the cheapest part of the problem and ignoring where cost actually lands.
Treat reported velocity gains with suspicion: in METR's trial, developers believed they were 20% faster while measuring 19% slower, a gap of roughly 39 points, so self-reported speedups are not reliable inputs to a P&L.
Make cost-to-acceptance a measured line item, not an assumption. If you cannot measure it, you cannot prove the saving.

If you are pressure-testing an AI savings case before it reaches the board, start with The Acceptance Gap to see where the cost reaccumulates, then read Measuring AI Engineering to instrument cost-to-acceptance. If the question is whether to build, buy or generate the capability at all, Build, Buy or Generate frames the trade — and Why Product Transformations Fail explains what happens when the downstream work is left unfunded.

Our perspective

The common view

AI saves money by cutting developer headcount.

The Ivaaya view

Headcount is the cheapest input; cheap generation rebills cost downstream to acceptance, integration and ownership — budget the gap, not the typing.

“We’ll save on engineers.” That is the cheapest line; the new cost lands downstream in review, integration and operations, and it is usually bigger.

If you’re doing this tomorrow

Budget acceptance, integration, review and operational ownership — not headcount savings.
Use governance to convert variable acceptance cost into a fixed, predictable one.

Where teams go wrong

Modelling savings on headcount alone
Celebrating local savings before the downstream bill arrives
No P&L line for acceptance and integration

At a glance

What: The real cost model of AI delivery.
Why: Generation savings are local; the bill moves downstream.
When: Budgeting or board-level AI delivery cases.
When not: A genuinely isolated, fully-verified automation.

What we’ve observed

METR’s RCT shows experienced developers ~19% slower with AI — the downstream slowdown boards do not budget.
Gartner projects ~30% of GenAI projects abandoned after proof of concept; MIT NANDA’s 2025 report documents the “GenAI divide” between pilots and value.

How certain are we?

Cheap generation rebills cost downstream — established: Observed repeatedly across delivery programmes.
Most AI delivery value is lost in the acceptance/integration gap — observed: Seen consistently in our own work.

About the author

Priyanka Pandey

Founder & Editorial Lead

Priyanka Pandey founded Ivaaya and leads its editorial voice, translating real delivery experience into practical thinking on AI-native engineering, decision-making and technology leadership. Her work focuses on helping senior leaders make sense of the changes reshaping software delivery without adding to the noise.

Reviewed and challenged by

Sanjeev Purohit

Principal, Decision Architecture

Sanjeev works across enterprise architecture, product strategy and AI-native delivery. The ideas in this article have been challenged against real programmes, production systems and organisational decision-making before publication.
