AI Engineering · 6 min read · Updated 2026-06-18

The Agentic SDLC Is an Acceptance-Gate Problem

AI now reaches across the whole lifecycle, not just the IDE. The design question is no longer where agents act, but where a human must accept, and what evidence the agent owes them at each gate.

By Priyanka Pandey · Founder & Editorial Lead

Reviewed and challenged by Sanjeev Purohit · Principal, Decision Architecture

Built from

Independent research
Original framework
Reviewed with field experience

Last substantively reviewed · 2026-06-18

Part of Agentic Engineering · The AI Engineering Maturity Model

In brief

The agentic SDLC reframes the lifecycle around acceptance: the design question is no longer “where do agents act?” but “where must a human accept, and what evidence does the agent owe them at each gate?”

AI now reaches across the whole lifecycle, not just the IDE.
Bolting agents into the IDE buys local speed-ups and system-level slowdown.
The spec becomes the primary artefact; acceptance gates are the architecture.

Best for

Designing how agents move work across discovery → delivery

Not for

Single-step IDE autocomplete with no downstream gates

The phrase 'agentic SDLC' is usually sold as either an automation story (how much of the lifecycle can agents own?) or a tooling story (which IDE or agent should we buy?). Both questions are pitched at the wrong altitude. If you accept the Acceptance Gap thesis, the interesting question is not where agents act. It is where a human must accept, and what evidence the agent is obliged to put in front of that human at each point.

We have argued before that once generation becomes abundant, the binding constraint moves to acceptance: the judgement that a change is correct, safe and worth shipping. The agentic SDLC is simply that argument applied across the full lifecycle rather than at the moment of writing a function. Discovery, specification, planning, implementation, review, deployment and operation are not stages where AI either does or does not help. They are a chain of acceptance decisions. Is this spec correct? Is this plan safe? Is this change worth shipping? Is this behaviour healthy in production? Design the lifecycle so that every agent action terminates in an evidence-backed gate owned by a named human role, and most of the failure modes that currently dominate the discourse become predictable rather than mysterious.

The trajectory everyone agrees on

There is unusual consensus on the direction of travel. Thoughtworks' Technology Radar volume 33 describes AI embedding across the entire software development value chain, having moved within a year from retrieval and prompt engineering to context engineering, MCP and agentic systems. Independent surveys, vendor framings from EPAM and GitHub, and the academic literature on agentic SDLCs all tell the same three-act story: 2023 to 2024 was coding and unit-test assistance; 2025 widened to design, test and documentation; 2026 is orchestrated, end-to-end automation. The expansion is real and it is fast. SWE-bench Verified, a reasonable proxy for autonomous task capability, rose from roughly two per cent to seventy-eight per cent between late 2023 and early 2026.

Capability is not the constraint. Absorption is. Gartner expects more than forty per cent of agentic AI projects to be cancelled by the end of 2027 over escalating cost, unclear value and inadequate risk controls, and warns that of thousands of self-described agent vendors only around a hundred and thirty are doing anything that merits the label. Capability is improving faster than organisations can govern it, which is exactly the gap we are interested in.

Why 'bolt it into the IDE' produces slowdown

The most instructive result of the past year is uncomfortable for the tooling story. A randomised controlled trial by METR of sixteen experienced open-source developers across two hundred and forty-six tasks found that allowing early-2025 AI tools increased completion time by nineteen per cent. The developers had forecast a twenty-four per cent speed-up, and afterwards still believed they had been sped up by around twenty per cent. Less than forty-four per cent of the AI-generated code was accepted without modification.

Read through the acceptance lens, this is not evidence that AI is bad. It is evidence of what happens when you inject abundant generation into a workflow whose acceptance step is unstructured. The generator produces plausible code cheaply; the human is left to verify it with nothing but their own reading, and verification is the expensive part. DORA's 2025 findings corroborate this directly: AI accelerates creation, but the time saved is re-allocated to auditing and verification, and the acceleration exposes downstream weaknesses. Around ninety per cent of technologists now use AI at work and over eighty per cent believe it raised their productivity, yet thirty per cent report little or no trust in AI-generated code. The gap between felt speed and measured speed is itself the finding, and it is the strongest possible argument for embedding objective acceptance evidence rather than trusting how fast the work feels.
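The size of the gap between forecast, felt and measured speed is easier to see with a little arithmetic. The sketch below assumes all three percentages are changes in task completion time relative to a no-AI baseline, and uses a hypothetical 60-minute task purely for illustration; the baseline and the helper name are not from the study.

```python
# Hypothetical baseline: a task that takes 60 minutes without AI tools.
BASELINE_MINUTES = 60.0


def with_change(baseline: float, pct_change: float) -> float:
    """Return the completion time after a percentage change (+ slower, - faster)."""
    return baseline * (1 + pct_change / 100)


measured = with_change(BASELINE_MINUTES, +19)  # measured: 19% longer
forecast = with_change(BASELINE_MINUTES, -24)  # forecast beforehand: 24% faster
believed = with_change(BASELINE_MINUTES, -20)  # believed afterwards: 20% faster

# Minutes separating how fast the work felt from how fast it was.
perception_gap = measured - believed

print(f"forecast {forecast:.1f} min, believed {believed:.1f} min, measured {measured:.1f} min")
print(f"perception gap: {perception_gap:.1f} min on a {BASELINE_MINUTES:.0f}-minute task")
```

Under these assumptions, the developer believes the hour-long task took about 48 minutes when it took a little over 71, which is why felt speed is a poor substitute for acceptance evidence.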

The tension the lifecycle has to resolve

DORA's headline tension is that AI adoption now correlates positively with delivery throughput and with product performance, and negatively with delivery stability. More speed and more instability, at the same time. Their explanation is the one to internalise: AI is a mirror and a multiplier. It does not fix a team; it amplifies what is already there. A lifecycle with strong testing, version control, fast feedback and a healthy platform converts AI into throughput. A lifecycle without those controls converts the same AI into instability at higher volume. The agentic SDLC does not get to choose whether AI amplifies. It only gets to choose what is being amplified.

The design question is not 'where do agents act?' but 'where must a human accept, and what evidence does the agent owe that human at each gate?'

Spec as the primary artefact, gates as the architecture

Two design moves follow. First, the specification becomes the source of truth and code becomes its expression. GitHub's Spec Kit (Specify, Plan, Tasks, Implement) and EPAM's formulation that humans express intent while agents execute converge on the same intent-first shape, in deliberate contrast to prompt-and-pray IDE use. When intent is the durable artefact, an agent's output can be checked against something other than a reviewer's intuition.

Second, judgement gates must be designed in, not assumed. The convergence here is striking across very different sources. Thoughtworks warns explicitly against complacency with AI-generated code and calls for sustained human judgement, oversight and healthy scepticism. GitHub's Spec Kit bakes in explicit checkpoints to critique, spot gaps and course-correct. EPAM's AI-driven lifecycle defines human-in-the-loop escalation and autonomy boundaries. NIST's agentic profile for its risk-management framework adds delegation-chain accountability and runtime governance. Strip the branding and they are all describing the same primitive: an acceptance gate, owned by a named human, fed by evidence the agent is obliged to produce.

Made concrete, the lifecycle becomes a chain of four such gates. Is this spec correct? The agent owes traceability back to the discovery intent. Is this plan safe? The agent owes a blast-radius and dependency analysis. Is this change worth shipping? The agent owes tests, diffs and a rationale rather than a green tick. Is this behaviour healthy in production? The agent owes telemetry against the acceptance criteria it was given. The autonomy boundary can then move outward gate by gate as accumulated evidence earns trust. That is the difference between an engineered lifecycle and a copilot purchase.
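One way to picture the four gates is as data: each gate carries its question, a named owner and the evidence the agent owes. The following is a minimal sketch under stated assumptions, not a reference implementation. The owner role names and the evidence identifiers are hypothetical labels chosen for illustration; only the four questions and the kinds of evidence come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    question: str
    owner_role: str                    # hypothetical role name, not from the article
    required_evidence: frozenset[str]  # hypothetical identifiers for the evidence owed


GATES = (
    Gate("Is this spec correct?", "product owner",
         frozenset({"traceability_to_discovery_intent"})),
    Gate("Is this plan safe?", "tech lead",
         frozenset({"blast_radius_analysis", "dependency_analysis"})),
    Gate("Is this change worth shipping?", "reviewer",
         frozenset({"tests", "diff", "rationale"})),
    Gate("Is this behaviour healthy in production?", "service owner",
         frozenset({"telemetry_vs_acceptance_criteria"})),
)


def missing_evidence(gate: Gate, submitted: set[str]) -> list[str]:
    """List the evidence the agent still owes at this gate."""
    return sorted(gate.required_evidence - submitted)


def decide(gate: Gate, submitted: set[str], human_accepts: bool) -> str:
    """A gate passes only when the evidence is complete AND its owner accepts."""
    missing = missing_evidence(gate, submitted)
    if missing:
        # The human is never asked to accept on incomplete evidence.
        return f"blocked: missing {', '.join(missing)}"
    return "accepted" if human_accepts else "rejected"


# A change arriving with tests but no diff or rationale is a green tick, not evidence.
print(decide(GATES[2], {"tests"}, human_accepts=True))
print(decide(GATES[2], {"tests", "diff", "rationale"}, human_accepts=True))
```

The design choice worth noticing is that complete evidence is necessary but not sufficient: the human decision remains a separate input, so the gate cannot be satisfied by the agent alone.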

Why this reframing matters

This is the part the market mostly misses. The METR slowdown and DORA's instability are not arguments against AI. They are the predictable result of bolting generation into the IDE while leaving the acceptance steps unstructured. Nor is Gartner's projection that more than forty per cent of agentic AI projects will be cancelled a mystery. Projects fail when they automate generation without engineering the gates: they generate faster and then discover that acceptance, the scarce, expensive, un-automated step, has become the bottleneck for the whole organisation rather than for one developer. Closing the Acceptance Gap is a lifecycle-design problem, not a tool-selection one.

This maps cleanly onto our five-stage maturity model. Most organisations are somewhere between Assistance and Automation: agents act, but the gates are still informal and human-carried. Agentic Engineering begins precisely when the gates become first-class, evidence-backed and owned, so that autonomy can be widened deliberately rather than hoped into existence. If you want to know which gates your lifecycle is missing, that is the place to start. Map your current practice against the AI-engineering maturity model, then read 'what is agentic engineering' for the operating model that sits underneath it.

Our perspective

The common view

Speed up the SDLC by adding AI into the IDE.

The Ivaaya view

Redesign the lifecycle around human acceptance gates and the evidence agents owe at each — that, not IDE autocomplete, is where speed and safety come from.

“Just add Copilot everywhere”: IDE-local speed-ups push unverified work downstream; without acceptance gates the system slows and risk rises.

If you’re doing this tomorrow

Make the spec the primary artefact agents work from.
Place explicit acceptance gates where a human must decide, and define the evidence required at each.

Where teams go wrong

Bolting agents into the IDE and ignoring the gates
Pushing unverified work downstream
No defined evidence at acceptance points

At a glance

What: A lifecycle organised around acceptance gates and the evidence agents owe.
Why: AI now spans the lifecycle; the constraint is acceptance, not action.
When: Any team running agents across discovery → delivery.
When not: Pure IDE autocomplete with no downstream consequence.

The evidence & related ideas →

What we’ve observed

METR’s RCT and DORA 2025 show local AI speed-ups can coincide with system-level slowdown when gates are missing.
Gartner projects over 40% of agentic-AI projects will be cancelled by end of 2027 — design for acceptance, not just action.

How certain are we?

Acceptance gates, not IDE placement, govern agentic delivery — observed: Seen consistently in our own work.
Local AI speed-ups can mask system-level slowdown — established: Observed repeatedly across delivery programmes.

About the author

Priyanka Pandey

Founder & Editorial Lead

Priyanka Pandey founded Ivaaya and leads its editorial voice, translating real delivery experience into practical thinking on AI-native engineering, decision-making and technology leadership. Her work focuses on helping senior leaders make sense of the changes reshaping software delivery without adding to the noise.

Reviewed and challenged by

Sanjeev Purohit

Principal, Decision Architecture

Sanjeev works across enterprise architecture, product strategy and AI-native delivery. The ideas in this article have been challenged against real programmes, production systems and organisational decision-making before publication.

Part of a perspective

For the Delivery Leader: Running delivery when agents do the building · Step 3 of 6

Compare notes

If the live question for you is no longer where agents act but where a human still has to accept — and what evidence they’re owed at each gate — tell us where your acceptance points sit today. We’re comparing notes with teams redesigning their lifecycle around those gates.
