AI Engineering · 6 min read

Beyond Vibe Coding

Vibe coding optimises the step that just became abundant and starves the one that just became scarce. Fine for exploration. Dangerous as an operating model.

Part of Agentic Engineering · The AI Engineering Maturity Model

In February 2025 Andrej Karpathy described a way of working he called vibe coding: you 'fully give in to the vibes, embrace exponentials, and forget that the code even exists'. It was possible, he noted, only because the models were getting too good. The phrase struck a nerve. By the end of the year Collins had named it its Word of the Year. So this is not a strawman we are about to knock down. It is a defined, mainstream practice, real enough to argue against on the merits.

Here is the argument. Vibe coding is a rational response to a genuine shift, pointed at the wrong half of it. It optimises generation, the step that has just become abundant, and quietly skips acceptance, the step that has just become scarce. That is fine when there is nothing to accept. It is dangerous the moment you make it the way an organisation works.

What the vibe actually skips

The pillar that runs through everything we write here is the Acceptance Gap: once a model can produce a plausible change in seconds, the binding constraint is no longer writing the code, it is the judgement that the change is correct, safe and worth shipping. Value migrates to that scarce step. Vibe coding is interesting precisely because it is the purest possible expression of ignoring it.

The academic record is unusually clear on the mechanism. A 2025 study of vibe-coding practice by Sarkar and Drosos found developers reviewing generated output through 'impressionistic scanning rather than a linear read', accepting large diffs within seconds. They name the danger directly: 'risks of overtrust emerge when users fail to critically evaluate AI outputs'. Read that again. The thing being scanned past in seconds is the acceptance step. The vibe is the sound of the gap going unaddressed.

The same paper makes a more useful observation almost in passing. Vibe coding does not remove expertise; it redistributes it 'toward context management, rapid code evaluation, and decisions about when to transition between AI-driven and manual manipulation of code'. That is a precise description of acceptance work. The skill did not disappear. It moved downstream and got harder to see.

The constraint moved; the cost followed it

If acceptance is now where the difficulty lives, that is where the cost should show up. It does. Google's DORA 2025 research found that, with around 90% of technology professionals now using AI at work, higher adoption is associated with increases in both delivery throughput and delivery instability. Speed and fragility rise together. Their phrasing is the whole thesis in miniature: time saved in creation is frequently re-allocated to auditing and verification. The work was not eliminated. It was relocated. And tellingly, 30% of developers report little to no trust in the code the tools produce.

The instability is measurable, not vibes. Veracode tested more than a hundred models across eighty tasks and found that 45% of AI-generated code introduced an OWASP Top 10 vulnerability; defences against cross-site scripting passed only 14% of the time, log injection just 12%. Separately, analysis from Apiiro reported a roughly tenfold rise in security findings across six months of 2025 as generation volume climbed. None of this is an indictment of the models. It is an indictment of accepting their output unread.

Vibe coding is not bad engineering. It is good engineering aimed at the wrong constraint: it perfects the step that became abundant and starves the step that became scarce.

Why faster can feel true and be false

The most uncomfortable finding is about perception. METR ran a randomised controlled trial in July 2025 with sixteen experienced open-source developers across 246 real issues on codebases they knew well. With AI tools allowed, they took 19% longer. Afterwards they estimated the tools had sped them up by 20%. That gap of roughly forty points between felt and measured productivity is itself a risk, because an operating model built on how fast the work feels will systematically under-invest in the part that is actually slowing it down.
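The arithmetic behind that gap can be made explicit. The sketch below is illustrative only: it reads 'sped up by 20%' as 'took 20% less time', and the function names and the 600-minute baseline are hypothetical, not taken from the METR study.

```python
def perception_gap_points(measured_pct_change: int, perceived_pct_change: int) -> int:
    """Gap, in percentage points, between measured and perceived change in task time."""
    return measured_pct_change - perceived_pct_change


def minutes_with_change(baseline_minutes: int, pct_change: int) -> int:
    """Task time after applying a percentage change (integer maths, rounds down)."""
    return baseline_minutes * (100 + pct_change) // 100


MEASURED = 19    # tasks took 19% longer with AI tools allowed
PERCEIVED = -20  # assumption: 'sped up by 20%' read as 20% less time

gap = perception_gap_points(MEASURED, PERCEIVED)

# Hypothetical plan: a block of work that would take 600 minutes unaided.
felt = minutes_with_change(600, PERCEIVED)
actual = minutes_with_change(600, MEASURED)

print(f"gap: {gap} points; plan assumes {felt} min, work takes {actual} min")
```

On these assumptions, a plan built on the felt figure budgets far less time than the measured figure implies, and that shortfall is exactly the time that would otherwise go to acceptance.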

And the quality drift is visible in the codebase over time. GitClear's analysis of 211 million changed lines found copied-and-pasted code rising from 8.3% to 12.3% of changed lines between 2021 and 2024, refactoring falling below 10%, and 2024 becoming the first year in which the introduction of duplicated code exceeded refactoring. That is the signature of generation outrunning acceptance, written into the repository one accepted diff at a time.

Exploration versus operating model

None of this means vibe coding is worthless. For a prototype, a throwaway internal tool, a spike to learn an unfamiliar API, it is excellent, and the same sources that warn against it in production endorse it for exploration. The reason is structural rather than moral. In exploration there is nothing to accept. The artefact is disposable, the blast radius is your own afternoon, and the only judgement that matters is whether you learned something. Skipping acceptance there costs nothing because acceptance there has no value.

An operating model is the opposite case. It exists precisely to govern acceptance at scale, across many people, many changes and a long-lived system that other people depend on. To run an organisation on vibe coding is to adopt, as policy, the deliberate skipping of the one step that has become the constraint. That is why governance bodies have started codifying the point: NIST's 2025 draft secure-development guidance and the NCSC's guidelines for secure AI system development both push 'secure by default' and embedded verification, treating human and automated acceptance as mandatory rather than optional.
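To make 'mandatory rather than optional' concrete, here is a minimal sketch of an acceptance gate. It is an illustration under stated assumptions, not a description of any cited guidance or tool: the Change record, its fields and the acceptance_failures function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical record of a proposed change (all fields are assumptions)."""
    change_id: str
    disposable: bool                # throwaway spike or prototype
    automated_checks_passed: bool   # e.g. tests and security scans
    reviewed_by: Optional[str]      # person who read the diff, if anyone


def acceptance_failures(change: Change) -> List[str]:
    """Return the reasons a change is blocked; an empty list means no blockers."""
    if change.disposable:
        # Exploration: the artefact is thrown away, so there is nothing to accept.
        return []
    failures: List[str] = []
    if not change.automated_checks_passed:
        failures.append("automated checks have not passed")
    if not change.reviewed_by:
        failures.append("no human reviewer recorded")
    return failures


spike = Change("spike-1", disposable=True, automated_checks_passed=False, reviewed_by=None)
unread = Change("feat-2", disposable=False, automated_checks_passed=True, reviewed_by=None)
accepted = Change("feat-3", disposable=False, automated_checks_passed=True, reviewed_by="reviewer-a")

for change in (spike, unread, accepted):
    print(change.change_id, acceptance_failures(change) or "no blockers")
```

The design choice worth noticing is that the gate distinguishes the two cases from the previous section: a disposable spike passes untouched, while anything long-lived is blocked until both the automated and the human half of acceptance are on record.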

The industry has already turned. Thoughtworks' Technology Radar in November 2025 observed that vibe coding 'has practically disappeared', replaced by 'context engineering', and listed 'complacency with AI generated code' as an explicit antipattern, noting that the early enthusiasm 'exposed a degree of complacency about what AI models can actually handle'. Every one of these sources is circling the same thing from a different side: DORA calls it verification overhead, Thoughtworks calls the fix context engineering, the academics call it redistributed expertise, Veracode and GitClear count the downstream damage. They are all measuring the Acceptance Gap. None of them names it as a single economic shift.

That naming is the work. Vibe coding sits comfortably at the Experimentation and Assistance rungs of the maturity model, and it should. What carries you past them is not better prompting but an AI-native operating model in which roles move from doing to directing, and acceptance becomes a designed, governed capability rather than a few seconds of impressionistic scanning. The binding constraint has moved. The work moves with it. If you want to know where your organisation actually sits on that path, start with our five-stage maturity model and read what agentic engineering looks like once acceptance is something you run on purpose.