AI · SYSTEMS · 2026-06-29 · 8 min read

You can't vibe-code a business system — what AI-built software costs you after the demo

Describe an app in plain English and an AI writes it; the working demo is real and it lands in an afternoon. The trap is mistaking the demo for the system. The demo is the easy 70%. The 30% AI cannot finish — security, edge cases, integration, the upkeep — is the whole job of running a business on it.

By Felukaa

[ THE SHORT VERSION ]

There is a new pitch, and the remarkable thing about it is that it is mostly true. You describe what you want in plain language, an AI coding tool writes it, and minutes later something runs. A form, a dashboard, a little app — working, on screen, in front of you. For a generation of owners who were quoted months and five figures for exactly that, it feels like a wall just fell. And in a real sense it did: producing a first working version of software has genuinely become fast and cheap. That part is not hype.

This way of building got a name in early 2025: "vibe coding". Describe the goal, accept whatever the model writes without reading it closely, and fix problems by asking again rather than understanding the code. It was useful enough as a description that a dictionary made it a word of the year. For a prototype, a throwaway script, or a weekend experiment, it is a gift: the blast radius is zero and the speed is intoxicating. The danger begins at one specific moment: when the thing you vibe-coded quietly graduates into something your business actually depends on, and the 30% nobody looked at becomes the part that holds your customers, your money, and your operations.

This piece is about that gap — between a demo that runs and a system you can run a business on. Why AI gets you most of the way fast and then stalls on the part that is the actual work; what the data already shows about the code it leaves behind, on security and on the slow rot of maintainability; and the one decision — who owns the code — that separates a tool that pays from a liability nobody in the building understands.

[ FIGURES ]

Figure 1 · The demo is the easy 70%

An AI coding tool reliably produces the scaffolding and the happy path — roughly 70% of the code — fast. The last 30% is the edge cases, the error handling, the security, the integration with the systems you already run, and the maintenance afterward. That 30% is most of what separates a demo that runs once from a system a business runs on every day — and it is exactly the part the model does not finish.

Figure 2 · What the demo hides

Three independent studies converge on the same uncomfortable picture. In the largest controlled test to date, about 45% of AI-generated code arrived with a security vulnerability. In a randomised trial, experienced developers were ~19% slower with AI tools while believing they were faster. And across a 200-million-line audit, copy-pasted code climbed as AI scaled while the cleanup that keeps code maintainable collapsed. More code, faster-feeling, less safe, harder to keep.

[ EXPLANATION ]

Start with the honest part, because the case here is not anti-AI. Building software with AI assistance is real leverage and we use it every day. The thing to be precise about is the method. "Vibe coding" was named in early 2025 to describe a specific way of working: state what you want in natural language, accept the model's output without reading it line by line, and resolve problems by re-prompting rather than reasoning through the code ^[1]. For a disposable prototype that is a superpower. The failure mode is not the tool — it is letting code built that way carry weight it was never inspected to carry.

The clearest description of why comes from inside the field: the 70% problem. An AI tool will get you about 70% of the way to a working feature astonishingly fast — the scaffolding, the obvious patterns, the path where everything goes right. The remaining 30% — the edge cases, the error handling, the security, the integration with the messy systems you already run, the informal business rules nobody wrote down — stays exactly as hard as it always was ^[2]. And that 30% is not a rounding error; it is most of what makes a system a system. The demo where the form submits is the easy part. The system that submits, validates, reconciles the duplicate, handles the refund, survives the outage, and stands up to an audit is the part the model leaves for you.
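
To make that gap concrete, here is a minimal Python sketch. The order form, the field names, and the in-memory store are all hypothetical, invented for illustration; the point is only how little of the code is the part a demo exercises.

```python
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation


@dataclass
class OrderStore:
    """Hypothetical in-memory store standing in for a real database."""
    orders: dict = field(default_factory=dict)


def submit_demo(store: OrderStore, request_id: str, email: str, amount: str) -> str:
    # The demo version: works when every input is well-formed and sent once.
    store.orders[request_id] = {"email": email, "amount": float(amount)}
    return "ok"


def submit_hardened(store: OrderStore, request_id: str, email: str, amount: str) -> str:
    # Duplicate submission (double click, client retry): never record it twice.
    if request_id in store.orders:
        return "duplicate"
    # Validation: reject input the happy path silently assumed away.
    if "@" not in email or email.startswith("@") or email.endswith("@"):
        return "invalid email"
    try:
        value = Decimal(amount)  # Decimal avoids float rounding on money
    except InvalidOperation:
        return "invalid amount"
    # is_finite() is checked first so NaN never reaches the comparison.
    if not value.is_finite() or value <= 0:
        return "invalid amount"
    store.orders[request_id] = {"email": email.strip().lower(), "amount": value}
    return "ok"
```

Even the second function is still a sketch: refunds, outages, and an audit trail depend on the systems a business already runs, which is exactly why they do not fall out of a prompt.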

The most measurable piece of that 30% is security, and the numbers are sobering. The largest controlled study to date ran more than a hundred models across 80 coding tasks and found AI-generated code introduced a security vulnerability in about 45% of cases — and, tellingly, the rate did not improve as the models got more capable at writing code that works ^[3]. The pressure that makes a model better at producing functioning code does not make it better at producing safe code; those are different skills, and only one of them is being optimised. Roughly one in two AI-written solutions ships with a hole. On a marketing page nobody cares. In a system that holds customer data or moves money, that is the breach — built in from the first line, by no one who can be asked why.
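
As one illustration of code that works and is still unsafe, consider a classic injection flaw. This sketch is not taken from the study; the table, the function names, and the data are hypothetical, and it uses only Python's standard sqlite3 module.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical customer table, held in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("Ada", "ada@example.com"), ("Bob", "bob@example.com")],
    )
    return conn


def find_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Passes the demo: works for every ordinary name.
    # Unsafe: user input is spliced into the SQL text itself.
    query = f"SELECT email FROM customers WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterised query: the driver treats the input as data, never as SQL.
    return conn.execute(
        "SELECT email FROM customers WHERE name = ?", (name,)
    ).fetchall()
```

Both functions return the same row for "Ada". They only diverge on an input such as x' OR '1'='1, which makes the unsafe version return every customer, and that is an input no demo ever tries.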

Then there is the quieter cost: maintainability, which is where most of a system's lifetime money actually goes. An audit of more than 200 million changed lines of code found that as AI assistance scaled, copy-pasted and cloned code climbed sharply — duplicated blocks multiplied and the share of "cloned" lines rose from 8.3% to 12.3% between 2021 and 2024 — while the proportion of work that was refactoring, the cleanup that keeps a codebase coherent, collapsed from a quarter of changes to under a tenth ^[4]. In plain terms: the machine produces more code, more of it duplicated, and almost none of the structural care that keeps a system changeable. It accretes instead of being built. That is the opposite of why anyone commissions a custom system — you wanted something you could keep adapting, and you got something that gets harder to touch every month.
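
For a rough sense of what a "cloned lines" share measures, the sketch below counts lines that sit inside a repeated block. It is a toy under stated assumptions, not the audit's methodology: the function name, the three-line window, and the whitespace-only normalisation are all choices made for illustration.

```python
from collections import defaultdict


def cloned_line_share(lines: list[str], window: int = 3) -> float:
    """Share of non-blank lines that fall inside a block of `window`
    consecutive lines which appears more than once."""
    code = [ln.strip() for ln in lines if ln.strip()]
    if len(code) < window:
        return 0.0
    seen = defaultdict(list)  # block contents -> list of start indices
    for i in range(len(code) - window + 1):
        seen[tuple(code[i:i + window])].append(i)
    cloned = set()
    for starts in seen.values():
        if len(starts) > 1:  # the same block occurs at least twice
            for s in starts:
                cloned.update(range(s, s + window))
    return len(cloned) / len(code)
```

Tracked over time, a number like this is what moves when code is pasted rather than refactored. For scale, the rise from 8.3% to 12.3% quoted above is a relative increase of roughly 48%.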

There is also a trap in the feeling of speed itself. A randomised controlled trial — the gold-standard design — gave experienced developers AI tools on large, real codebases and measured what happened. They were about 19% slower with the AI than without it, while estimating they had been 20% faster ^[5]. The gap was the validation tax: time spent reading, double-checking, and repairing output that looked right and was not. The lesson for an owner is not "AI is useless" — it is that "faster" is often a perception, and the perception is exactly what makes vibe-coding feel safe when it is not. Speed you cannot verify is not speed; it is debt that has not shown up yet.
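
The size of that gap is easier to see as arithmetic. The sketch below is a worked example, not part of the trial: the function name and the ten-hour baseline are hypothetical, and it reads "20% faster" as "20% less time", which is an assumption about how the estimate was expressed.

```python
def perception_gap(baseline_hours: float,
                   measured_change: float = 0.19,
                   believed_change: float = -0.20) -> dict:
    """Compare measured and believed task time against a no-AI baseline.

    measured_change: +0.19 means the work took 19% longer (the trial's figure).
    believed_change: -0.20 reads "20% faster" as 20% less time (an assumption).
    """
    measured = baseline_hours * (1 + measured_change)
    believed = baseline_hours * (1 + believed_change)
    return {
        "measured_hours": round(measured, 2),
        "believed_hours": round(believed, 2),
        "unseen_hours": round(measured - believed, 2),
    }
```

On a task that would take 10 hours unaided, this gives about 11.9 hours measured against 8 hours believed: nearly four hours that were spent but never noticed.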

So what is the actual decision? It is not whether to use AI in the build — that question is settled, and the answer is yes. It is whether to ship code no human owns. The real failure mode is not "a model wrote it"; it is "a model wrote it, nobody read it, and now nobody can change it or say whether it is safe" — secrets pasted into the open, credentials scattered through the codebase, a known vulnerability sitting and waiting, because the security review that catches all of that is precisely the step vibe-coding skips ^[6]. The fix is unglamorous and it is the whole game: AI as an accelerator for engineers who read, own, and stand behind every line that runs your business — not as a substitute for the person who does. The 30% is where the business lives, and the 30% is human. Ask who owns the code before you fall in love with the demo.
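
Part of that review can be automated, and the simplest version is a scan for credentials committed in plain sight. The sketch below is illustrative only: the two patterns and the function name are assumptions chosen for the example, real scanners ship far larger rule sets, and a clean result here proves very little.

```python
import re

# Illustrative patterns only (assumptions for this example).
PATTERNS = {
    # Access key IDs in the widely documented AKIA + 16 characters format.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A secret-looking name assigned a quoted literal of 4+ characters.
    "hardcoded_secret": re.compile(
        r"""(?i)(password|passwd|secret|api_key|token)\w*\s*[:=]\s*["'][^"']{4,}["']"""
    ),
}


def scan_for_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each line that matches a pattern."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule))
    return findings
```

A check like this catches the pasted key. It does not replace the person who reads the code, knows why each secret exists, and can say where it should live instead.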

[ PERSPECTIVES ]

Camp A — Vibe coding is the future; the hand-wringing is the buggy-whip lobby

The models get better every month. Today's shaky 70% is next year's solid 95%, and the people fretting about clone rates and review discipline are defending a craft the way scribes defended handwriting. Build fast, ship, iterate, and let the tools close their own gaps as they improve. A working thing in your customers' hands today beats a beautifully engineered thing six months from now. Speed of iteration is the only moat that matters.

Camp B — Fine for the disposable, fatal for the load-bearing

The right move is to draw a line and defend it. Prototypes, internal tools, throwaway tests, a marketing microsite — vibe-code all of it, the blast radius is zero and the speed is free money. But the instant code touches customer data, money, or core operations, it crosses into territory where "45% ships with a flaw" is simply unacceptable. The actual mistake businesses make is letting a prototype quietly slide into production without anyone ever stopping to rebuild it properly.

Camp C — The bottleneck was never typing the code

Senior engineering was never about producing lines quickly; it was judgment — what to build and what not to, how the thing fails, who it locks out, how it is secured and kept alive for years. AI accelerates the part that was never the constraint and leaves the real constraint untouched. So the value of someone who owns the 30% goes up, not down. The more code the machine emits, the more you need a human who can tell the safe 70% from the dangerous 30% — and that person just got scarcer relative to the volume.

Where we land

All three are partly right and they resolve cleanly. Camp A is correct that AI is permanent leverage — we would be foolish not to use it, and we do. Camp B draws the only line that matters: vibe-code the disposable, never the load-bearing. Camp C names why the line is there — the bottleneck was always judgment, and judgment is exactly the 30% the model cannot finish. We build with AI and against vibe-coding: every line that runs your business is read, owned, and stood behind by a human who can be asked why. The demo is the easy 70%. We are hired for the 30% — which is precisely the part you can never afford to vibe.

[ OPEN QUESTIONS ]

01 Where exactly is the line between "safe to vibe-code" and "must be engineered": is the trigger customer data, money, headcount, regulatory exposure, or simply the cost of the thing failing at 2am?

02 As models improve, does the easy 70% creep toward 95%, or does the hard 30% (security, integration, judgment about failure) stay stubbornly human no matter how capable the model gets?

03 If a vibe-coded prototype is already running your business, what is the honest call: rebuild it properly now, or wait until it breaks, knowing the rebuild only gets more expensive the longer it carries weight?

04 How do you even audit AI-generated code you did not write and do not fully understand? What does "we reviewed it" mean when the volume is several times what a human would have typed by hand?

05 When AI-written code leaks customer data, who is actually liable: the business that shipped it, the developer who prompted it, or no one, because everyone assumed someone else had read it?

[ REFERENCES ]

[1] Wikipedia, Vibe coding: term coined by Andrej Karpathy in February 2025 for describing software in natural language and accepting AI output without close review; named a 2025 word of the year.
[2] Addy Osmani, "The 70% problem: hard truths about AI-assisted coding": AI produces ~70% of a feature fast; the final 30% (edge cases, security, integration) stays as hard as ever.
[3] Veracode, 2025 GenAI Code Security Report: across 100+ LLMs on 80 tasks, AI-generated code introduced a security vulnerability in ~45% of cases, with no improvement as models grew more capable.
[4] GitClear, AI Copilot Code Quality, 2025: across 200M+ changed lines, copy-pasted/cloned code rose from 8.3% to 12.3% (2021–2024) while refactoring fell from ~25% of changes to under 10%.
[5] METR, Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity: a randomised trial found developers ~19% slower with AI tools while believing they were ~20% faster.
[6] Cloud Security Alliance (research labs), Vibe Coding Security Crisis: AI-generated code is driving credential sprawl and security debt, with secrets and known-vulnerability exposure concentrated in unreviewed AI output.

[ Got a demo that works and a business that needs more? ]

We build with AI and against vibe-coding — every line that runs your business read, owned, and stood behind by a human.

The demo is the easy 70%. The system your business actually runs on — secure, integrated, maintainable, owned by you — is the 30% AI cannot finish, and it is the part we are hired for. Fifteen minutes to tell whether what you have is a prototype or a system.
