I built a working system in five days last week. The kind of thing that would have been a six-month project with a small team a decade ago. I am genuinely amazed at what AI tools now make possible.
And then I spent another five days making it actually safe to put in front of users.
That second five days is the part nobody is talking about. It does not show up in the demo. It does not appear in the LinkedIn post about how fast you can move now. But it is the difference between a prototype that impresses your colleagues and a system you can responsibly run a business on.
In July last year, an AI agent at Replit deleted a software company’s production database during a code freeze, and then misled the engineer about whether the data could be recovered. Same kind of build. Same kind of speed. The difference between that team’s experience and mine is not the tool. It is what happened — or did not happen — in the days after something seemed to work.
I am not arguing against vibe coding — quite the opposite. I have just done it, at speed, and I plan to do more of it. These tools have made the building far easier, and they have made much of the finishing easier too. They will happily write tests, configure backups, harden security, and set up monitoring — if you ask. The catch is that asking requires knowing what to ask for. That knowledge — the unglamorous understanding of what production-ready actually means — is the gap no tool fills. It is the hidden danger of citizen development: not that the work cannot be done, but that the people doing it do not yet know it needs to be done.
So here are the six things I did between the moment my system “worked” and the moment I was willing to let real users near it. They are also, deliberately, the six questions any CEO should be asking whoever in the business is excitedly waving an AI-built prototype around.
1. Automated tests covering more than 80% of the behaviour.
Not because I love writing tests. Because I had no other reliable way of knowing that the next change I made — or the next thing the AI suggested — had not silently broken something that had been working yesterday. AI tools generate code that looks right at almost every turn. Tests are the thing that tells you whether it is right.
The question to ask: “Show me the tests. What’s the coverage?” If the answer is a vague hand-wave, what you have in front of you is not yet a product.
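To make that concrete, here is a minimal sketch of what a behavioural test looks like. The function and its discount rule are invented for illustration; the point is that each test pins down one piece of behaviour, so a silent regression fails loudly:

```python
# Hypothetical business rule: the code "SAVE10" takes 10% off an order total.
def apply_discount(total: float, code: str) -> float:
    """Apply a discount code to an order total, rounded to pennies."""
    if code == "SAVE10":
        return round(total * 0.90, 2)
    return total

# Each test asserts one behaviour the business depends on. If tomorrow's
# change — human or AI-suggested — breaks any of these, the suite says so.
def test_known_code_reduces_total():
    assert apply_discount(100.0, "SAVE10") == 90.0

def test_unknown_code_changes_nothing():
    assert apply_discount(100.0, "BOGUS") == 100.0

def test_rounding_is_to_pennies():
    assert apply_discount(19.99, "SAVE10") == 17.99
```

Run under a tool such as pytest with a coverage plugin, this is also what turns “what’s the coverage?” into a number rather than a hand-wave.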
2. Load testing at well above expected volume.
Demos run for one person clicking gently. Production systems run for hundreds at once, occasionally thousands, often at the worst possible moment. Without testing what the system does under real load, you discover the answer in production, in front of customers.
The question to ask: “What happens at ten times this volume? Have you tested it?”
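Even a crude load test beats no load test. The sketch below, using only Python’s standard library, fires concurrent requests at an endpoint and reports latency percentiles — the URL and the request counts are placeholders, and a real exercise would use a dedicated tool, but the shape of the question is the same:

```python
# Crude load-test sketch: N requests at a given concurrency, report latencies.
# A placeholder for proper tooling (locust, k6), not a substitute for it.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def hit(url: str) -> float:
    """Time one request, including reading the full response body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start

def load_test(url: str, requests: int = 200, concurrency: int = 20) -> dict:
    """Run `requests` calls with `concurrency` workers; summarise latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(hit, [url] * requests))
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(len(latencies) * 0.95) - 1],
        "worst_s": latencies[-1],
    }
```

Run it at one times expected volume, then ten times, and watch what happens to the 95th percentile. That curve is the answer to the question above.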
3. Security testing, and encryption of anything sensitive.
This is where the unskilled builder has the largest blind spot. Recent research from Veracode found that nearly half of AI-generated code samples contain security flaws, with vulnerabilities appearing roughly 2.7 times more often than in human-written code. AI tools will happily wire up an application that works and is also wide open — secrets in the wrong place, no proper authentication, customer data sitting unprotected. None of that shows up in the demo. It shows up six months later, when something has gone wrong.
The question to ask: “Who has reviewed this for security, and what’s encrypted?” “The AI handled it” is not an answer.
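One of the commonest flaws in generated code is worth seeing in miniature: building a database query by gluing user input into a string. The table below is hypothetical, but the vulnerability and its fix are the classic SQL injection pair:

```python
# SQL injection in miniature. The users table is invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0), ('bob', 1)")

def find_user_unsafe(name: str):
    # Vulnerable: the input becomes part of the SQL itself.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name: str):
    # Parameterised: the driver treats the input as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
assert find_user_unsafe(payload) == [("alice",), ("bob",)]  # leaks every row
assert find_user_safe(payload) == []                        # leaks nothing
```

Both versions “work” in the demo. Only one of them is wide open, and nothing in the happy path tells you which.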
4. Deployment scripts, backups, and a tested recovery procedure.
Not just backups — a recovery procedure I had actually run, end to end, and confirmed worked. A backup you have never restored is not a backup. It is a hope.
The question to ask: “If this falls over at nine on a Friday night, walk me through how we get it back. When did you last test that?”
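What “a recovery procedure I had actually run” means in practice is small enough to sketch. Assuming a SQLite database purely for illustration: back it up, restore it to a fresh file, and prove the restored copy matches the original — the whole discipline in miniature:

```python
# Backup-and-verified-restore sketch, using SQLite's online backup API.
# The database engine and table name are illustrative assumptions.
import sqlite3

def backup(src_path: str, dest_path: str) -> None:
    """Copy a live database to dest_path using sqlite3's backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    with dest:
        src.backup(dest)
    src.close()
    dest.close()

def restore_matches(original: str, restored: str, table: str) -> bool:
    """The test most teams never run: does the restored copy actually match?"""
    query = f"SELECT * FROM {table} ORDER BY 1"
    a = sqlite3.connect(original).execute(query).fetchall()
    b = sqlite3.connect(restored).execute(query).fetchall()
    return len(a) > 0 and a == b
```

Until `restore_matches` has returned True against a real backup, on a schedule, you do not have a recovery procedure. You have a hope with a filename.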
5. Real user trials, and a willingness to act on what came back.
The system that lives only in the head of the person who built it is always more elegant than the one real users meet. Until other people have used it, you do not know what is broken.
The question to ask: “Who has used this besides the person who built it? What did they hit that you didn’t predict?”
6. Error logging — and the habit of actually reading it.
Logging is half the battle. The other half is the discipline of looking at the logs every day and acting on what you see. Most production failures announce themselves in the logs days or weeks before anyone notices. The systems that fail loudly are the ones nobody was listening to.
The question to ask: “When something goes wrong tonight, how do we know — and who looks?”
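The “habit of reading it” can be partly automated. A minimal sketch, with invented log messages, of a handler that collects errors so a daily triage can surface what is failing most often — a stand-in for whatever alerting a real system would use:

```python
# Collect ERROR-level records so a daily triage can review them.
# The log messages and the triage loop are illustrative.
import logging
from collections import Counter

class ErrorTally(logging.Handler):
    """Counts ERROR-and-above records by message for later review."""
    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.counts = Counter()

    def emit(self, record):
        self.counts[record.getMessage()] += 1

log = logging.getLogger("app")
log.setLevel(logging.INFO)
tally = ErrorTally()
log.addHandler(tally)

log.info("user signed in")            # routine traffic, not tallied
log.error("payment gateway timeout")  # the quiet early warning
log.error("payment gateway timeout")

# The daily habit: what is failing most, before a customer tells you?
for message, count in tally.counts.most_common():
    print(f"{count}x  {message}")
```

The code is trivial. The discipline — someone looking at that output every day and acting on it — is the part that keeps failures from announcing themselves to customers first.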
Not every AI build needs all six. An internal experiment used by three colleagues is a different beast from a customer-facing system handling payments and personal information. The mistake is treating the second like the first — pick your blast radius consciously, and the bigger the radius, the more of the six you cannot skip.
Gartner expects citizen developers to outnumber professional developers four to one by next year, and predicts a 2,500% rise in software defects from prompt-to-app development by 2028. Most of those builders will not have a second five days. That is not a rounding error — it is an industry quietly walking into a quality crisis.
None of these six are new. They have been the cost of shipping software for thirty years. AI builders have not reduced that cost — they have just made it possible to skip it without realising.
This is the thing the marketing leaves out: AI tools are amplifiers. Bring twenty-five years of experience to one and it will turn five days into a system that works. Bring none, and it will turn five days into a system you do not realise is broken until somebody else finds out. The tool does not bring the judgement. You do.
The tools are extraordinary. Use them. Use them more. But the day you got something working is the day the real work began — and you can usually tell, in any room, who has not yet realised that.
Colin Rees is the founder of Xpera, where we help business leaders make smarter technology decisions. If you are thinking about deploying AI in your business — or wondering whether what you have already built is genuinely production-ready — we would welcome a conversation.

