Founder Insight

Why AI Coding Tools Only Became Possible in 2024

Alexander Berger, COO at Bolt.new


For four years, Alexander Berger watched Bolt’s predecessor, StackBlitz, struggle. The team had built something technically impressive — a development environment in the browser — but the market wasn’t ready. They were trying to put AI to work on code long before the AI was actually good enough to matter.

Then in the summer of 2024, something shifted. Anthropic released Claude 3.5 Sonnet, and suddenly the category that couldn’t exist became inevitable.

“The AI models were not good enough to reliably write code that would not have bugs and actually execute,” Alexander explains. “Last summer, Anthropic came out with Sonnet 3.5, and that was really the go point for this type of tool. The models were good enough.”

The difference wasn’t marketing or timing or luck. It was a specific technical threshold: the model had to be reliable enough that AI-generated code could actually run in production without constant human fixes.

The Reliability Threshold

Before Sonnet 3.5, AI coding assistance existed, but it solved a different problem. Tools like GitHub Copilot helped engineers write faster — they were a productivity multiplier for people who already knew how to code. What they couldn’t bridge was the wider gap: a non-engineer describing “I want a website” and getting something runnable.

StackBlitz’s technical infrastructure was always there. The WebContainer technology — which runs a full development environment in your browser without remote servers — was built years earlier. But a powerful delivery mechanism isn’t useful if the content (AI-written code) isn’t reliable enough to deploy.

“We had pushed that product as far as we could,” Alexander recalls. The team was preparing to return capital to investors and shut down. It wasn’t that they lacked vision or engineering talent. They were waiting for a prerequisite: an AI model that could write production-grade code.

Sonnet 3.5 crossed that line. It wasn’t a 10% improvement. It was the difference between “helpful draft that needs heavy editing” and “code you can actually ship.”

From Niche Tool to Category

Once the AI reached threshold reliability, the entire dynamic inverted. For four years, StackBlitz had one viable market: open-source projects and design system teams at enterprises. Developers used it and loved it, but the addressable market was fundamentally capped. Most people can’t code.

With working AI, the same product suddenly addressed everyone. “Most people do not know how to write code,” Alexander notes dryly. That was the feature, not a bug. On October 3, 2024, they launched Bolt — same technology, new positioning. Five months later: $40M ARR. The first 1,000 users were 70% non-coders.

The lesson applies beyond Bolt. Every AI-powered category has an invisible threshold. Before it, the technology is theoretically possible but practically unusable. After it, the market explodes.

For coding, that threshold was Sonnet 3.5’s ability to write code that actually worked. For image generation, it was photorealism. For voice, it was conversational naturalness. For reasoning, it’s still being crossed — models are getting better at complex multi-step problems.

What This Means for Startups in AI

Alexander survived four years of near-failure by staying lean. He didn’t waste money trying to force product-market fit. Instead, he preserved capital and waited.

“If you’re running a startup and you are not 100% sure that you have product market fit, be very careful with your cash,” he advises. “We ran the company very lean and we were very just focused on traditional good company building, preserving cash.”

That wasn’t passive waiting. The team kept shipping, kept learning, kept positioning the product for the moment when the underlying technology caught up. When Sonnet 3.5 arrived, Bolt was ready.

For founders building in AI now, the parallel is clear: understand what technical threshold your category needs. If your product idea depends on AI being able to do X reliably, don’t assume you can force it faster than the research community can deliver it. Instead, build defensively around what the AI can do today, position for what it will do tomorrow, and be ready to move fast when the threshold flips.

The companies that will thrive in the next wave aren’t necessarily the ones with the most clever ideas. They’re the ones that survive long enough for the technology to catch up.

FAQ

When did the first AI coding assistants launch?

GitHub Copilot (2021) was early, but it assisted developers in writing faster — it didn’t enable non-coders to build. Claude 3.5 Sonnet (summer 2024) was the inflection point because it could write reliable, deployable code from scratch, which meant non-engineers could finally use AI coding tools as their primary builder.

What made Claude 3.5 Sonnet different from earlier models?

Earlier models made too many coding mistakes. They’d generate syntax errors, logic bugs, or non-functional code that needed heavy developer revision. Sonnet 3.5 crossed a reliability threshold — the code it generated actually ran and was production-ready for many use cases, removing the need for constant human fixes.

Could Bolt have launched with a different model?

Possibly with another frontier model, but it needed to cross the reliability threshold. Alexander states clearly that earlier versions of the models weren’t good enough. The timing matters because Anthropic released Sonnet 3.5 when the team had already built the Web Container infrastructure and was ready to pivot.

Why did the team wait four years instead of pivoting sooner?

The AI wasn’t ready. Adding AI to StackBlitz before Sonnet 3.5 would have created a tool that generated buggy code, requiring developers to fix it. That doesn’t solve the “non-coders can’t build” problem — it just shifts who has to do the coding. They preserved cash until the underlying technology made the pivot worth it.

Is the AI coding category saturated now?

Multiple competitors have launched (Cursor, Lovable, Vassal, Google’s Project Astra). Alexander sees Bolt as differentiated because it’s easier to use (browser-based, no installation) and more enterprise-ready (integrates design systems, handles hallucinations with custom solutions). But the category is definitely in the “land rush” phase.

What’s the next technical threshold for AI coding?

Alexander mentions hallucinations and accuracy as the current challenge. The next threshold is probably fully automated handling of enterprise code standards, design systems, and security requirements. Right now, they deploy engineers into customer companies to solve this. Once productized, that becomes a game-changer.

Will other AI models enable competing tools to catch up?

Yes. As other models improve (GPT-5, Gemini 2, future versions of Sonnet), the competitive field will expand. But Bolt’s timing advantage — launching at the threshold moment with infrastructure already built — gives them a distribution and market leadership head start.

What does this teach founders about AI timing?

Wait for the threshold, but don’t wait passively. Build around what’s possible today, position for what’s coming, and be ready to move when the underlying technology crosses the line. Conservative cash management buys the runway to make that pivot.

Full episode coming soon

This conversation with Alexander Berger is on its way. Check out other episodes in the meantime.
