Vibe Coding Won't Fix Your Architecture: What AI Actually Changes in Software Engineering

Summary

AI coding tools accelerate the easy parts of software development (boilerplate, scaffolding, standard patterns) while leaving the hard engineering problems untouched or making them actively harder. The "vibe coding" movement and the organizational "AI psychosis" Mitchell Hashimoto identified are two symptoms of the same misalignment: confusing typing speed with engineering judgment. The practical outcome mirrors every previous silver-bullet cycle in our industry. You still need engineers who understand systems.

Background & Context

Something shifted in late 2024. Developers who spent years writing boilerplate, fighting configuration files, and Googling error messages suddenly had tools that could generate working code from a natural language description. In early 2025 Andrej Karpathy coined the term "vibe coding" to describe this workflow: describe what you want, iterate on the model's output until it runs, ship it. His talk on software in the AI era laid out the broader framing. The traditional loop of think, write, test, debug is being compressed into describe, generate, verify, adjust.

A second shift happened alongside the productivity gains. Mitchell Hashimoto posted that he believes "there are entire companies right now under AI psychosis," organizations making strategic decisions about headcount and process based on an inflated model of what AI can actually do. AWS CEO Matt Garman publicly called replacing junior staff with AI "the dumbest thing I've ever heard." Individual developers report a dependency pattern that one blogger compared to a toxic ex: you keep coming back despite the problems because the productive moments feel too good to give up.

The community is split. A fly.io blog post argued that AI skeptics are "nuts" for dismissing the tools entirely. Meanwhile, developers who shipped side projects over a weekend wonder why everyone doesn't work this way. Both sides are partially right. Both are missing the structural issue.

Technical Deep Dive

Vibe coding works because of what large language models are genuinely good at: pattern matching over massive code corpora. When you prompt an LLM to "build a React component that displays a paginated list of users fetched from a REST API," the model has seen thousands of similar implementations in its training data. It produces working code because the problem space is well-represented. The dev.to 3D printing analogy captures this precisely: vibe coding is to software engineering what 3D printing is to manufacturing. Rapid prototyping. You get shape quickly, iterate on form, validate the idea before committing to production tooling.

The breakdown happens at the boundaries of the training distribution.

Consider what actually makes software engineering hard. System design decisions with competing constraints: your service needs to be eventually consistent, handle 10K requests per second, and maintain sub-50ms p99 latency. The LLM can generate a Redis caching layer or a Kafka pipeline. It cannot weigh the operational burden of Kafka against the simplicity of Redis streams for your specific team size and on-call rotation. It generates what's common, not what's correct for your context.

Debugging emergent behavior in distributed systems. When your production service starts returning 500s at 2 AM and the logs show nothing useful, the LLM cannot help. It wasn't there. It doesn't know your deployment history, your recent config changes, or the subtle interaction between your connection pool settings and the database's idle timeout. This is the core observation from Praveen's post: AI tools generate more code faster, which means more code you didn't write, more code you don't fully understand, and more surface area for bugs you can't diagnose by reading the source.
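The pool and idle-timeout interaction mentioned above is one of the few hazards of this kind that can be written down as a check once someone knows to look for it. A minimal sketch, assuming hypothetical setting names (`pool_recycle_seconds`, `db_idle_timeout_seconds`) and an arbitrary 30-second safety margin; real connection pools and databases name and define these settings differently:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolConfig:
    # Hypothetical names for illustration; not taken from any specific library.
    pool_recycle_seconds: int     # client replaces connections older than this
    db_idle_timeout_seconds: int  # server drops connections idle longer than this


def stale_connection_risk(cfg: PoolConfig, margin_seconds: int = 30) -> bool:
    """Return True if the server may drop a connection before the pool recycles it."""
    return cfg.pool_recycle_seconds + margin_seconds > cfg.db_idle_timeout_seconds


risky = PoolConfig(pool_recycle_seconds=3600, db_idle_timeout_seconds=600)
safe = PoolConfig(pool_recycle_seconds=300, db_idle_timeout_seconds=600)

print(stale_connection_risk(risky))  # True
print(stale_connection_risk(safe))   # False
```

The check itself is trivial. The hard part is knowing that these two numbers, which usually live in different config files owned by different people, need to be compared at all.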

Maintaining and evolving large codebases over time. A side project built in a weekend with AI assistance works. A production system that needs schema migrations, backward compatibility, deprecation cycles, and incremental refactoring across a team of 15 engineers is a different problem. The LLM generates code that works now. It has no model of the codebase's history, no understanding of why a seemingly redundant abstraction exists (it's there for a planned migration six months out), and no ability to reason about what happens when you change a shared interface that three other teams depend on.

The "AI slop" problem compounds this. When developers accept generated code without understanding it, they introduce patterns and abstractions that look reasonable in isolation but don't fit the system's architecture. I've seen codebases where three different error handling strategies coexist because each generated snippet used a different convention. One module wraps everything in try-catch with generic error objects. Another uses Result types. A third throws custom exceptions with numeric codes. All three patterns work individually. They fail in composition. When an error propagates across module boundaries, the calling code doesn't know what shape to expect.
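A small sketch of the three conventions side by side makes the composition problem concrete. The function names, the `Result` class, and the error code are all invented for illustration and are not taken from any real codebase:

```python
from dataclasses import dataclass
from typing import Optional


# Convention 1: catch everything and return a generic error object.
def load_profile(user_id: int) -> dict:
    try:
        if user_id < 0:
            raise ValueError("negative id")
        return {"id": user_id}
    except Exception as exc:
        return {"error": str(exc)}


# Convention 2: a Result-style return value.
@dataclass
class Result:
    value: Optional[dict] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def load_orders(user_id: int) -> Result:
    if user_id < 0:
        return Result(error="negative id")
    return Result(value={"orders": []})


# Convention 3: a custom exception carrying a numeric code.
class BillingError(Exception):
    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def load_invoice(user_id: int) -> dict:
    if user_id < 0:
        raise BillingError(4001, "negative id")
    return {"invoice": None}


def build_dashboard(user_id: int) -> dict:
    """The caller must know all three failure shapes to report errors correctly."""
    errors = []
    profile = load_profile(user_id)
    if "error" in profile:           # shape 1: a key in a dict
        errors.append(profile["error"])
    orders = load_orders(user_id)
    if not orders.ok:                # shape 2: Result.error
        errors.append(orders.error)
    try:
        load_invoice(user_id)
    except BillingError as exc:      # shape 3: an exception with .code
        errors.append(f"{exc.code}: {exc}")
    return {"errors": errors}


print(build_dashboard(-1))
# {'errors': ['negative id', 'negative id', '4001: negative id']}
```

Each loader is defensible on its own. The cost shows up in `build_dashboard`, where a caller that forgets any one of the three checks silently drops that failure.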

The dependency cycle works like this. You use AI to generate a feature quickly. It works on the happy path. You hit an edge case. You ask the AI to fix it. It generates a patch. The patch introduces a subtle regression elsewhere because the model has no global view of the codebase. You ask for another fix. Each iteration adds code you haven't carefully reviewed. The codebase grows in size without growing in coherence. You're productive in the short term and accumulating technical debt at an accelerating rate. One developer described this as the "toxic ex" pattern: the tool keeps hurting you, but the moments of productivity feel too good to walk away from.

The context window limitation is the technical root of most of these failures. Even with a 128K or 200K token context, the model sees a slice of your codebase, not the whole system. It cannot track the implicit invariants that hold a codebase together: "the payment service always retries idempotently on a 409," "the user service never returns soft-deleted records without the include_deleted flag," "the notification queue expects events in a specific Avro schema version." These are the constraints that make software work in production. They live in documentation, PR discussions, runbooks, and the heads of senior engineers. They do not fit in a prompt.
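To make the second invariant concrete, here is a sketch of what it looks like once it is written into code and tests rather than held in someone's head. The `User` type, the in-memory `_USERS` store, and the `list_users` function are hypothetical stand-ins for a real user service; only the `include_deleted` flag comes from the example above:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class User:
    id: int
    name: str
    deleted: bool = False


# Hypothetical in-memory store standing in for the user service's database.
_USERS = [User(1, "ada"), User(2, "grace", deleted=True)]


def list_users(include_deleted: bool = False) -> List[User]:
    """Return users; soft-deleted records appear only when explicitly requested."""
    users = [u for u in _USERS if include_deleted or not u.deleted]
    # The invariant, checked at the service boundary rather than assumed.
    if not include_deleted and any(u.deleted for u in users):
        raise AssertionError("soft-deleted user returned without include_deleted")
    return users


def test_soft_deleted_hidden_by_default() -> None:
    assert all(not u.deleted for u in list_users())


def test_soft_deleted_visible_when_requested() -> None:
    assert any(u.deleted for u in list_users(include_deleted=True))


test_soft_deleted_hidden_by_default()
test_soft_deleted_visible_when_requested()
print([u.name for u in list_users()])  # ['ada']
```

Writing an invariant down this way protects it from any contributor, human or model, but someone still has to know the invariant exists before it can be encoded.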

Comparison & Analysis

This pattern maps closely onto the offshore outsourcing wave of the early 2000s. The parallels are specific enough to be instructive.

In 2003, organizations believed they could replace local developers with offshore teams at a fraction of the cost. The pitch was compelling on a spreadsheet. What actually happened was more nuanced. Well-defined, self-contained tasks (data entry screens, report generators, CRUD applications) could be outsourced effectively. Tasks requiring deep understanding of business context, system architecture, or cross-team coordination could not.

The communication overhead was the killer. A McKinsey study from that era found that outsourced projects had 30-40% higher defect rates when requirements were ambiguous, which they almost always were. A senior engineer who could specify a feature precisely enough for an offshore team to implement it correctly could often implement it themselves in less total time. The savings came from delegating the typing, not the thinking.

AI coding tools have the same profile. They're effective for well-specified, self-contained tasks where the solution pattern is well-established. They fail on tasks requiring deep contextual understanding, architectural judgment, or cross-system reasoning. The savings come from automating the keystrokes, not the engineering decisions.

The organizational psychosis mirrors the outsourcing psychosis too. Companies that fired their junior developers and outsourced everything discovered, five years later, that they had no senior developers either. Senior developers come from junior developers who grew into the role through years of working on production systems, making mistakes, and learning the craft. AWS's Garman made exactly this point: if you replace junior staff with AI, you eliminate the pipeline that produces senior engineers. You save money this quarter and destroy capability next decade.

There is a critical difference. Offshore teams could learn, ask clarifying questions, and develop institutional knowledge over time. A developer in Bangalore who has worked on your system for two years understands its quirks. LLMs cannot do this. Every session starts from near-zero context. The model doesn't remember that you prefer composition over inheritance, that your team has a specific error handling convention, or that the payment module has a non-obvious constraint around idempotency keys. You can add this to the system prompt, but you're now doing the work of specifying the context that the "automation" was supposed to save you from specifying. The specification problem doesn't disappear. It moves.

The no-code/low-code movement followed a similar arc. Tools like Bubble, Retool, and OutSystems promised that non-technical users could build production applications without writing code. They work well for internal tools with standard CRUD patterns. They break when you need custom business logic, complex integrations, or performance optimization at scale. The same boundary keeps appearing: tools that automate the easy parts hit a wall at the hard parts, and the hard parts are where engineering value actually lives.

Practical Implications

Where AI coding tools deliver real value:

Generating boilerplate and scaffolding. If you need a standard CRUD API with validation, authentication middleware, and database migrations, an LLM produces a solid first draft in seconds. This is a genuine productivity gain, not an illusion.

Exploring unfamiliar libraries and APIs. Instead of reading sparse documentation, you can ask the model to generate examples and explain interfaces. Particularly valuable for APIs with poor docs or non-obvious usage patterns.

Writing tests for well-understood behavior. Given a function with clear inputs and outputs, LLMs generate thorough test suites including edge cases you might not think of immediately. The model has seen more test patterns than any individual developer.

Prototyping and proof-of-concept work. When you need to validate an idea quickly, vibe coding is the right approach. Ship the prototype, learn from user behavior, then decide whether to rebuild it properly. The Open Vibe project from Wasp targets exactly this use case: get a SaaS MVP out the door with AI assistance, accept the technical debt as a deliberate trade-off.

Where AI coding tools create net-negative outcomes:

Architecture and system design. The model optimizes for common patterns, not your specific constraints. Use it to explore options and generate alternatives. Do not use it to make binding decisions.

Debugging production incidents. The model wasn't there. It doesn't have the context. Time spent prompting it is almost always better spent reading logs, checking metrics, and tracing requests through your observability stack.

Accepting AI-generated output without rigorous review. If you're accepting generated code, you need to review it as carefully as you'd review a junior developer's first PR. The model makes mistakes a human never would (calling non-existent API methods, inventing configuration options that don't exist, mixing incompatible library versions) alongside mistakes that look human (off-by-one errors, missing null checks, race conditions in concurrent code). The non-human mistakes are actually harder to spot because they don't follow the patterns you've learned to catch in code review.
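One class of non-human mistake, calls to functions that do not exist, can be partly caught mechanically. The following is a minimal sketch using only Python's standard `ast` and `importlib` modules. The helper name `missing_module_attributes` is invented here, and the check covers only plain `import x` statements with direct `x.attr` access, so it complements review rather than replacing it. Note that it imports every module named in the snippet, so it should only be run against modules you already trust:

```python
import ast
import importlib
from typing import List


def missing_module_attributes(source: str) -> List[str]:
    """List `module.attr` references where `attr` is absent from the imported module.

    Handles only `import x` / `import x as y` and direct `y.attr` access.
    """
    tree = ast.parse(source)
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Skip dotted imports such as `import os.path` to keep the sketch simple.
                if "." not in alias.name:
                    aliases[alias.asname or alias.name] = alias.name

    missing = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            module_name = aliases[node.value.id]
            module = importlib.import_module(module_name)
            if not hasattr(module, node.attr):
                missing.append(f"{module_name}.{node.attr}")
    return sorted(set(missing))


generated = """
import json
import math

data = json.loads("[1, 2, 3]")
total = math.fsum(data)
text = json.to_string(data)
"""

print(missing_module_attributes(generated))  # ['json.to_string']
```

Here `json.to_string` is flagged because the standard `json` module has no such function, while `json.loads` and `math.fsum` pass. A type checker or linter does this far more thoroughly; the sketch only shows that this category of error is detectable without reading the code line by line.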

For organizations, the implication is straightforward. Don't restructure your engineering team around AI tooling. Invest in AI tools to make your existing team more effective at the tasks where they help, and be honest about where they don't. If you're considering reducing junior headcount because "AI can do that work," you're making the same mistake companies made with outsourcing twenty years ago. You're trading short-term cost savings for long-term capability destruction. The juniors you don't hire today are the seniors you can't promote in five years.

For individual developers, the practical guidance is to treat AI tools like a very fast, very prolific junior developer who has read every programming book ever published but has never worked on your codebase, with your team, or in your domain. Use them for what they're good at. Review everything they produce. Don't let the speed of generation substitute for the speed of understanding.

The developers who will thrive in this environment are the ones who can hold two ideas at once: AI tools are genuinely useful for specific tasks, and AI tools cannot do the hard parts of engineering. The psychosis comes from believing only one of these. The skeptics miss the real productivity gains. The enthusiasts miss the accumulating debt. The engineers who see both clearly will build the systems that actually work.

References

"Does AI Behave Like a Toxic Ex?" — https://dev.to/konark_13/does-ai-behave-like-a-toxic-ex-498n
"You don't 3D print a house. You print your tools." — https://dev.to/aws-heroes/you-dont-3d-print-a-house-you-print-your-tools-2h00
"AI slop is everywhere. Here's what I keep coming back to." — https://dev.to/marvsonhelbs/ai-slop-is-everywhere-heres-what-i-keep-coming-back-to-1i42
"AI Didn't Make Software Engineering Easier. It Made the Hard Parts Harder." — https://dev.to/iampraveen/ai-didnt-make-software-engineering-easier-it-made-the-hard-parts-harder-39n4
"Open Vibe — Ship your SaaS with AI. Without getting stuck." — https://dev.to/wasp/open-vibe-ship-your-saas-with-ai-without-getting-stuck-e2h
"How I Stopped Despairing Over the Backyard Mess and Started an AI Side Project" — https://dev.to/cathylai/how-i-stopped-despairing-over-the-backyard-mess-and-started-an-ai-side-project-3f9a
"My AI skeptic friends are all nuts" (fly.io) — https://fly.io/blog/youre-all-nuts/
Mitchell Hashimoto on AI psychosis — https://twitter.com/mitchellh/status/2055380239711457578
"AWS CEO says using AI to replace junior staff is 'Dumbest thing I've ever heard'" — https://www.theregister.com/2025/08/21/aws_ceo_entry_level_jobs_opinion/
"Andrej Karpathy: Software in the era of AI" — https://www.youtube.com/watch?v=LCEmiRjPEtQ
