A Staircase Into the Ceiling: What the Winchester House Teaches Us About Vibe Coding
In San Jose, California, there's a mansion with a staircase that climbs seven flights and ends flat against the ceiling. There's a door on the second floor that opens to a two-story drop into the garden. A cabinet runs through thirty rooms. At one point there were as many as 500 rooms, 2,000 doors, 10,000 windows — and one shower. If you've inherited a vibe-coded codebase, none of this should surprise you.
The Winchester Mystery House was built by Sarah Winchester, widow of the firearms magnate, from 1886 until her death in 1922. She inherited a fortune worth over $500 million today and used it to hire carpenters to work around the clock for nearly four decades. She designed the rooms herself, one at a time, with no master plan. Legend says she held nightly séances to receive the next day's building instructions from spirits. The workforce never questioned the specs – they just built whatever was asked.
The result was a structure that grew enormous and became worthless. Sarah spent $5.5 million on the house. After her death, it was appraised at $5,000. In 1975, workers discovered a sealed room nobody knew existed — two chairs and a phonograph behind a locked door, walled off and forgotten as construction continued around it.
The parallels to vibe coding aren't subtle. An LLM generates whatever you ask for without questioning whether it's structurally sound or reasonable, wasteful or wrong-headed. The cost of adding more is nearly zero, so there's no natural friction from resource constraints that forces you to ask whether you should. And because nobody fully understands what was generated three prompts ago, or why, everything becomes load-bearing by default. Ignoring technical debt becomes easy because you aren't the one interacting with it directly, and the AI isn't incentivized to stop and pay it down on its own. The forgotten room behind the locked door? That's the module nobody remembers writing, walled off by newer code, still sitting in production.
The Winchester House had 47 fireplaces but only 17 chimneys. It was vast, ornate, and structurally incoherent. It demos beautifully — tourists have visited for over a century — but nobody could live in it as a functioning home. Sound familiar?
The lesson isn't "don't use AI to write code." The lesson is that unlimited resources without an accountability system produce the same result every time: something impressive from the outside that's uninhabitable from within. The tool isn't the problem. The absence of someone willing to say "this staircase goes nowhere, tear it out" is the problem.
Sarah Winchester had the resources, the labor, and the inspiration to build forever. What she didn't have was anyone who told her no. If your codebase has the same arrangement, you're not building software. You're building a mystery house that depreciates precipitously. Work with your team or with a properly guided AI agent to pause and consider the long-term cost when the immediate cost is zero.

Psychological Safety Is the Killer Feature of AI Coding
There's an engineering system that I don't really understand. After more than 15 years in the software engineering industry, and despite patient explanations from generous coworkers along the way, I still truly do not understand DNS. I'm honestly embarrassed to even say it here, but it's an example of a topic that carries baggage for me.
If you're a Director of Engineering, the unspoken expectation from everyone around you is that you already know. You've been doing this long enough, and you should have picked it up by now. So when I have questions about DNS, or am unsure whether the Cloudflare proxy should be on or off, I don't ask my team. I'm too bashful either to reveal that I don't know or to make someone repeat an explanation I claimed to understand the last time.
Google's Project Aristotle, launched in 2012, identified psychological safety (feeling safe to take risks, speak up, admit mistakes, ask questions) as the single most important factor distinguishing high-performing teams from the rest. When coding with an AI agent, I don't only benefit from the code it writes but also from the psychological safety to ask any question I have. No fear of a raised eyebrow or recalibration of my competence. No "I already explained this." Just an answer, and then another if I need it rephrased, and another if I want to go deeper. Claude Code doesn't have any expectations of me – I'm just a person trying to understand something.
That safety has made me more confident and more capable. I've learned more in the last year than in the previous five — not because the information wasn't available, but because the perceived relational and emotional cost of retrieving it dropped to zero. The information was always available to me from generous co-workers willing to explain, so the bottleneck was never access. It was the fear of what asking would say about me. I wonder how much knowledge goes unshared because someone is afraid to ask. Asking another human is always the richest experience, not only because you have an opportunity to connect but because there's also a good chance you'll teach them something in the process. But for the other times, the psychological safety of having Claude Code tell you for the fifth time how TTL caching works is a huge gift.
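For the curious, here is the shape of the answer that finally stuck: a TTL cache stores each value with an expiry time and refuses to serve it once that moment has passed. A minimal sketch in Python (illustrative only, not any particular library's API):

```python
import time

class TTLCache:
    """A toy TTL cache: each entry expires ttl_seconds after it's stored."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        # Stamp the entry with the moment it stops being trustworthy.
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # stale: evict and report a miss
            return default
        return value
```

Real caches layer on size limits and eviction policies, but the expiry check above is the whole idea.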
Potemkin Software
In 1787, Grigory Potemkin needed to impress Catherine the Great during her tour down the Dnieper River to newly annexed Crimea. The popular version of the story claims that he built fake portable building facades and had them moved along the route ahead of her to make it look like the countryside was dotted with beautiful, populated, established villages. Though the story was likely an exaggeration spread by his political rivals, the reality is almost more interesting: Potemkin did decorate real settlements, stage elaborate spectacles, and dress things up dramatically for the empress's visit. The buildings were real. They just weren't as finished as they looked.
I keep thinking about this story when I watch what's happening with AI-generated software. We're not building fake software. We're building real software that isn't what it appears to be. And most people can't tell the difference.
AI lets anyone build. You describe what you want, and something appears — a dashboard, a workflow tool, a prototype that looks like the real thing. As a proof of concept, this is extraordinary. Going from idea to working prototype in an afternoon, testing whether a concept has legs before investing months of engineering — that's a genuine superpower.
The problem isn't the proof of concept. The problem is that it no longer looks like one.
What AI excels at is generating the visible layer — the part that faces the user, the part that gets screenshotted and pasted into a pitch deck. A proof of concept used to look rough, obviously incomplete. Now it arrives with polished UI, smooth transitions, and a settings page.
In his 2020 memoir A Promised Land, Barack Obama identifies a similar dissonance in his own writing: "I still like writing things out in longhand, finding that a computer gives even my roughest drafts too smooth a gloss and lends half-baked thoughts the sheen of tidiness." He was talking about prose, but the observation scales. AI-generated software gives the roughest of drafts too smooth a gloss.
Two Biases That Compound Each Other
There's a concept called "completion bias" — the tendency to mistake the appearance of a finished thing for an actually finished thing. We see a polished UI and our brain checks the box. Done. Shipped. Next.
But there's a second force at work: the IKEA effect. People dramatically overvalue things they had a hand in building. You assemble a flat-pack bookshelf and you think it's beautiful — you don't notice the wobble because it's yours.
AI-generated software triggers both simultaneously. You prompted the tool. You described what you wanted. You iterated on it. By the time you have a working prototype, you feel like you built it — because you did. The IKEA effect tells you it's precious. The completion bias tells you it's done. Together they erode the discernment needed to sense the delta between a proof of concept and a production-ready product. That delta is where authentication edge cases, security and privacy vulnerabilities, error handling, data integrity, and graceful failure all live — the 90% of the work that a demo never shows you.
The emotional experience of creating something with AI feels exactly like the emotional experience of finishing something. And that feeling is a liar.
What Actually Helps
AI is extraordinarily good at proving concepts, and we should celebrate this. But we need a hard, bright line between proving a concept and shipping a product. The proof of concept is the beginning of the work, and not even close to the end.
There's an old adage in software: the first 80% of the work takes 20% of the time, and the last 20% takes the remaining 80%. With AI coding, this ratio has gotten more extreme – the first 90% now takes 5% of the time. The last 10% takes 95%. Plan for this — the speed of the prototype isn't evidence that production will be fast. It's a warning that the hardest work hasn't started.
Software engineers can capture this moment by taking the opportunity to educate newly empowered non-engineers on the complexity of building production-ready, safe, durable products. Non-engineers can learn to resist the urge to ship the prototype and instead treat it as what it is — a brilliant starting point that has revealed how much work remains. And everyone on the team can commit to asking the uncomfortable question before anything goes live: what happens when someone actually tries to live in this building?