In active use · 2026

Building

A practice for shipping software without reading the code. The framework is what makes that work.

Opening

Building runs on one rule. I don't read the code. Claude Code executes; I never look at what it produced. The point isn't speed. The point is to find out whether judgment alone is enough — whether doing the thinking upstream and encoding it well enough to defer to is sufficient to ship real software.

It mostly is. But the discipline produces problems the discipline itself has to solve. Every time the rule surfaces a gap — a class of bug code review would have caught, a quality decay I would have eyeballed, a coordination failure I would have flagged in a diff — the framework has to absorb it. Building is what that absorption looks like across a few hundred logged decisions, built nights and weekends.

Building isn't an orchestration harness. It's a record of the judgment automation can't make, written down so the automation has something to defer to.

Two things follow from the rule. The framework gets tuned from what I learn in the experiment, not from what I think it should be. And the judgment work has to happen early: in the spec, in contracts, and in controversy reviews that surface before Claude starts coding.

A controversy review table from one Nacre milestone. Five PRD decisions, each translated into the user complaint it would produce if it were wrong, with severity and recommended action. Build does not begin until the maker signs off on every row.
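The sign-off gate described in that table can be sketched in a few lines. This is an illustration only, not the framework's actual format: the `ReviewRow` fields and the function names are assumptions chosen to mirror the columns named above.

```python
from dataclasses import dataclass


@dataclass
class ReviewRow:
    """One row of a controversy review. Field names are hypothetical."""
    decision: str            # the PRD decision under review
    user_complaint: str      # what a user would say if the decision were wrong
    severity: str            # e.g. "low", "medium", "high"
    recommended_action: str
    signed_off: bool = False  # set by the maker, never by the agent


def unsigned_rows(rows: list[ReviewRow]) -> list[ReviewRow]:
    """Return the rows the maker has not yet signed off on."""
    return [row for row in rows if not row.signed_off]


def build_may_begin(rows: list[ReviewRow]) -> bool:
    """The gate: there must be rows, and every one must be signed off."""
    return bool(rows) and not unsigned_rows(rows)
```

An empty review fails the gate on purpose in this sketch: no rows means the review never happened, which is not the same as nothing being controversial.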

The framework grew under load

Three moments stand out. Each one is a place the rule produced a gap and the framework had to absorb the answer.

A version of the spec said "make it work on the web"

The first version of Nacre ran locally. When I decided to put it in front of users, I gave Claude Code a brief that said "retrofit this for web hosting, with auth and user isolation." Claude Code did exactly that. The product ran on the web. People could sign up. Accounts worked.

Until I shipped a server update. Then everyone's data disappeared.

The brief hadn't said "and persistent storage that survives a deploy." That requirement is so obvious to a human engineer that it doesn't feel like a requirement; it's just what hosting means. It wasn't obvious to the agent, because the agent didn't know what hosting means. It built faithfully to the spec it was given.

The framework grew a rule. Specs for hosted products have to enumerate the production realities the spec writer is taking for granted. Storage that survives. State that survives. Sessions that survive. The framework can't infer what the human assumes. Every assumption has to be written down or it isn't part of the build.
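One way to make that rule mechanical is a pre-build check on the spec itself. The sketch below assumes a spec represented as a plain dictionary with a `hosted` flag and a `survives_deploy` section; both keys, and the function name, are hypothetical and not taken from the framework.

```python
# The three realities named in the rule above.
REQUIRED_REALITIES = ("storage", "state", "sessions")


def missing_realities(spec: dict) -> list[str]:
    """List the production realities a hosted spec fails to spell out.

    A reality counts as stated only if the spec has a non-empty
    sentence for it under the (hypothetical) "survives_deploy" key.
    """
    if not spec.get("hosted", False):
        return []  # the rule applies to hosted products only
    stated = spec.get("survives_deploy", {})
    return [
        name for name in REQUIRED_REALITIES
        if not stated.get(name, "").strip()
    ]
```

A check like this only proves the assumption was written down. Whether the written statement is right is still a judgment for the spec writer.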

Bugs started showing up in unrelated places

Six milestones in a single morning, all UI on top of a stable substrate. Then a day of fixes where each fix surfaced two more issues somewhere else. Without reading the code, I couldn't trace why — only that something was getting fragile in a way the test suite wasn't catching.

The framework grew a senior development manager role with the authority to halt a milestone. It assesses fragility at every trigger point, not just at milestone boundaries. The default used to be ship more, faster. Now there's a second answer: stop building features and refactor the seams the next milestone is going to need.

The product was losing its visual language

Six consecutive milestones added UI. By the sixth, the same brain color lived in three different places — a CSS variable, a JavaScript color table, an inline style — none of them in agreement. Each individual drift was a small decision an agent made in isolation. Together they were a product that no longer knew what it looked like.

The framework grew a design specialist whose job is to enforce visual tokens across the codebase. Existing agents got rewired around it: the product agent describes intent only, never specific values; the task agent treats a hardcoded color as a scope violation when a token file exists.
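The "hardcoded color is a scope violation when a token file exists" rule lends itself to a small check. What follows is a rough sketch under stated assumptions, not the design specialist's real implementation: it looks only for hex color literals, takes file contents as an in-memory mapping, and uses hypothetical names throughout.

```python
import re
from typing import Optional

# Matches #rgb, #rgba, #rrggbb and #rrggbbaa literals. It can also flag
# CSS id selectors made only of hex letters (e.g. "#add"), so treat hits
# as candidates for review rather than proof.
HEX_COLOR = re.compile(
    r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b"
)


def hardcoded_colors(
    files: dict[str, str], token_file: Optional[str]
) -> list[tuple[str, int, str]]:
    """Return (path, line number, literal) for each color outside the token file.

    If no token file exists, nothing is a violation, mirroring the rule above.
    """
    if token_file is None:
        return []
    violations = []
    for path, text in sorted(files.items()):
        if path == token_file:
            continue  # the token file is the one place literals belong
        for line_no, line in enumerate(text.splitlines(), start=1):
            for match in HEX_COLOR.finditer(line):
                violations.append((path, line_no, match.group()))
    return violations
```

Named colors and `rgb()` calls slip past this version. Catching them would need a real CSS and JavaScript parser, not a regular expression.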

What this looks like in practice

Building wasn't finished when Nacre started. I was defining the framework as I built the product. Each Nacre milestone was paced not just to explore the contours of Nacre but to find the shape of Building — what the framework needed to be, where its rules held, where they didn't.

Nacre got built this way. The longest continuous autonomous run was four hours; it produced six small bugs that took about twenty minutes to fix the next morning. Most milestones run shorter, most produce fewer bugs, and the pattern holds — the bugs that surface are usually small and same-day fixable.

The framework is on GitHub if you want to read it.

Closing

Building isn't slowing the work down. It's making the work legible and survivable.

There are real limits I haven't tested. Four hours of continuous autonomous build is the longest run so far. Whether the framework supports eight to ten hours is the next thing to find out.

Some of the specialists are still earning their place. The design agent works well on greenfield UI surfaces; on an established product with a settled visual language, it could do more harm than good. The framework records that as a judgment the maker still has to make per milestone — which overlays to apply, where the agent's reach is welcome, where it isn't.

Building is a working practice, not a finished framework. What's documented here is what's holding up today, including for this page. It will look different in three months because the constraint keeps producing problems faster than I can solve them — which is what the rule was meant to produce.

If you want to talk about it, get in touch.

Built with Building. The Nacre page is here.