The Two-Hour Specification

Or: What happens when you give Claude Code a methodology and step back


I've been thinking about how we write with AI.

Most AI writing tools work the same way: you select text, click a button, and your words vanish—replaced by the AI's version. If you're lucky, you get a diff view. Red lines through your words. Green lines for theirs. It looks like a merge conflict, not a manuscript.

This isn't how writers think about revision.

When I edit a sentence, I want to audition phrasing. I want to see "the project needs requires clear rules", with the old word struck through and the new one slotted in, and read it as prose, not as a diagnostic. The strikethrough and the insertion should flow together. They should feel like editorial marks on a page, not Git showing me what changed between commits.
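
To make that concrete, here's a minimal sketch of the kind of rendering I mean. The `Segment` type and `renderInline` helper are hypothetical, not real project code: deletions and insertions become editorial marks that read in sequence, not a side-by-side diff.

```typescript
// Hypothetical segment model: the writer's words and the AI's replacement
// woven into one readable sentence rather than a two-column diff.
type Segment =
  | { kind: "keep"; text: string }
  | { kind: "delete"; text: string }  // the original word, struck through
  | { kind: "insert"; text: string }; // the suggested word, marked as AI

// Render as editorial marks: <del> and <ins> flow together inside the prose.
function renderInline(segments: Segment[]): string {
  return segments
    .map((s) =>
      s.kind === "keep" ? s.text :
      s.kind === "delete" ? `<del>${s.text}</del>` :
      `<ins>${s.text}</ins>`
    )
    .join("");
}

// "the project <del>needs </del><ins>requires </ins>clear rules"
console.log(renderInline([
  { kind: "keep", text: "the project " },
  { kind: "delete", text: "needs " },
  { kind: "insert", text: "requires " },
  { kind: "keep", text: "clear rules" },
]));
```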

iA Writer understood this. Their Authorship feature lets you see who wrote what—human, AI, reference—through subtle color gradations. The metadata lives at the end of the file, invisible in the editor, stripped on export. No server calls. Everything local.
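
For illustration only, and explicitly not iA Writer's actual on-disk format: here's a sketch of the general shape, with attribution ranges in a trailing block that the editor hides and export strips. The marker, the `splitDocument` helper, and the range schema are all my own invention.

```typescript
// Sketch of the general idea only (not iA Writer's actual format).
// Attribution ranges live in a trailing comment block: invisible while
// writing, kept local, and stripped before the text leaves the machine.
const AUTHORSHIP_MARKER = "\n<!-- authorship\n";

interface AuthorshipRange {
  start: number;              // character offset into the body text
  end: number;
  author: "human" | "ai" | "reference";
}

function splitDocument(raw: string): { body: string; ranges: AuthorshipRange[] } {
  const at = raw.indexOf(AUTHORSHIP_MARKER);
  if (at === -1) return { body: raw, ranges: [] };
  const block = raw.slice(at + AUTHORSHIP_MARKER.length).replace(/-->\s*$/, "");
  return { body: raw.slice(0, at), ranges: JSON.parse(block) };
}

// Export keeps only the prose; the metadata never appears in the output.
function exportForPublishing(raw: string): string {
  return splitDocument(raw).body;
}
```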

I wanted that, but for inline paraphrasing.


The Experiment #

Instead of building the thing, I decided to try something: feed Claude Code a methodology and see what happened.

The methodology was GitHub's Spec-Driven Development manifesto—412 lines defining a simple inversion: specifications don't serve code. Code serves specifications. Write your specs first, make them precise enough to be executable, then generate code as the "last mile."

I gave Claude Code three things:

  1. Research on iA Writer's approach
  2. A product requirements document with hard constraints
  3. The SDD methodology document

Then I created a task structure with five steps: extract requirements, define contracts, create plans, write tests, review.

And I stepped back.


What Happened #

Two hours later, Claude Code had produced 4,710 lines of specifications across 20 files. Requirements documents. Data contracts. Implementation plans with rationale. Test scenarios.

Zero lines of implementation code.

The project was, in a meaningful sense, 90% complete—without a single function being written.

Here's what the timeline looked like: each step built on the previous. Each commit was self-contained. The work progressed methodically, without me telling Claude Code what to do next.


The Transparency Choice #

One moment stands out.

During the research phase, Claude Code investigated what happens when you label content as AI-generated. The findings weren't encouraging. Studies from Mozilla Foundation, MIT, and others suggest that AI labels can actually undermine reader trust. People assume the labeled content is lower quality, even when it isn't.

The pragmatic choice would have been to skip the feature. Why build something that might make readers trust you less?

But I chose to implement it anyway.

Not because the research was wrong, but because transparency felt more important than optimization. If AI helped shape a sentence, readers deserve to know. Even if that knowledge makes them more skeptical. Especially then.

Claude Code recorded this decision: "IMPLEMENT despite trust concerns—transparency as values decision."

This is the kind of choice AI can't make for you. The methodology constrained how Claude Code worked, but the why remained human.


What Claude Code Did Autonomously #

Once the methodology was loaded and the scope defined, Claude Code applied the principles systematically:

Architecture emerged from requirements. From a single document, it identified five independent components. Each became its own specification with its own contracts and tests.

Ambiguities were marked, not assumed. When Claude Code encountered unclear requirements—how to handle punctuation, what happens when you edit at the boundary of AI text—it didn't guess. It marked them and resolved them in a review pass.

Technology choices came with rationale. Every library choice traced back to a requirement. Faster option over slower. Simpler over complex.

Over-engineering was prevented by constraint. The methodology included gates—checks that must pass before code is written. Too many dependencies? Rejected. Building for hypothetical future requirements? Rejected. The constraints kept the specs lean.
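
To give a flavor of what such a gate might look like if you expressed it in code rather than prose: the real gates are checklist-style checks in the methodology, and the thresholds and field names below are invented for this sketch.

```typescript
// Illustrative only: the actual gates are prose checks in the methodology.
interface PlanSummary {
  dependencies: string[];
  speculativeFeatures: string[]; // anything justified by "we might need it later"
}

// Returns the list of gate failures; an empty list means coding may begin.
function checkGates(plan: PlanSummary): string[] {
  const failures: string[] = [];
  if (plan.dependencies.length > 3) {
    failures.push(`too many dependencies (${plan.dependencies.length})`);
  }
  if (plan.speculativeFeatures.length > 0) {
    failures.push(`speculative features: ${plan.speculativeFeatures.join(", ")}`);
  }
  return failures;
}
```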


What I Did #

My role was smaller than I expected: point Claude Code at the inputs, define the task structure, make the transparency call.

That's it. I didn't write specifications. I didn't choose technologies. I didn't design the data model. I reviewed what Claude Code produced and occasionally nodded.

The human functioned as architect and reviewer, not implementer. The creative work was defining the problem and the constraints. The execution was... mechanical? Systematic? Something that felt like it should take weeks, compressed into an afternoon.


The Paradox #

Here's what I keep thinking about: the project is "90% complete" with zero implementation code.

The specifications are precise enough that implementation becomes translation. The acceptance criteria are explicit. The test scenarios are written. The contracts are defined. What remains is the "last mile"—turning specs into code, then verifying the code against the specs.
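
Here's the kind of acceptance criterion I mean, sketched as code rather than quoted from the specs. The `Segment` type and the round-trip property are my paraphrase of the idea, not the project's actual contract: whatever the comparison engine returns must reconstruct both texts exactly.

```typescript
// My paraphrase of the idea, not the project's actual contract.
type Segment =
  | { kind: "keep"; text: string }
  | { kind: "delete"; text: string }
  | { kind: "insert"; text: string };

type Compare = (original: string, paraphrase: string) => Segment[];

// An acceptance criterion written as a property: dropping insertions
// yields the original, dropping deletions yields the paraphrase.
function satisfiesRoundTrip(compare: Compare, original: string, paraphrase: string): boolean {
  const segments = compare(original, paraphrase);
  const left = segments.filter((s) => s.kind !== "insert").map((s) => s.text).join("");
  const right = segments.filter((s) => s.kind !== "delete").map((s) => s.text).join("");
  return left === original && right === paraphrase;
}
```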

This inverts everything I know about building software.

Traditionally, specs are scaffolding. You write them to guide development, then discard them when the "real work" of coding begins. The code becomes truth. The specs drift and die.

SDD says: specs are the asset. Code is disposable. If you need to change a requirement, you update the spec and regenerate. If you want to try a different approach, branch the spec and generate a parallel implementation. The code serves the specification, not the other way around.

I don't know if this is true. I haven't implemented paste-as-ai yet. The proof will come when I take these 4,710 lines of specs and try to make them run.

But watching Claude Code apply a methodology with this kind of rigor—watching it produce specs that feel more complete than what I'd write manually—something shifted in how I think about the work.


The Test #

The PRD included a simple success criterion:

If it feels like Git or Google Docs, you failed.

The inline paraphrasing should read as prose. The deletions and insertions should flow together. The experience should feel editorial, not technical.

This is the test for paste-as-ai. But maybe it's also the test for this way of working.

If the specifications are good enough, implementation should feel inevitable. The creative work is done. What remains is translation.

If the specifications are insufficient, implementation will reveal it. Edge cases will surface. Assumptions will break. And then the feedback loop: update the spec, regenerate the code, try again.

Either way, the spec remains the source of truth. The code just... expresses it.

That's 4,710 lines betting on this idea.


What Happens Next #

The implementation task is ready. Five components to build.

I'll document what happens when specs meet reality. Whether 4,710 lines of specifications translate to working code, or whether they expose gaps that two hours of autonomous specification couldn't anticipate.

For now, I have a project that's 90% complete and a question I can't answer yet:

Is this the future of building software? Or just an unusually rigorous way to procrastinate on actually writing code?

I'll let you know.


Addendum: The Last Mile #

Same day, 30 minutes later.

Implementation began. Two components—the comparison engine and the editor—went from specification to working code in about half an hour.

The numbers came out to a ratio of roughly 4.6 to 1. For every line of code, there were almost five lines of specification defining what it should do.

I watched it happen. The implementation felt... inevitable. Every function had a specification. Every edge case had a test scenario. The decisions were already made. What remained was transcription.
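
To show what "transcription" can look like, here's a rough, hand-written sketch of a word-level comparison engine built on a longest-common-subsequence diff. It's my guess at the smallest version of the idea, not the code Claude Code actually generated.

```typescript
type Segment =
  | { kind: "keep"; text: string }
  | { kind: "delete"; text: string }
  | { kind: "insert"; text: string };

// Word-level diff via longest common subsequence. Tokens keep their
// trailing whitespace so the segments re-join into readable prose.
function wordDiff(original: string, paraphrase: string): Segment[] {
  const a = original.split(/(?<=\s)/);
  const b = paraphrase.split(/(?<=\s)/);
  const m = a.length;
  const n = b.length;

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: m + 1 }, () => new Array<number>(n + 1).fill(0));
  for (let i = m - 1; i >= 0; i--) {
    for (let j = n - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j] ? lcs[i + 1][j + 1] + 1 : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table, emitting keeps, deletions, and insertions in order.
  const segments: Segment[] = [];
  let i = 0;
  let j = 0;
  while (i < m && j < n) {
    if (a[i] === b[j]) {
      segments.push({ kind: "keep", text: a[i] });
      i++; j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      segments.push({ kind: "delete", text: a[i] });
      i++;
    } else {
      segments.push({ kind: "insert", text: b[j] });
      j++;
    }
  }
  while (i < m) segments.push({ kind: "delete", text: a[i++] });
  while (j < n) segments.push({ kind: "insert", text: b[j++] });
  return segments;
}

console.log(wordDiff("the project needs clear rules", "the project requires clear rules"));
```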

This is what SDD promised. I didn't know if I believed it until I saw it.

The project isn't finished. Three more components to go. But the hardest part wasn't writing code. It was writing specifications precise enough that the code could write itself.

That's the inversion. That's what two hours of specification buys you: thirty minutes of implementation.
