Spec-Driven Development
Advanced · ~18 min
What you'll learn
- Decide when a feature deserves a spec instead of a conversation
- Run the Spec Kit workflow from constitution to implementation
- Use the spec as the agent-legible source of truth across sessions
Two developers. Same feature request: add CSV export to the reporting dashboard.
Developer A opens a chat and starts typing. Forty messages and two days later, the feature half-works, the schema differs from what the backend team agreed on, and nobody can remember which constraints were decided versus suggested. The agent implemented what the last message said, not what the project actually needs. Onboarding a second developer means scrolling a chat log.
Developer B runs /speckit.specify. In twenty minutes, she has a reviewed spec.md in version control — a document that answers “what are we building and why” well enough that any agent, any teammate, and future-her can re-read it and pick up exactly where the last session left off. She runs /speckit.plan, reviews the technical approach, runs /speckit.tasks, and the agent starts building against a stable target.
This is the difference between vibe coding at scale and spec-driven development.
Why specs beat conversations for big work
Module 10, Lesson 1 introduced the 60-second planning exercise — answer “what am I building, for whom, and what does done look like” before opening the terminal. Spec-driven development is that discipline systematized into committed artifacts.
Three properties make a spec better than a conversation thread for any feature that spans more than one session:
1. Context windows expire; files do not. A conversation is ephemeral. As Lesson 2 explains, quality degrades as the window fills, and a new session starts with zero memory of the old one. A spec file committed to the repo is re-read on every new session. The agent’s working memory becomes the file, not the chat scrollback.
2. Specs are team-reviewable. A chat thread is a single-player medium. A spec.md in a PR is a first-class artifact your team can comment on, approve, or push back on before any code is written. Design disagreements surface at the cheapest possible moment.
3. Specs are agent-legible. Markdown with structured sections (goals, constraints, user stories, non-goals) is exactly what a language model reads well. Pointing the agent at @specs/042-csv-export/spec.md at the start of a session is more reliable than paraphrasing requirements in a new message and hoping the model reconstructs your intent correctly.
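One way to make "structured sections" concrete is a small check that a spec contains the headings an agent will look for. This is a minimal sketch, assuming Markdown `#` headings and a hypothetical list of required section names; Spec Kit's own template may name its sections differently.

```typescript
// Hypothetical section names; adjust to match your own spec template.
const REQUIRED_SECTIONS: string[] = ["Goals", "Non-goals", "Acceptance criteria"];

// Return the required sections that never appear as a Markdown heading.
// Matching is case-insensitive and ignores the heading level.
function missingSections(
  specMarkdown: string,
  required: string[] = REQUIRED_SECTIONS
): string[] {
  const headings: string[] = specMarkdown
    .split("\n")
    .map((line) => /^#{1,6}\s+(.*\S)\s*$/.exec(line))
    .filter((match): match is RegExpExecArray => match !== null)
    .map((match) => match[1].toLowerCase());

  return required.filter((name) => headings.indexOf(name.toLowerCase()) === -1);
}
```

A check like this could run in CI on any PR that touches a spec directory, so a spec with no non-goals is caught before review rather than during it.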
The Spec Kit workflow
Spec Kit (GitHub; ~111k stars; 30+ agents supported including Claude Code, Codex CLI, Antigravity CLI, Cursor, and Copilot) provides a set of slash commands that guide you through spec-driven development phase by phase. The commands below are verified against the project README as of June 2026.
Spec Kit's CLI is installed via uv:
    uv tool install specify-cli --from git+https://github.com/github/spec-kit.git@vX.Y.Z
Replace vX.Y.Z with the latest release tag shown on the GitHub releases page. Once installed, the /speckit.* commands are available inside Claude Code, Codex CLI, Antigravity CLI, and the other supported agents.
The workflow runs in seven phases. Here is each one illustrated against the running example — adding CSV export to a reporting dashboard.
Phase 1 — Establish principles: /speckit.constitution
What it does: Creates or updates .specify/memory/constitution.md — a project-level document capturing governing principles: tech stack, architectural constraints, non-negotiables, team conventions.
Artifact: .specify/memory/constitution.md
You run this once when Spec Kit is introduced to a project, and update it when foundational decisions change. The constitution is read by every subsequent command; it ensures the plan the agent produces is consistent with the project’s actual constraints.
    /speckit.constitution
The agent prompts you with questions about your stack, conventions, and hard constraints. For a Node/TypeScript project: “TypeScript strict mode, no any; REST API using { data, error, meta } envelope; all monetary values in cents as integers; never touch src/legacy/v1-compat.ts.”
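Constraints like these are easier for an agent to honor when they also exist as types. The sketch below assumes a hypothetical shape for the `{ data, error, meta }` envelope (the example constitution names the three keys but not their contents) and adds an illustrative helper for integer cents. None of these names come from Spec Kit.

```typescript
// Hypothetical envelope shape: the constitution fixes the keys, not their types.
interface ApiError {
  code: string;
  message: string;
}

interface Envelope<T> {
  data: T | null;
  error: ApiError | null;
  meta: Record<string, unknown>;
}

function success<T>(data: T, meta: Record<string, unknown> = {}): Envelope<T> {
  return { data: data, error: null, meta: meta };
}

function failure<T>(code: string, message: string): Envelope<T> {
  return { data: null, error: { code: code, message: message }, meta: {} };
}

// Convert a decimal string such as "12.34" to integer cents without any
// floating-point arithmetic. Throws on input it does not recognize.
function toCents(amount: string): number {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(amount.trim());
  if (match === null) {
    throw new Error("Invalid amount: " + amount);
  }
  const sign = match[1] === "-" ? -1 : 1;
  const whole = parseInt(match[2], 10);
  const fraction = match[3] || "";
  // Right-pad so "5" means 50 cents and "" means 0 cents.
  const cents = parseInt((fraction + "00").slice(0, 2), 10);
  return sign * (whole * 100 + cents);
}
```

With types like these in the repo, the constitution can point at them by path instead of restating the rule in prose.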
Phase 2 — Write the spec: /speckit.specify
What it does: Guides a structured conversation to produce a requirements document — the what and why of the feature, including user stories, acceptance criteria, and explicit non-goals.
Artifact: specs/{number}-{name}/spec.md
    /speckit.specify
    > Feature: CSV export for the reporting dashboard
Plausible output excerpt:
    Creating spec: specs/042-csv-export/spec.md

    Goals
    ─────
    • Users can export any filtered report view to a UTF-8 CSV file
    • Export includes all visible columns in display order
    • Files are named report-{date}-{filter-hash}.csv

    Non-goals
    ─────────
    • No Excel (.xlsx) support in this iteration
    • No scheduled / email delivery of exports

    Acceptance criteria
    ───────────────────
    • Export button visible on all report views with > 0 rows
    • Large exports (> 10k rows) stream; do not buffer in memory
    • Column headers match the display labels, not the DB column names
Commit this file. It is now the source of truth. If the requirements change, the spec changes, not the chat.
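Two of the spec items above translate almost directly into code. The sketch below is illustrative only: the spec says files are named `report-{date}-{filter-hash}.csv` but does not define the date format or the hash. The UTC `YYYY-MM-DD` date and the simple 32-bit rolling hash are assumptions, as are all function and type names.

```typescript
// Assumption: the hash is a non-cryptographic 32-bit polynomial rolling hash
// over the sorted "key=value" pairs, rendered as 8 hex digits.
function hashFilters(filters: Record<string, string>): string {
  const canonical = Object.keys(filters)
    .sort()
    .map((key) => key + "=" + filters[key])
    .join("&");
  let hash = 0;
  for (let i = 0; i < canonical.length; i++) {
    hash = (hash * 31 + canonical.charCodeAt(i)) >>> 0; // keep within uint32
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Assumption: {date} means the export day in UTC, formatted YYYY-MM-DD.
function exportFilename(date: Date, filters: Record<string, string>): string {
  const day = date.toISOString().slice(0, 10);
  return "report-" + day + "-" + hashFilters(filters) + ".csv";
}

interface ExportColumn {
  dbName: string; // storage name, never shown to the user
  label: string;  // display label, used for the CSV header
}

// "Column headers match the display labels, not the DB column names."
function headerRow(columns: ExportColumn[]): string[] {
  return columns.map((column) => column.label);
}
```

Sorting the filter keys before hashing means the same filters always produce the same filename, regardless of the order in which the user applied them.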
Phase 3 — Resolve ambiguities: /speckit.clarify
What it does: Reads the spec and surfaces underspecified areas as explicit questions. Answers are written back into the spec as a Clarifications section.
Artifact: Clarifications section appended to specs/{number}-{name}/spec.md
    /speckit.clarify
For the CSV export, it might surface: “The spec says ‘all visible columns’ — does this include columns the user has hidden via column picker? Does it include computed columns not stored in the DB?” Running this before the technical plan prevents ambiguities from becoming defects.
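Once those two questions are answered, the answers can live in code as explicit options rather than implicit behavior. This is a hypothetical sketch; the type and field names are invented for illustration, and the policy values stand in for whatever the team records in the Clarifications section.

```typescript
interface ReportColumn {
  label: string;
  hidden: boolean;   // hidden by the user via the column picker
  computed: boolean; // derived at render time, not stored in the DB
}

// One flag per clarification question, so the decision is visible in review.
interface ExportPolicy {
  includeHidden: boolean;
  includeComputed: boolean;
}

function exportableColumns(
  columns: ReportColumn[],
  policy: ExportPolicy
): ReportColumn[] {
  return columns.filter(
    (column) =>
      (policy.includeHidden || !column.hidden) &&
      (policy.includeComputed || !column.computed)
  );
}
```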
Phase 4 — Technical plan: /speckit.plan
What it does: Reads the constitution and spec, then produces a technical implementation plan — data model changes, API shape, component breakdown, dependency decisions.
Artifact: specs/{number}-{name}/plan.md (and optionally data-model.md, api.md, etc.)
    /speckit.plan
Plausible output excerpt:
    Reading: .specify/memory/constitution.md
    Reading: specs/042-csv-export/spec.md

    Plan: specs/042-csv-export/plan.md

    API layer
    ─────────
    GET /api/reports/:id/export?format=csv
    • Streams response using Node's Transform pipeline
    • Honors existing filter/sort query params
    • Sets Content-Disposition: attachment; filename="..."

    Frontend
    ────────
    • ExportButton component — single responsibility, no state
    • Triggers fetch with current filter params from ReportContext
    • Disables during in-flight request; re-enables on completion
Review this plan before proceeding. This is the last cheap moment to change the approach.
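To show what "stream, do not buffer" can mean in practice, here is a minimal sketch of row-at-a-time CSV encoding. It deliberately swaps Node's Transform stream for a plain callback sink so it runs without any Node APIs. The function names are hypothetical. The quoting follows RFC 4180.

```typescript
// Quote a field per RFC 4180: wrap it in double quotes when it contains a
// comma, a double quote, or a line break, and double any embedded quotes.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

function csvLine(fields: string[]): string {
  return fields.map(csvField).join(",") + "\r\n";
}

// Pull rows one at a time and push each encoded line to the sink, so only one
// row is held in memory at once. `nextRow` returns null when the data ends.
// Returns the number of data rows written (the header is not counted).
function writeCsv(
  header: string[],
  nextRow: () => string[] | null,
  sink: (chunk: string) => void
): number {
  sink(csvLine(header));
  let count = 0;
  for (let row = nextRow(); row !== null; row = nextRow()) {
    sink(csvLine(row));
    count++;
  }
  return count;
}

// Assumes an ASCII filename with no quotes or backslashes.
function contentDisposition(filename: string): string {
  return 'attachment; filename="' + filename + '"';
}
```

In a real Node handler the sink would be the HTTP response stream, and the code would also need to respect backpressure, which this sketch ignores.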
Phase 5 — Task breakdown: /speckit.tasks
What it does: Decomposes the plan into an ordered, atomic task list — discrete units of work the agent (or a human) can execute and check off.
Artifact: specs/{number}-{name}/tasks.md
    /speckit.tasks
Tasks are granular enough to be individually committed and independently verifiable. Example tasks: “Add GET /api/reports/:id/export route skeleton with auth middleware,” “Implement streaming CSV transform,” “Add ExportButton component,” “Write integration test: export 50k-row fixture, assert streaming.”
Phase 6 — Cross-check: /speckit.analyze
What it does: Reads the spec, plan, and tasks together and checks for consistency gaps — a task that has no corresponding spec requirement, a spec requirement with no covering task, or a plan decision that contradicts the constitution.
Artifact: Analysis output (inline; flagged items you address before implementing)
    /speckit.analyze
Running this before implementation is inexpensive insurance. A five-minute analysis that surfaces “the spec requires streaming for > 10k rows but no task covers the streaming transform” is worth considerably more than discovering the gap in code review.
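The command itself has the agent read prose, but the check it performs has a simple deterministic core. The sketch below illustrates that core under stated assumptions: requirements carry hypothetical IDs such as `AC-2`, and each task lists the requirement IDs it covers. Neither convention is taken from Spec Kit.

```typescript
interface SpecRequirement {
  id: string;
  text: string;
}

interface PlannedTask {
  id: string;
  title: string;
  covers: string[]; // requirement IDs this task claims to satisfy
}

interface CoverageReport {
  uncoveredRequirements: string[]; // requirements with no covering task
  orphanTasks: string[];           // tasks that map to no known requirement
}

function analyzeCoverage(
  requirements: SpecRequirement[],
  tasks: PlannedTask[]
): CoverageReport {
  const known: Record<string, boolean> = {};
  const covered: Record<string, boolean> = {};
  requirements.forEach((requirement) => {
    known[requirement.id] = true;
  });

  const orphanTasks: string[] = [];
  tasks.forEach((task) => {
    const valid = task.covers.filter((id) => known[id] === true);
    valid.forEach((id) => {
      covered[id] = true;
    });
    if (valid.length === 0) {
      orphanTasks.push(task.id);
    }
  });

  const uncoveredRequirements = requirements
    .filter((requirement) => covered[requirement.id] !== true)
    .map((requirement) => requirement.id);

  return { uncoveredRequirements, orphanTasks };
}
```

Run against the CSV export example, a streaming requirement with no matching task would appear in `uncoveredRequirements`, which is the gap described above.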
Phase 7 — Execute: /speckit.implement
What it does: Works through tasks.md in order, implementing each task against the plan. Because the agent reads the spec and plan at the start of each task, it has a stable target — it is not reconstructing requirements from conversation memory.
Artifact: Working implementation
    /speckit.implement
If you are running a multi-session project, this is where the investment pays back: open a new session, run /speckit.implement, point the agent at @specs/042-csv-export/, and it re-reads the spec and picks up from the next unchecked task. No context ramp-up. No re-explaining.
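"Picks up from the next unchecked task" is easy to verify yourself. The following sketch assumes tasks.md uses Markdown checkboxes (`- [ ]` and `- [x]`). That format is an assumption about the file, so adjust the pattern if your tasks.md differs. The function names are hypothetical.

```typescript
interface TaskLine {
  line: number;  // 1-based line number in tasks.md
  done: boolean;
  text: string;
}

// Collect every checkbox item, in file order.
function parseTasks(markdown: string): TaskLine[] {
  const tasks: TaskLine[] = [];
  markdown.split("\n").forEach((raw, index) => {
    const match = /^\s*[-*]\s+\[( |x|X)\]\s+(.*\S)\s*$/.exec(raw);
    if (match !== null) {
      tasks.push({ line: index + 1, done: match[1] !== " ", text: match[2] });
    }
  });
  return tasks;
}

// The first task that is not checked off, or null when everything is done.
function nextUnchecked(markdown: string): TaskLine | null {
  const remaining = parseTasks(markdown).filter((task) => !task.done);
  return remaining.length > 0 ? remaining[0] : null;
}
```

A helper like this is handy for a status line or a pre-session check, so you know where the agent should resume before you ask it to.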
The README also confirms two additional commands not in the core flow above:
- /speckit.checklist: generates a custom quality checklist for the feature (useful for pre-PR review gates)
- /speckit.taskstoissues: converts tasks.md into GitHub Issues, one issue per task
Use them when your workflow benefits. Neither is required for the core spec-driven loop.
Living with specs
The spec is not a planning document you throw away after kick-off. It is the durable source of truth for the life of the feature.
Commit it early. The spec, plan, and tasks belong in version control alongside the code. A PR that includes specs/042-csv-export/ before implementation starts gives reviewers a chance to push back on requirements and approach while they are still cheap to change, rather than on finished behavior.
The agent re-reads the spec each session. At the start of any new session working on this feature, reference the spec explicitly:
    @specs/042-csv-export/spec.md @specs/042-csv-export/plan.md
    Continue from the next unchecked task in tasks.md.
The agent now has the same baseline it had at the start of the previous session. This is what Lesson 2 calls persistent, agent-legible working memory: a file the agent re-reads every session beats re-explaining in every prompt.
When reality diverges, update the spec first. If a mid-implementation discovery changes the approach — the streaming library you planned on has a breaking API, the backend team changed the filter contract — update spec.md and plan.md before changing the code. The spec is the source of truth, not the chat scrollback. A spec that drifts from the code is a spec that stops being useful.
Use spec drift as a code review signal. If a PR changes behavior that the spec does not mention, that is either a spec that needs updating or a change that needs justification. Either way, the spec makes the gap visible.
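A rough version of this signal can be automated from the list of files a PR touches. The sketch assumes specs live under `specs/` and code under `src/`. Both prefixes and all names are illustrative, and the check will flag pure refactors too, so treat a hit as a prompt for a reviewer rather than a failure.

```typescript
interface DriftResult {
  codeChanged: boolean;
  specChanged: boolean;
  flag: boolean; // true when code changed and no spec file did
}

function checkSpecDrift(
  changedFiles: string[],
  specDir: string = "specs/",
  codeDir: string = "src/"
): DriftResult {
  const codeChanged = changedFiles.some((file) => file.indexOf(codeDir) === 0);
  const specChanged = changedFiles.some((file) => file.indexOf(specDir) === 0);
  return { codeChanged, specChanged, flag: codeChanged && !specChanged };
}
```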
When NOT to spec
Spec-driven development is an investment. The ceremony earns its cost when the feature is large enough that the discipline compounds across sessions and team members. It does not pay for itself on:
- Single-session fixes. “Change the button color to #3b82f6.” Run it, ship it, done. No spec needed.
- Throwaway prototypes. If the goal is to discover whether something is worth building, a spec is premature. Prototype first, spec the keeper.
- Isolated, well-understood changes. “Add a createdAt field to the User model.” If the whole change fits in one commit with no ambiguity, the spec costs more than it returns.
The decision heuristic: if “what are we building” will still be a live question in 48 hours, write a spec.
The spec-kit-llm-council extension
spec-kit-llm-council (MIT; v0.3.x) is an extension to Spec Kit built by the author of this course. It is early-stage and optional — core Spec Kit works without it.
What it adds: Three lifecycle gates that pause the workflow and convene a multi-model council — a panel of LLMs with different training data, different priors, and different failure modes — to review your spec, plan, and tasks before coding starts. The premise: one model misses things that a panel catches.
The three gates run after /speckit.specify, before /speckit.tasks, and before /speckit.implement. Verdicts are advisory only — nothing blocks the next Spec Kit command. The results are written to .specify/council/{feature}/{gate}-review.md as a durable audit trail.
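To make the audit-trail path and the advisory verdict concrete, here is a small hypothetical sketch. The path template is the one given above. The simple-majority rule, the `Vote` type, and both function names are assumptions for illustration, since the extension's actual aggregation logic is not described here.

```typescript
// Fill in the documented template: .specify/council/{feature}/{gate}-review.md
function reviewPath(feature: string, gate: string): string {
  return ".specify/council/" + feature + "/" + gate + "-review.md";
}

type Vote = "yes" | "no";

// Assumption: the verdict is "yes" only on a strict majority of yes votes.
// The result is advisory text; nothing here blocks the next command.
function summarizeVotes(votes: Record<string, Vote>): string {
  const participants = Object.keys(votes);
  const yesCount = participants.filter((name) => votes[name] === "yes").length;
  const verdict: Vote = yesCount * 2 > participants.length ? "yes" : "no";
  return verdict + " (" + yesCount + "/" + participants.length + ")";
}
```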
Install:
    uv tool install llm-council
    specify extension add llm-council
Dry-run before spending tokens:
    speckit.llm-council.dry-run
Example output:
    [council] dry-run for 042-csv-export — gate: plan (mode: plan)
    [council] participants: claude, codex, gemini, deepseek_v4_pro
    [council] estimated cost: $0.0017 (cap: $0.50) ✓ under cap
Recall the last verdict:
    speckit.llm-council.last
    [council] last verdict for 042-csv-export — yes (4/4)
    [council] Recorded: 2026-05-04T05:04:34Z
The council pattern, routing a decision through multiple models and synthesizing their verdicts, is covered in depth in the next lesson: Lesson 8: The Council Pattern.
When Things Go Wrong
Use the Symptom → Evidence → Request pattern: describe what you see, paste the error, then ask for a fix.
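If you want the pattern to be a habit rather than a reminder, a tiny template helper can enforce the three parts. This is an illustrative sketch. The `BugReport` shape and the labels in the output are not prescribed by the course.

```typescript
interface BugReport {
  symptom: string;  // what you see
  evidence: string; // the pasted error or log, verbatim
  request: string;  // the specific fix or investigation you want
}

function formatBugPrompt(report: BugReport): string {
  return [
    "Symptom: " + report.symptom.trim(),
    "Evidence:",
    report.evidence.trim(),
    "Request: " + report.request.trim(),
  ].join("\n");
}
```

Keeping the evidence on its own lines preserves multi-line stack traces exactly as they were pasted.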
A teammate asks you to add a small search filter to an existing page — it is a two-hour task you have done variations of before. Should you write a spec?
Which Spec Kit phase is specifically designed to surface underspecified requirements before any technical planning begins?
A new session opens on day three of implementing a multi-session feature. The agent has no memory of the previous two days. Where does the authoritative definition of what you are building live?
Key takeaways
- Specs beat conversations for multi-session work. Context windows expire; committed files do not. The spec is the agent-legible source of truth that outlives any single session.
- The Spec Kit workflow: constitution (project principles) → specify (what/why) → clarify (resolve ambiguities) → plan (technical approach) → tasks (atomic work units) → analyze (cross-check coverage) → implement (execute against stable target).
- Commit the artifacts. spec.md, plan.md, and tasks.md belong in version control. A PR that includes the spec before the code gives reviewers the cheapest possible opportunity to push back.
- When reality diverges, update the spec first. The spec is the source of truth, not the chat. Drift between spec and code is a signal, not a feature.
- Skip the ceremony for small work. If the task fits in one session with no ambiguity, prompt directly. Reserve spec-driven development for features where “what are we building” will still be a live question in 48 hours.
Next up: Lesson 8: The Council Pattern — routing decisions through a panel of models and synthesizing verdicts before committing to a path.