Specification-driven engineering context

A governed sandbox for AI coding agents.

Spellbook turns requirements, architecture, tests, and local conventions into enforceable context for coding agents.

View auth-v0 demo

agent-workspace / review-ready

status: review ready
tests: 18/18 passed
requirements: 7/8
shortcuts: 0

01 Agents are fast. They can edit, refactor, test, and ship faster than teams can review.

02 Codebases are fragile. Architecture, security, and domain rules are easy to violate.

03 Prompts do not govern systems. Important constraints disappear inside chat history and tribal knowledge.

04 Context must be executable. Requirements, tests, and conventions need to become gates, not suggestions.

Without executable specifications, AI-generated code drifts from product intent, architecture boundaries, security rules, and team conventions.
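
To make "gates, not suggestions" concrete, here is a minimal sketch of two rules written as functions that either pass or block a run. Every name in it (`RunSummary`, `Gate`, `evaluateGates`, `blocked`) is a hypothetical illustration, not Spellbook's actual API.

```typescript
// Hypothetical shapes for illustration only; not taken from Spellbook.
type GateResult = { gate: string; passed: boolean; detail: string };

interface RunSummary {
  testsPassed: number;
  testsFailed: number;
  changedFiles: string[];
}

type Gate = (run: RunSummary) => GateResult;

// A requirement expressed as a gate: the test suite ran and nothing failed.
const allTestsPass: Gate = (run) => ({
  gate: "all-tests-pass",
  passed: run.testsFailed === 0 && run.testsPassed > 0,
  detail: `${run.testsPassed} passed, ${run.testsFailed} failed`,
});

// A convention expressed as a gate: a change set must touch at least one spec file.
const changesIncludeTests: Gate = (run) => {
  const hasSpec = run.changedFiles.some((file) => file.endsWith(".spec.ts"));
  return {
    gate: "changes-include-tests",
    passed: hasSpec,
    detail: hasSpec ? "spec file present" : "no *.spec.ts file in the change set",
  };
};

// Run every gate and collect the results.
function evaluateGates(run: RunSummary, gates: Gate[]): GateResult[] {
  return gates.map((gate) => gate(run));
}

// A run is blocked as soon as any single gate fails.
function blocked(results: GateResult[]): boolean {
  return results.some((result) => !result.passed);
}
```

The point of the shape is that a failed rule produces a structured result a pipeline can act on, rather than a sentence an agent may or may not remember.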

From prompt to proof.

Spellbook turns an agent run into a governed task record: spec, plan, code, tests, and evidence stay connected.

Spec

requirement pack:
AUTH-FUNC-002
AUTH-SEC-004

- architecture
- local conventions
- context
- session boundary

Plan

1. load invariants
2. inspect routes
3. update service
4. add tests
5. produce report

Code

changed files:
auth.routes.ts
session.store.ts
login.spec.ts
me.spec.ts

Tests
```
npm test

18 passed
0 failed
0 skipped
```

Evidence

review_ready
requirements 7/8
forbidden shortcuts 0
human review required
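
One way to picture the task record above is as a single typed object that holds the spec, code, test, and evidence fields together. The sketch below is an assumption about shape, not Spellbook's real schema: the `TaskRecord` interface, the `summarize` function, and the rules it uses to derive `review_ready` and the human-review flag are all hypothetical.

```typescript
// Hypothetical record shape; field names are illustrative assumptions.
interface TaskRecord {
  requirementPack: string[];
  changedFiles: string[];
  tests: { passed: number; failed: number; skipped: number };
  requirements: { satisfied: number; total: number };
  forbiddenShortcuts: number;
}

type Status = "blocked" | "review_ready";

// Assumed rule: a run is review-ready when tests are clean and no forbidden
// shortcuts were detected; a human must review whenever a requirement is
// still unsatisfied. The real derivation may differ.
function summarize(record: TaskRecord): { status: Status; humanReviewRequired: boolean } {
  const testsClean = record.tests.failed === 0 && record.tests.skipped === 0;
  const noShortcuts = record.forbiddenShortcuts === 0;
  const status: Status = testsClean && noShortcuts ? "review_ready" : "blocked";
  const humanReviewRequired = record.requirements.satisfied < record.requirements.total;
  return { status, humanReviewRequired };
}

// The values shown in the example above, collected into one record.
const authTask: TaskRecord = {
  requirementPack: ["AUTH-FUNC-002", "AUTH-SEC-004"],
  changedFiles: ["auth.routes.ts", "session.store.ts", "login.spec.ts", "me.spec.ts"],
  tests: { passed: 18, failed: 0, skipped: 0 },
  requirements: { satisfied: 7, total: 8 },
  forbiddenShortcuts: 0,
};

const authSummary = summarize(authTask);
```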

Built for codebases where AI mistakes are expensive.

Spellbook is for teams adopting coding agents in systems where architecture, security, compliance, and correctness cannot be left to prompt memory.

Staff Engineers

Keep agents aligned with architecture, invariants, and repo conventions.

Engineering Leaders

Adopt AI codegen with governance, auditability, and measurable quality signals.

Platform Teams

Standardize agent workflows across repos, stacks, and delivery gates.

Indie Builders

Use AI speed without losing control of product intent and code structure.

Regulated Teams

Preserve evidence for requirements, tests, approvals, and releases.

From intent to verified change.

The governed loop keeps agent work connected to executable requirements, architecture constraints, delivery gates, and review evidence.

Init
Create the workspace contract, repo boundary, and allowed agent tools.
Specify
Capture product intent, domain truth, requirements, and acceptance checks.
Architecture
Map components, ownership, integration patterns, and system constraints.
Local Conventions
Load repo-specific rules for naming, errors, tests, logging, and layout.
Plan
Generate an implementation plan before code changes begin.
Build
Run the agent in an isolated workspace with controlled permissions.
Test
Execute required checks and attach results to the task record.
Review
Compare the diff against requirements, architecture, and conventions.
Ship
Promote only changes that satisfy delivery gates.
Verify
Run post-merge or environment-level validation.
Monitor
Track runtime behavior, failures, latency, cost, and quality signals.
Learn
Feed discoveries back into specs, requirements, and future plans.
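
The twelve stages above can be read as an ordered pipeline in which a failing stage stops everything after it. The following is a rough sketch under that assumption; the `STAGES` identifiers, the `runLoop` function, and the choice to treat a stage without a handler as passing are illustrative simplifications, not Spellbook's implementation.

```typescript
// Stage names mirror the list above; the identifiers themselves are assumptions.
const STAGES = [
  "init", "specify", "architecture", "local-conventions",
  "plan", "build", "test", "review",
  "ship", "verify", "monitor", "learn",
] as const;

type Stage = (typeof STAGES)[number];

// A handler reports whether its stage succeeded.
type StageHandler = () => boolean;

interface LoopResult {
  completed: Stage[];
  stoppedAt: Stage | null;
}

// Walk the stages in order and stop at the first one that fails, so a change
// can never reach "ship" after a failed "test" or "review".
// Simplification: a stage with no registered handler counts as passing.
function runLoop(handlers: Partial<Record<Stage, StageHandler>>): LoopResult {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    const handler = handlers[stage];
    const ok = handler ? handler() : true;
    if (!ok) {
      return { completed, stoppedAt: stage };
    }
    completed.push(stage);
  }
  return { completed, stoppedAt: null };
}
```

For example, `runLoop({ test: () => false })` completes the six stages before `test` and reports `stoppedAt: "test"`, so `ship` never runs.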

See it on a real boundary: auth-v0.

A small authentication system is enough to show why prompt-first codegen breaks down.

Vague Prompt

Build login.

Spellbook Spec

Intent:
Users can register and log in.

Domain:
User, Session

States:
User: Active, Disabled
Session: Active, Revoked, Expired

Invariants:
DisabledUserCannotLogin
SessionHasExpiry
PasswordHashNeverReturned
OnlyActiveSessionMayAuthorizeProtectedRoute
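
As a hedged illustration of how the spec above could become checkable code, the sketch below encodes the two state sets as types and each invariant as a small function. The field names (`email`, `passwordHash`, `expiresAt`), the `ttlMs` parameter, and the session id format are assumptions added for the example; they are not part of auth-v0.

```typescript
// States taken from the spec; all other fields are hypothetical.
type UserState = "Active" | "Disabled";
type SessionState = "Active" | "Revoked" | "Expired";

interface User {
  id: string;
  email: string;
  passwordHash: string;
  state: UserState;
}

interface Session {
  id: string;
  userId: string;
  state: SessionState;
  expiresAt: Date; // SessionHasExpiry: the field is required, never optional
}

// DisabledUserCannotLogin
function canLogin(user: User): boolean {
  return user.state === "Active";
}

// SessionHasExpiry: every session is created with an expiry timestamp.
function createSession(user: User, now: Date, ttlMs: number): Session {
  if (!canLogin(user)) {
    throw new Error("DisabledUserCannotLogin");
  }
  return {
    id: `s-${user.id}-${now.getTime()}`,
    userId: user.id,
    state: "Active",
    expiresAt: new Date(now.getTime() + ttlMs),
  };
}

// PasswordHashNeverReturned: responses use a type that has no hash field.
type PublicUser = Omit<User, "passwordHash">;

function toPublicUser(user: User): PublicUser {
  return { id: user.id, email: user.email, state: user.state };
}

// OnlyActiveSessionMayAuthorizeProtectedRoute
function mayAuthorize(session: Session, now: Date): boolean {
  return session.state === "Active" && session.expiresAt.getTime() > now.getTime();
}
```

Encoding the invariants in types and functions means a violation shows up as a compile error or a failing test, not as a review comment.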

Output

Generated:
POST /register
POST /login
POST /logout
GET  /me

Evidence:
8 tests passed
4 requirements satisfied
0 forbidden shortcuts detected
1 review note created

GitHub auth-v0

Keep reading off the landing page.

The manifesto and docs are now standalone offline pages that share the landing page's brutalist design system and local theme state.

Manifesto

Why agentic codegen needs governed execution, not better prompt memory.

Read manifesto

Docs

Preview task records, packs, gates, commands, and evidence surfaces.

Read docs

Project Seed

Turn a rough software idea into structured project truth before agent execution begins.

Create seed