Specification-driven engineering context

A governed sandbox for AI coding agents.

Spellbook turns requirements, architecture, tests, and local conventions into enforceable context for coding agents.

The problem

01 Agents are fast. They can edit, refactor, test, and ship faster than teams can review.
02 Codebases are fragile. Architecture, security, and domain rules are easy to violate.
03 Prompts do not govern systems. Important constraints disappear inside chat history and tribal knowledge.
04 Context must be executable. Requirements, tests, and conventions need to become gates, not suggestions.

Without executable specifications, AI-generated code drifts from product intent, architecture boundaries, security rules, and team conventions.

From prompt to proof.

Spellbook turns an agent run into a governed task record: spec, plan, code, tests, and evidence stay connected.

  • Spec
    requirement pack:
    AUTH-FUNC-002
    AUTH-SEC-004
    
    - architecture
    - local conventions
    - context
    - session boundary
  • Plan
    1. load invariants
    2. inspect routes
    3. update service
    4. add tests
    5. produce report
  • Code
    changed files:
    auth.routes.ts
    session.store.ts
    login.spec.ts
    me.spec.ts
  • Tests
    npm test
    
    18 passed
    0 failed
    0 skipped
  • Evidence
    review_ready
    requirements 7/8
    forbidden shortcuts 0
    human review required
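
One way to picture the task record above is as a single typed object that keeps spec, plan, code, tests, and evidence together. The sketch below is illustrative only: the `TaskRecord` shape, its field names, and the `summarize` rule are assumptions made for this example, not Spellbook's actual schema or API. The values are copied from the example record above.

```typescript
// Hypothetical task record shape; all names here are illustrative assumptions.
type TaskStatus = "review_ready" | "blocked";

interface TaskRecord {
  spec: { requirementIds: string[]; context: string[] };
  plan: string[];
  changedFiles: string[];
  tests: { command: string; passed: number; failed: number; skipped: number };
  evidence: {
    requirementsSatisfied: number;
    requirementsTotal: number;
    forbiddenShortcuts: number;
  };
}

// Assumed rule: failing tests or forbidden shortcuts block the task;
// otherwise it is review-ready, and a human must review it whenever
// some requirement is still unsatisfied.
function summarize(record: TaskRecord): { status: TaskStatus; humanReviewRequired: boolean } {
  const { tests, evidence } = record;
  if (tests.failed > 0 || evidence.forbiddenShortcuts > 0) {
    return { status: "blocked", humanReviewRequired: true };
  }
  const allSatisfied = evidence.requirementsSatisfied === evidence.requirementsTotal;
  return { status: "review_ready", humanReviewRequired: !allSatisfied };
}

// The example record from the section above, expressed as data.
const exampleRecord: TaskRecord = {
  spec: {
    requirementIds: ["AUTH-FUNC-002", "AUTH-SEC-004"],
    context: ["architecture", "local conventions", "context", "session boundary"],
  },
  plan: ["load invariants", "inspect routes", "update service", "add tests", "produce report"],
  changedFiles: ["auth.routes.ts", "session.store.ts", "login.spec.ts", "me.spec.ts"],
  tests: { command: "npm test", passed: 18, failed: 0, skipped: 0 },
  evidence: { requirementsSatisfied: 7, requirementsTotal: 8, forbiddenShortcuts: 0 },
};

const summary = summarize(exampleRecord);
console.log(summary.status, summary.humanReviewRequired);
```

Under this assumed rule, the example resolves to review_ready with human review required, because 7 of 8 requirements are satisfied while no test failed and no forbidden shortcut was detected.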

Built for codebases where AI mistakes are expensive.

Spellbook is for teams adopting coding agents in systems where architecture, security, compliance, and correctness cannot be left to prompt memory.

Staff Engineers

Keep agents aligned with architecture, invariants, and repo conventions.

Engineering Leaders

Adopt AI codegen with governance, auditability, and measurable quality signals.

Platform Teams

Standardize agent workflows across repos, stacks, and delivery gates.

Indie Builders

Use AI speed without losing control of product intent and code structure.

Regulated Teams

Preserve evidence for requirements, tests, approvals, and releases.

From intent to verified change.

The governed loop keeps agent work connected to executable requirements, architecture constraints, delivery gates, and review evidence.

  • Init

    Create the workspace contract, repo boundary, and allowed agent tools.

  • Specify

    Capture product intent, domain truth, requirements, and acceptance checks.

  • Architecture

    Map components, ownership, integration patterns, and system constraints.

  • Local Conventions

    Load repo-specific rules for naming, errors, tests, logging, and layout.

  • Plan

    Generate an implementation plan before code changes begin.

  • Build

    Run the agent in an isolated workspace with controlled permissions.

  • Test

    Execute required checks and attach results to the task record.

  • Review

    Compare the diff against requirements, architecture, and conventions.

  • Ship

    Promote only changes that satisfy delivery gates.

  • Verify

    Run post-merge or environment-level validation.

  • Monitor

    Track runtime behavior, failures, latency, cost, and quality signals.

  • Learn

    Feed discoveries back into specs, requirements, and future plans.
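
The loop above can be read as an ordered pipeline in which a failed stage stops everything after it. The following is a minimal sketch under that assumption. The stage names come from the list above, but `runLoop`, `StageResult`, and the handler map are hypothetical names invented for illustration. Treating a stage with no handler as passing is a simplification of this sketch, not a claim about how Spellbook behaves.

```typescript
// Stage names taken from the governed loop above, in order.
const STAGES = [
  "init", "specify", "architecture", "local-conventions", "plan", "build",
  "test", "review", "ship", "verify", "monitor", "learn",
] as const;

type Stage = (typeof STAGES)[number];

// Hypothetical result type: each stage reports pass or fail plus an optional note.
type StageResult = { ok: boolean; note?: string };
type StageHandlers = Partial<Record<Stage, () => StageResult>>;

// Runs stages in order and stops at the first one that fails,
// so later stages such as "ship" never run after a failed gate.
function runLoop(handlers: StageHandlers): { completed: Stage[]; stoppedAt?: Stage } {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    const handler = handlers[stage];
    // Simplification: a stage without a handler is treated as passing.
    const result: StageResult = handler ? handler() : { ok: true };
    if (!result.ok) {
      return { completed, stoppedAt: stage };
    }
    completed.push(stage);
  }
  return { completed };
}

// Usage: a failing test stage prevents review and ship from running.
const outcome = runLoop({
  test: () => ({ ok: false, note: "1 required check failed" }),
});
console.log(outcome.stoppedAt, outcome.completed);
```

The design point is ordering: because Ship comes after Test and Review, a change cannot be promoted unless every earlier gate has passed.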

See it on a real boundary: auth-v0.

A small authentication system is enough to show why prompt-first codegen breaks down.

Vague Prompt

Build login.

Spellbook Spec

Intent:
Users can register and log in.

Domain:
User, Session

States:
User: Active, Disabled
Session: Active, Revoked, Expired

Invariants:
DisabledUserCannotLogin
SessionHasExpiry
PasswordHashNeverReturned
OnlyActiveSessionMayAuthorizeProtectedRoute
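
To make the spec concrete, the states and invariants above could be expressed as types and small checks. This is a hedged sketch, not generated Spellbook output: the state and invariant names come from the spec, while the `User` and `Session` field names and the helpers `canLogin`, `toPublicUser`, and `mayAuthorize` are assumptions introduced for illustration.

```typescript
type UserState = "Active" | "Disabled";
type SessionState = "Active" | "Revoked" | "Expired";

// Field names are illustrative assumptions.
interface User {
  id: string;
  state: UserState;
  passwordHash: string;
}

// SessionHasExpiry: expiresAt is a required field, so the type system
// rejects any session constructed without an expiry.
interface Session {
  userId: string;
  state: SessionState;
  expiresAt: Date;
}

// DisabledUserCannotLogin
function canLogin(user: User): boolean {
  return user.state === "Active";
}

// PasswordHashNeverReturned: responses are built from an explicit
// allow-list of fields, so the hash cannot leak by accident.
function toPublicUser(user: User): { id: string; state: UserState } {
  return { id: user.id, state: user.state };
}

// OnlyActiveSessionMayAuthorizeProtectedRoute: the session must be marked
// Active and must not have passed its expiry time.
function mayAuthorize(session: Session, now: Date): boolean {
  return session.state === "Active" && session.expiresAt.getTime() > now.getTime();
}

// Usage with hypothetical data.
const alice: User = { id: "u1", state: "Disabled", passwordHash: "example-hash" };
const revoked: Session = {
  userId: "u1",
  state: "Revoked",
  expiresAt: new Date("2030-01-01T00:00:00Z"),
};
console.log(canLogin(alice), mayAuthorize(revoked, new Date("2025-01-01T00:00:00Z")));
```

Each invariant becomes either a type constraint or a function that a test can call directly, which is what lets it act as a gate instead of a suggestion.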

Output

Generated:
POST /register
POST /login
POST /logout
GET  /me

Evidence:
8 tests passed
4 requirements satisfied
0 forbidden shortcuts detected
1 review note created

Keep reading off the landing page.

The manifesto and docs are now real offline pages that share the same brutalist design system and local theme state.

Manifesto

Why agentic codegen needs governed execution, not better prompt memory.

Docs

Preview task records, packs, gates, commands, and evidence surfaces.

Project Seed

Turn a rough software idea into structured project truth before agent execution begins.