ADR-013: Domain Skills for Agent Development

Status

Superseded by ADR-016 (2026-05-03). The mg-* domain skills were retired; their content was extracted into docs/engineering/domains/<name>.md pages within the consolidated vault. This ADR is preserved for historical context: it documents the prior model and the reasoning that led to it.

Context

Agents working in the codebase rely on explore agents to understand domain structure before making changes. This is slow (~3-5 minutes per task) and error-prone — explore agents miss files, misunderstand patterns, and produce incomplete maps. We evaluated several approaches to give agents better context:

MCP docs server — a running process that serves documentation to agents via the MCP protocol
A new MkDocs site (docs-api/) — a third documentation site alongside sites/staff/ and sites/public/
Domain skills (.claude/skills/mg-*/SKILL.md) — markdown files in the existing skill system that agents read before working in a domain

We also considered whether external framework documentation (Stripe, Prisma, Supabase, etc.) needed a docs server or could be handled differently.

Decision

Domain skills over a docs site or MCP server. Each domain gets a skill file (mg-auth, mg-communications, etc.) containing file maps, data models, operations with side effects, patterns, testing guidance, and gotchas. Skills are a familiar shape — agents already know how to use them — and require no infrastructure.

External docs via llms.txt references in CLAUDE.md rather than an MCP server. Stripe, Prisma, Next.js, Supabase, Resend, Twilio, and Vercel all publish llms.txt indexes. Agents fetch the index, find the relevant page, and fetch that page. This is equivalent to a docs MCP server but with zero infrastructure.
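
The fetch-index, find-page, fetch-page flow can be sketched as follows. This is a minimal illustration, not the agents' actual behavior: it assumes the index lists pages as markdown links, and the sample index, URLs, and function names are hypothetical. The network fetch is left out so the sketch stays self-contained.

```python
import re

# Matches markdown links of the form [title](url), the entry shape assumed for an llms.txt index.
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def parse_llms_index(index_text):
    """Return (title, url) pairs for every markdown link in an llms.txt index."""
    return LINK_RE.findall(index_text)


def find_pages(index_text, keyword):
    """Return URLs whose link title or URL mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        url
        for title, url in parse_llms_index(index_text)
        if needle in title.lower() or needle in url.lower()
    ]


# Hypothetical index content; a real index would be fetched from the vendor's llms.txt URL.
SAMPLE_INDEX = """# Example Docs

## Guides

- [Webhooks](https://docs.example.com/webhooks.md): Receiving events
- [Refunds](https://docs.example.com/refunds.md): Issuing refunds
"""

print(find_pages(SAMPLE_INDEX, "webhook"))
```

The second fetch then targets only the matching URL, which is why no server-side retrieval layer is needed.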

App-wide conventions in CLAUDE.md, not repeated per skill. Route conventions (withErrorHandling, the Sentry distinction between throw AppError and return createErrorResponse, auth patterns) belong in CLAUDE.md. Domain skills contain only domain-specific information.

Alternatives rejected

MCP docs server: Adds a running process to serve what file reads already do. The codebase isn't large enough for the retrieval benefits to outweigh the operational cost.
MkDocs site: Adds a build step, deployment, and maintenance surface for content that only agents consume. Skills are simpler and live closer to the code.
Auto-generated docs from TypeScript types / OpenAPI spec: The existing openapi.json was orphaned and stale. Generated docs go stale unless the generation tooling is actively maintained, which it wasn't.

Skill structure

Each skill covers both API and web app code for a domain:

Data model (tables, what lives where)
File map (routes, services, lib, hooks, components, tests)
Operations and side effects
External API surface we actually use (e.g., Supabase generateLink types)
Patterns (canonical example files, known inconsistencies)
Testing patterns and key test files
Gotchas
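
One way to keep skills aligned with this structure is a small completeness check. The sketch below is an assumption-laden illustration: the heading names are hypothetical stand-ins for the seven parts listed above, since this ADR does not specify the exact headings a SKILL.md uses.

```python
import re

# Hypothetical heading names mirroring the seven parts listed above;
# the actual SKILL.md headings are not specified in this ADR.
REQUIRED_SECTIONS = [
    "Data model",
    "File map",
    "Operations and side effects",
    "External API surface",
    "Patterns",
    "Testing",
    "Gotchas",
]


def missing_sections(skill_text):
    """Return required section names with no matching markdown heading."""
    headings = [
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}[ \t]+(.+)$", skill_text, flags=re.MULTILINE)
    ]
    return [
        name
        for name in REQUIRED_SECTIONS
        if not any(h.startswith(name.lower()) for h in headings)
    ]


# A draft skill that only has three of the seven sections so far.
draft = "# mg-auth\n\n## Data model\n\n## File map\n\n## Gotchas\n"
print(missing_sections(draft))
```

Matching on heading prefixes is a deliberate simplification, so a heading such as "Testing patterns and key test files" still satisfies the "Testing" entry.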

Validation methodology

Skills are validated by testing agents on realistic tasks and comparing against a control (no skills). The process is documented in the document-domain-skill skill:

Explore the domain (explore agent)
Write the skill
Validate every claim against source code (direct reads, no explore agents)
Test with an agent on a realistic task — measure tool calls and gaps
Run a control without skills — compare
Fix gaps and iterate
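
Part of the "validate every claim" step, checking that files named in a skill still exist, could plausibly be automated. The sketch below is not part of the documented process; it assumes skills cite files as backtick-quoted relative paths, and the file names in the demo are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumption: skills cite files as backtick-quoted relative paths with an extension.
PATH_RE = re.compile(r"`([\w./\[\]-]+\.\w+)`")


def missing_paths(skill_text, repo_root):
    """Return paths cited in the skill that do not exist under repo_root."""
    root = Path(repo_root)
    cited = sorted(set(PATH_RE.findall(skill_text)))
    return [p for p in cited if not (root / p).exists()]


# Self-contained demo against a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "lib").mkdir()
    (Path(tmp) / "lib" / "session.ts").write_text("// placeholder\n")
    skill = "Sessions live in `lib/session.ts`; tokens in `lib/tokens.ts`."
    stale = missing_paths(skill, tmp)

print(stale)
```

A check like this only catches renamed or deleted files. Claims about behavior still need the direct source reads described in the steps above.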

Proof-of-concept results

Tested with two tasks in the auth domain ("resend email verification link" and "impersonation timeout warnings"):

Metric	No Skills	With Skills
Research tool calls	38	12
Plan covers frontend + backend	Yes	Yes (after adding hooks/components to skill)
Correctness	Correct	Correct
Skill effectiveness	—	75-80% of needed info already in skill
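
For reference, the reduction implied by the tool-call row can be worked out directly. The helper name below is ours, introduced only for illustration.

```python
def percent_reduction(baseline, observed):
    """Percentage drop from baseline to observed."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - observed) / baseline * 100


# Figures from the table above: 38 research tool calls without skills, 12 with.
print(f"{percent_reduction(38, 12):.1f}%")
```

This prints 68.4%, which sits near the top of the 40-70% range reported in the findings.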

Key findings:

Skills reduce research tool calls by 40-70% depending on how well the skill covers the task's surface area
Skills that only cover backend produce agents that skip frontend — both API and web app must be documented
External API details agents repeatedly stumble on (e.g., Supabase generateLink types) should be documented in the skill, not deferred to external docs
App-wide patterns (error handling, auth, response helpers) belong in CLAUDE.md to avoid duplication across skills
Pattern pointers should reference specific files, not prescriptive code skeletons, unless patterns are validated and standardized

Consequences

Benefits:

Agents orient in documented domains with 40-70% fewer research tool calls
Fewer wrong-path explorations and missed files
Patterns section prevents agents from copying inconsistent code
Skills serve as refactoring references — they document what should be, not just what is
No infrastructure to maintain — skills are just markdown files

Tradeoffs:

Skills go stale when code changes. Keeping them current requires the discipline to update the skill whenever a domain is modified.
Writing and validating a skill takes significant upfront effort (~1-2 hours per domain including testing)
Skills create mild tunnel vision — agents trust the skill's scope and may not look beyond it. Comprehensive file maps mitigate this.

Risks:

Stale skills could be worse than no skills if they point to renamed/deleted files or describe incorrect behavior
The mg-* naming convention assumes we won't need skills for non-Metrognome concerns (mitigated: other skills use plain names)