
Evolution of Multi-Agent Patterns

Claude Code’s multi-agent system didn’t arrive fully formed. It evolved through distinct stages, each solving a specific limitation of the previous approach. Understanding this progression reveals the design pressures that shaped the current architecture.

graph LR
S1[Stage 1<br/>Single Agent] --> S2[Stage 2<br/>Named Sub-agents]
S2 --> S3[Stage 3<br/>Fork Mechanism]
S3 --> S4[Stage 4<br/>Coordinator Pattern]
S4 --> S5[Stage 5<br/>Team Swarm]
style S1 fill:#e1f5fe
style S2 fill:#b3e5fc
style S3 fill:#81d4fa
style S4 fill:#4fc3f7
style S5 fill:#29b6f6

Stage 1, the single agent, is the foundation. One Claude instance handles everything: reading code, planning, editing, running tests, and communicating with the user. The agentic loop (src/query.ts) drives all behavior.

What works: Simple, predictable, complete context about the task.

What breaks: Long-running tasks fill the context window. Switching between research and implementation wastes tokens. No parallelism — the user waits while the agent searches for one file at a time.

Stage 2, named sub-agents, is the first multi-agent step. The parent agent can spawn specialized sub-agents via the Agent tool, each with a defined role:

// Six built-in agents with distinct roles
const agents = [
  GENERAL_PURPOSE_AGENT,    // All tools, default model
  EXPLORE_AGENT,            // Read-only, fast model (haiku), omitClaudeMd
  PLAN_AGENT,               // Read-only, inherit model, architecture focus
  VERIFICATION_AGENT,       // Adversarial testing, background execution
  CLAUDE_CODE_GUIDE_AGENT,  // Documentation lookup, haiku model
  STATUSLINE_SETUP_AGENT,   // Status line config, sonnet model
]

Key design decisions:

  1. Model stratification. Not every agent needs the most capable model. Explore uses haiku for speed; Plan uses inherit for reasoning depth; Statusline uses sonnet for a mid-tier balance.

  2. Tool restriction. Read-only agents (Explore, Plan) achieve isolation through disallowedTools — no Edit, Write, or Agent tool — rather than a separate “read-only mode” flag.

  3. Token optimization. Setting omitClaudeMd: true on read-only agents keeps the CLAUDE.md hierarchy out of their context, which saves tokens on every spawn. At scale (34M+ Explore spawns), the savings are significant.

  4. The priority override system. Custom agents from .claude/agents/ can override built-in agents, allowing project-specific customization:

built-in → plugin → user → project → flag → managed (policy)
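Two of the decisions above, tool restriction and the priority override, can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the `AgentDefinition` shape, the `resolveAgents` and `allowedTools` helpers, and the `Read`, `Grep`, and `Bash` tool names are hypothetical, and the sketch assumes the chain above is ordered from lowest to highest priority.

```typescript
// Hypothetical types and helper names. Only the source order and the
// disallowedTools idea come from the text above.
type AgentSource = 'built-in' | 'plugin' | 'user' | 'project' | 'flag' | 'managed';

interface AgentDefinition {
  agentType: string;
  source: AgentSource;
  disallowedTools?: string[];
}

// Assumed to run from lowest to highest priority, matching the chain above.
const SOURCE_PRIORITY: AgentSource[] = [
  'built-in', 'plugin', 'user', 'project', 'flag', 'managed',
];

// Keep one definition per agentType: the one from the highest-priority source.
function resolveAgents(definitions: AgentDefinition[]): Map<string, AgentDefinition> {
  const resolved = new Map<string, AgentDefinition>();
  for (const def of definitions) {
    const current = resolved.get(def.agentType);
    if (
      current === undefined ||
      SOURCE_PRIORITY.indexOf(def.source) >= SOURCE_PRIORITY.indexOf(current.source)
    ) {
      resolved.set(def.agentType, def);
    }
  }
  return resolved;
}

// Tool restriction as a deny-list filter rather than a "read-only mode" flag.
function allowedTools(allTools: string[], agent: AgentDefinition): string[] {
  const denied = new Set(agent.disallowedTools ?? []);
  return allTools.filter((tool) => !denied.has(tool));
}

const resolved = resolveAgents([
  { agentType: 'Explore', source: 'built-in', disallowedTools: ['Edit', 'Write', 'Agent'] },
  { agentType: 'Explore', source: 'project', disallowedTools: ['Edit', 'Write', 'Agent', 'Bash'] },
  { agentType: 'Plan', source: 'built-in', disallowedTools: ['Edit', 'Write', 'Agent'] },
]);

const explore = resolved.get('Explore')!;
const exploreTools = allowedTools(['Read', 'Grep', 'Bash', 'Edit', 'Write', 'Agent'], explore);

console.log(explore.source); // project
console.log(exploreTools);   // [ 'Read', 'Grep' ]
```

In this sketch the project-level Explore definition replaces the built-in one wholesale, so its stricter deny list is the only one consulted.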

What breaks: Each sub-agent starts with a blank slate. Research findings from an Explore agent are summarized back to the parent, which must re-explain them to an implementation agent. Context is lost in translation.

Stage 3, the fork mechanism, solves the context loss problem by giving children the parent’s full conversation history:

// Fork: child inherits everything
const FORK_AGENT = {
  tools: ['*'],               // Same tools as parent
  model: 'inherit',           // Same model as parent
  permissionMode: 'bubble',   // Permissions surface to parent
  getSystemPrompt: () => '',  // Parent's rendered prompt is threaded directly
}

Critical innovation: Prompt cache sharing across forks. All fork children from the same parent turn share an identical API request prefix — only the per-child directive differs:

[shared history + identical placeholder results... | per-child directive]
↑ only this varies
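To make the shared-prefix idea concrete, here is a minimal sketch of how fork requests could be assembled so that every child differs only in its final message. The `Message` shape and the `buildForkRequests` and `sharedPrefixLength` helpers are hypothetical, and the sketch assumes the per-child directive is appended as the last message; real API requests carry more structure.

```typescript
// Hypothetical message shape, simplified for illustration.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

// Build one request per fork child: shared history first, directive last.
function buildForkRequests(sharedHistory: Message[], directives: string[]): Message[][] {
  return directives.map((directive) => [
    ...sharedHistory,
    { role: 'user' as const, content: directive },
  ]);
}

// Count how many leading messages two requests have in common.
function sharedPrefixLength(a: Message[], b: Message[]): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i]?.role === b[i]?.role && a[i]?.content === b[i]?.content) {
    i++;
  }
  return i;
}

const history: Message[] = [
  { role: 'user', content: 'Refactor the auth module.' },
  { role: 'assistant', content: 'I will split this into two parallel tasks.' },
  { role: 'user', content: '[placeholder tool result]' },
];

const requests = buildForkRequests(history, [
  'Update the login handler.',
  'Update the session tests.',
]);
const childA = requests[0] ?? [];
const childB = requests[1] ?? [];

console.log(sharedPrefixLength(childA, childB)); // 3
console.log(childA.length);                      // 4
```

Because the first three messages are identical across children, a prefix-based cache only has to process the one differing message per child.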

What breaks: Forks are still ephemeral. They execute one task and report back. There’s no persistent team structure, no ongoing collaboration between agents.

Stage 4, the coordinator pattern, fundamentally restructures the interaction model. Instead of a powerful single agent that occasionally delegates, the main instance becomes an orchestrator that delegates all substantive work:

graph TD
subgraph "Stage 2: Parent dispatches"
P1[Parent Agent] -->|occasional delegation| S1[Sub-agent]
P1 -->|does most work itself| P1
end
subgraph "Stage 4: Coordinator orchestrates"
C[Coordinator] -->|all work delegated| W1[Worker 1]
C -->|all work delegated| W2[Worker 2]
C -->|all work delegated| W3[Worker 3]
C -.->|synthesize & direct| C
end

Key evolution: The coordinator’s primary job is synthesis, not execution. It reads worker findings, understands the problem, and writes precise implementation specs. The prompt explicitly forbids lazy delegation (“Based on your findings, fix it”).

Exclusive with forks. Coordinator mode and fork mode are mutually exclusive (isForkSubagentEnabled returns false in coordinator mode). They represent different orchestration philosophies:

  • Fork: “Clone myself with full context, let the clone handle a piece”
  • Coordinator: “I direct specialists. I understand everything. They execute.”

Stage 5, the team swarm, extends the coordinator concept with persistence and multiple execution backends:

| Feature | Coordinator | Team Swarm |
| --- | --- | --- |
| Members | Ephemeral workers | Persistent teammates |
| Backends | Same process | InProcess / Tmux / iTerm2 |
| Isolation | Shared filesystem | Git worktrees per member |
| Identity | Task IDs | Persistent name@team IDs |
| State | Worker terminates | Teammates idle and wait |
| Communication | Task notifications | Bidirectional messages |

The swarm adds:

  1. Persistent identity. Teammates have stable agentIds (format: name@team) that survive across interactions. The coordinator can continue a teammate via SendMessage without respawning.

  2. Multiple backends. In-process teammates share the Node.js process with AsyncLocalStorage isolation. Tmux and iTerm2 backends spawn fully independent processes in separate terminal panes.

  3. Git worktree isolation. Each teammate can get its own worktree — a lightweight copy of the repository where it can freely modify files without affecting other agents.

  4. Team file persistence. Team configurations survive session restarts, enabling long-running collaborative work.
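The in-process backend's isolation can be illustrated with Node's `AsyncLocalStorage`, which scopes a value to a call chain without passing it through every function. This is a minimal sketch: the `TeammateContext` shape and the helper names are hypothetical, and only the name@team ID format and the use of `AsyncLocalStorage` come from the text above.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical context shape. Only the name@team format comes from the text.
interface TeammateContext {
  agentId: string; // format: name@team
}

const teammateStorage = new AsyncLocalStorage<TeammateContext>();

// Build an ID in the name@team format.
function makeAgentId(name: string, team: string): string {
  return `${name}@${team}`;
}

// Code running inside a teammate's scope can ask who it is running as,
// without the identity being threaded through every call.
function currentAgentId(): string | undefined {
  return teammateStorage.getStore()?.agentId;
}

function describeWork(task: string): string {
  return `${currentAgentId() ?? 'no-teammate'} handles ${task}`;
}

// Run a task inside one teammate's scope.
function runAsTeammate(name: string, team: string, task: string): string {
  return teammateStorage.run({ agentId: makeAgentId(name, team) }, () => describeWork(task));
}

const researcherLine = runAsTeammate('researcher', 'auth-refactor', 'reading the login flow');
const testerLine = runAsTeammate('tester', 'auth-refactor', 'running the session tests');
const outsideLine = describeWork('nothing');

console.log(researcherLine); // researcher@auth-refactor handles reading the login flow
console.log(testerLine);     // tester@auth-refactor handles running the session tests
console.log(outsideLine);    // no-teammate handles nothing
```

Each `run` call sees only its own store, and code outside any scope sees none, which is the property that lets several teammates share one Node.js process.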

Each stage is gated behind feature flags, allowing gradual rollout and A/B testing:

| Stage | Feature Flag | Gate |
| --- | --- | --- |
| Named sub-agents | BUILTIN_EXPLORE_PLAN_AGENTS | GrowthBook tengu_amber_stoat |
| Verification agent | VERIFICATION_AGENT | GrowthBook tengu_hive_evidence |
| Fork mechanism | FORK_SUBAGENT | Feature flag + not coordinator + not SDK |
| Coordinator | COORDINATOR_MODE | Feature flag + env var |
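The fork and coordinator rows can be read as boolean conditions. The following is a rough sketch under stated assumptions: the `RuntimeState` shape is hypothetical, the function signatures are invented for illustration (only the name isForkSubagentEnabled appears in the source), and the real checks read flags, environment variables, and session state rather than plain booleans.

```typescript
// Hypothetical inputs, flattened to booleans for illustration.
interface RuntimeState {
  forkFlag: boolean;          // FORK_SUBAGENT feature flag
  coordinatorFlag: boolean;   // COORDINATOR_MODE feature flag
  coordinatorEnvVar: boolean; // env var opt-in for coordinator mode
  isSdkSession: boolean;      // session is running under the SDK
}

// Coordinator: feature flag + env var.
function isCoordinatorMode(state: RuntimeState): boolean {
  return state.coordinatorFlag && state.coordinatorEnvVar;
}

// Fork: feature flag + not coordinator + not SDK.
function isForkSubagentEnabled(state: RuntimeState): boolean {
  return state.forkFlag && !isCoordinatorMode(state) && !state.isSdkSession;
}

const forkOnly: RuntimeState = {
  forkFlag: true, coordinatorFlag: false, coordinatorEnvVar: false, isSdkSession: false,
};
const bothRequested: RuntimeState = {
  forkFlag: true, coordinatorFlag: true, coordinatorEnvVar: true, isSdkSession: false,
};
const sdkSession: RuntimeState = {
  forkFlag: true, coordinatorFlag: false, coordinatorEnvVar: false, isSdkSession: true,
};

console.log(isForkSubagentEnabled(forkOnly));      // true
console.log(isForkSubagentEnabled(bothRequested)); // false
console.log(isForkSubagentEnabled(sdkSession));    // false
```

Written this way, the mutual exclusivity falls out of the gate itself: whenever coordinator mode is active, the fork check returns false regardless of the fork flag.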

The GrowthBook integration allows Anthropic to:

  • A/B test each agent pattern independently
  • Measure impact on token usage, task completion, and user satisfaction
  • Roll back individual features without affecting others

The fundamental tension across all stages:

graph LR
subgraph "More Context"
A[Fork: Full parent history]
B[Continue via SendMessage]
end
subgraph "More Isolation"
C[Fresh spawn: Clean slate]
D[Worktree: Separate filesystem]
end
A ---|tradeoff| C
B ---|tradeoff| D
  • More context means better understanding but more tokens and potential for stale information
  • More isolation means cleaner execution but requires explicit context passing

The coordinator’s “continue vs. spawn” decision table encodes this tradeoff as practical heuristics:

High context overlap with the next task → continue the existing worker

Low context overlap → spawn fresh
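One way to picture the heuristic is as a threshold on an overlap score. This sketch is purely illustrative: the source describes the decision in prose, so the `contextOverlap` measure (the fraction of the next task's files the worker has already seen), the `decide` helper, and the 0.5 threshold are all assumptions.

```typescript
// Hypothetical heuristic. The overlap measure and the 0.5 threshold are
// assumptions made for illustration, not values from the source.
type Decision = 'continue' | 'spawn';

// Fraction of the next task's files that the worker has already seen.
function contextOverlap(workerFiles: string[], taskFiles: string[]): number {
  const seen = new Set(workerFiles);
  const needed = taskFiles.filter((file, index) => taskFiles.indexOf(file) === index);
  if (needed.length === 0) return 0;
  const hits = needed.filter((file) => seen.has(file)).length;
  return hits / needed.length;
}

// High overlap: continue the existing worker. Low overlap: spawn fresh.
function decide(workerFiles: string[], taskFiles: string[], threshold = 0.5): Decision {
  return contextOverlap(workerFiles, taskFiles) >= threshold ? 'continue' : 'spawn';
}

const workerHasSeen = ['src/auth/login.ts', 'src/auth/session.ts'];

const relatedTask = decide(workerHasSeen, [
  'src/auth/login.ts',
  'src/auth/session.ts',
  'src/auth/token.ts',
]);
const unrelatedTask = decide(workerHasSeen, ['docs/README.md']);

console.log(relatedTask);   // continue
console.log(unrelatedTask); // spawn
```

In practice the coordinator makes this judgment in natural language rather than by computing a score; the sketch only shows the shape of the tradeoff.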

Each stage in the evolution adds more sophisticated tools for navigating this fundamental tension. The progression is not one of replacement — all stages coexist. A user might run the simple single-agent mode for quick tasks, coordinator mode for complex features, and team swarm for long-running projects. The architecture supports this full spectrum.