Turn Lifecycle

A “turn” in Claude Code is a single round-trip: the model receives messages, produces a response (possibly with tool calls), tools execute, and results are collected. This chapter traces the complete lifecycle of one turn, from message construction to the continue/stop decision.

sequenceDiagram
participant QE as QueryEngine
participant Q as queryLoop
participant API as claude.ts
participant TE as Tool Executor
participant ATT as Attachments
QE->>Q: for await (query({messages, ...}))
Q->>Q: Prepare messagesForQuery
Q->>API: callModel(messages, systemPrompt, tools, ...)
API-->>Q: Stream assistant content blocks
Q->>TE: Feed tool_use blocks
TE-->>Q: Yield completed tool results (during stream)
Note over API: Stream ends
Q->>TE: getRemainingResults()
TE-->>Q: Remaining tool results
Q->>ATT: getAttachmentMessages(...)
ATT-->>Q: File changes, memory, skill discovery
Q->>Q: Continue decision
Q-->>QE: Yield messages

The system prompt is assembled in QueryEngine.submitMessage before the loop begins, then held constant across iterations:

src/QueryEngine.ts
const { defaultSystemPrompt, userContext, systemContext } =
  await fetchSystemPromptParts({
    tools,
    mainLoopModel: initialMainLoopModel,
    additionalWorkingDirectories: Array.from(
      initialAppState.toolPermissionContext.additionalWorkingDirectories.keys(),
    ),
    mcpClients,
    customSystemPrompt: customPrompt,
  });
const systemPrompt = asSystemPrompt([
  ...(customPrompt !== undefined ? [customPrompt] : defaultSystemPrompt),
  ...(memoryMechanicsPrompt ? [memoryMechanicsPrompt] : []),
  ...(appendSystemPrompt ? [appendSystemPrompt] : []),
]);

The system prompt is a layered structure:

  1. The default system prompt, or a custom system prompt from SDK callers (which replaces the default entirely)
  2. The memory mechanics prompt (when CLAUDE_COWORK_MEMORY_PATH_OVERRIDE is set)
  3. The append system prompt (additional instructions layered at the end)
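The layering can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the real helpers: `asSystemPrompt` and the option names mirror the snippet above, but the block shape is a guess.

```typescript
// Illustrative sketch of the system-prompt layering shown above.
type SystemPromptBlock = { type: "text"; text: string };

function asSystemPrompt(parts: string[]): SystemPromptBlock[] {
  // Each non-empty part becomes its own text block, preserving layer order.
  return parts
    .filter((p) => p.length > 0)
    .map((text) => ({ type: "text", text }));
}

function buildSystemPrompt(opts: {
  defaultSystemPrompt: string[];
  customPrompt?: string;
  memoryMechanicsPrompt?: string;
  appendSystemPrompt?: string;
}): SystemPromptBlock[] {
  return asSystemPrompt([
    // Layer 1: a custom prompt replaces the default entirely.
    ...(opts.customPrompt !== undefined
      ? [opts.customPrompt]
      : opts.defaultSystemPrompt),
    // Layer 2: memory mechanics, only when configured.
    ...(opts.memoryMechanicsPrompt ? [opts.memoryMechanicsPrompt] : []),
    // Layer 3: appended instructions.
    ...(opts.appendSystemPrompt ? [opts.appendSystemPrompt] : []),
  ]);
}
```

Note that a custom prompt suppresses the default but not the memory or append layers, which still stack on top of it.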

At API call time, user context and system context are injected:

// src/query.ts — inside the loop
const fullSystemPrompt = asSystemPrompt(
  appendSystemContext(systemPrompt, systemContext),
);
// User context is prepended to the message array, not the system prompt
deps.callModel({
  messages: prependUserContext(messagesForQuery, userContext),
  systemPrompt: fullSystemPrompt,
  // ...
});
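A plausible shape for `prependUserContext`, assuming user context simply rides along as a leading user message when present (the message type here is simplified):

```typescript
// Hypothetical sketch: user context becomes the first user message in the
// array sent to the API, rather than another system-prompt layer.
type APIMsg = { role: "user" | "assistant"; content: string };

function prependUserContext(
  messages: APIMsg[],
  userContext: string | undefined,
): APIMsg[] {
  if (!userContext) return messages;
  return [{ role: "user", content: userContext }, ...messages];
}
```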

The messages sent to the API follow a strict structure defined by the Anthropic Messages API:

[system prompt]
[user context block] ← prepended via prependUserContext()
[user message] ← the original prompt
[assistant message] ← model's response with tool_use blocks
[user message] ← tool_result blocks
[assistant message] ← model's next response
... ← repeating pattern
[user message] ← latest tool results + attachments

Messages are normalized before being sent to the API via normalizeMessagesForAPI(), which:

  • Strips internal metadata fields
  • Ensures alternating user/assistant turns
  • Removes system-only messages
  • Handles tool result pairing

If auto-compaction has occurred, only messages after the last compact boundary are sent:

let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)];

The compact boundary is a special system message that marks where compaction summarized older history. Everything before it is replaced by the summary.
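A sketch of the boundary scan, under the assumption that the boundary is a system message tagged with a subtype field (the name `compact_boundary` is illustrative):

```typescript
// Hypothetical sketch: find the LAST boundary marker and keep only what
// follows it, since everything earlier was folded into the summary.
type SessionMsg = { type: string; subtype?: string };

function getMessagesAfterCompactBoundary(messages: SessionMsg[]): SessionMsg[] {
  const idx = messages
    .map((m) => m.subtype)
    .lastIndexOf("compact_boundary");
  return idx === -1 ? messages : messages.slice(idx + 1);
}
```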

The API call is configured with model-specific parameters:

src/query.ts
for await (const message of deps.callModel({
  messages: prependUserContext(messagesForQuery, userContext),
  systemPrompt: fullSystemPrompt,
  thinkingConfig: toolUseContext.options.thinkingConfig,
  tools: toolUseContext.options.tools,
  signal: toolUseContext.abortController.signal,
  options: {
    model: currentModel,
    fastMode: appState.fastMode,
    fallbackModel,
    querySource,
    maxOutputTokensOverride,
    agentId: toolUseContext.agentId,
    effortValue: appState.effortValue,
    taskBudget: params.taskBudget && {
      total: params.taskBudget.total,
      ...(taskBudgetRemaining !== undefined && {
        remaining: taskBudgetRemaining,
      }),
    },
  },
}))

Key configuration parameters:

| Parameter | Source | Purpose |
| --- | --- | --- |
| `model` | `getRuntimeMainLoopModel()` | Which Claude model to use |
| `thinkingConfig` | `{ type: 'adaptive' }` or `{ type: 'disabled' }` | Extended thinking control |
| `tools` | `toolUseContext.options.tools` | Available tool definitions |
| `maxOutputTokensOverride` | Set during escalation recovery | Override the default 8K cap |
| `taskBudget` | SDK caller config | Server-side token budget |
| `effortValue` | User `/effort` command | Controls reasoning depth |
| `fallbackModel` | Config | Model to try if the primary fails |

Tools are converted to API schema format in toolToAPISchema() from src/utils/api.ts. Each tool’s Zod schema is transformed into JSON Schema for the API:

// Simplified from src/utils/api.ts
async function toolToAPISchema(
  tool: Tool,
  input: unknown,
  options: ToolOptions,
): Promise<BetaToolUnion> {
  return {
    name: tool.name,
    description: await tool.description(input, options),
    input_schema: tool.inputJSONSchema ?? zodToJsonSchema(tool.inputSchema),
  };
}

The API response streams as a sequence of server-sent events. The claude.ts module processes these into typed Message objects:

graph LR
A[SSE Events] --> B[claude.ts]
B --> C[message_start]
B --> D[content_block_start]
B --> E[content_block_delta]
B --> F[content_block_stop]
B --> G[message_delta]
B --> H[message_stop]
C --> I[Reset usage counters]
D --> J[Create AssistantMessage]
E --> K[Stream text/tool_use deltas]
F --> L[Yield completed block]
G --> M[Capture stop_reason, final usage]
H --> N[Accumulate total usage]

The stream processing in queryLoop distinguishes between:

  • assistant messages: Pushed to assistantMessages[], tool_use blocks extracted
  • stream_event messages: Usage tracking (message_start, message_delta, message_stop)
  • Withheld errors: prompt_too_long and max_output_tokens are captured but NOT yielded yet
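These three cases can be sketched as a small classifier. The message shapes below are simplified, and the error channel is illustrative; in the real loop, usage tracking and withheld-error handling are interleaved with streaming:

```typescript
// Illustrative classifier over the three stream message kinds described above.
type StreamMsg =
  | { type: "assistant"; content: { type: string }[] }
  | { type: "stream_event"; event: { type: "message_start" | "message_delta" | "message_stop" } }
  | { type: "error"; reason: "prompt_too_long" | "max_output_tokens" };

function classify(messages: StreamMsg[]) {
  const assistantMessages: StreamMsg[] = [];
  let withheldError: string | undefined;
  let toolUseCount = 0;
  for (const m of messages) {
    if (m.type === "assistant") {
      assistantMessages.push(m);
      // tool_use blocks are extracted for the tool executor.
      toolUseCount += m.content.filter((b) => b.type === "tool_use").length;
    } else if (m.type === "error") {
      // prompt_too_long / max_output_tokens are captured, not yielded,
      // so the recovery logic can decide what to do.
      withheldError = m.reason;
    }
    // stream_event messages feed usage tracking (omitted here).
  }
  return { assistantMessages, toolUseCount, withheldError };
}
```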

Before yielding assistant messages, tool_use inputs are backfilled for observability:

if (block.type === 'tool_use' && tool?.backfillObservableInput) {
  const inputCopy = { ...originalInput };
  tool.backfillObservableInput(inputCopy);
  // Only clone when backfill ADDED fields (not overwrites)
  const addedFields = Object.keys(inputCopy).some(k => !(k in originalInput));
  if (addedFields) {
    clonedContent ??= [...message.message.content];
    clonedContent[i] = { ...block, input: inputCopy };
  }
}

This adds legacy/derived fields for hooks and SDK consumers without mutating the original API-bound message (which would break prompt caching).

After tool execution, results are normalized for the API:

for await (const update of toolUpdates) {
  if (update.message) {
    yield update.message;
    toolResults.push(
      ...normalizeMessagesForAPI(
        [update.message],
        toolUseContext.options.tools,
      ).filter(_ => _.type === 'user'),
    );
  }
}

Each tool result becomes a user message containing a tool_result content block:

{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01abc...",
      "content": "File written successfully",
      "is_error": false
    }
  ]
}

Error results set is_error: true and wrap the error in <tool_use_error> tags.
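A sketch of building an error result in that shape; the helper name is hypothetical, but the `is_error` flag and `<tool_use_error>` wrapping follow the description above:

```typescript
// Hypothetical helper: turn a tool failure into a tool_result user message.
function toolErrorResult(toolUseId: string, errorText: string) {
  return {
    role: "user" as const,
    content: [
      {
        type: "tool_result" as const,
        tool_use_id: toolUseId,
        // Wrapping lets the model distinguish a tool failure from a tool
        // that succeeded but happened to print error-like text.
        content: `<tool_use_error>${errorText}</tool_use_error>`,
        is_error: true,
      },
    ],
  };
}
```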

Phase 5: Token Counting and Budget Tracking
Token tracking happens at two levels:

// src/QueryEngine.ts — inside the for-await loop
case 'stream_event':
  if (message.event.type === 'message_start') {
    currentMessageUsage = EMPTY_USAGE;
    currentMessageUsage = updateUsage(currentMessageUsage, message.event.message.usage);
  }
  if (message.event.type === 'message_delta') {
    currentMessageUsage = updateUsage(currentMessageUsage, message.event.usage);
  }
  if (message.event.type === 'message_stop') {
    this.totalUsage = accumulateUsage(this.totalUsage, currentMessageUsage);
  }

The addToTotalSessionCost() function in src/cost-tracker.ts maintains running totals:

export function addToTotalSessionCost(
  cost: number,
  usage: Usage,
  model: string,
): void {
  const modelUsage = addToTotalModelUsage(cost, usage, model);
  addToTotalCostState(cost, modelUsage, model);
  // Also tracks advisor model usage recursively
}

The API returns these token counts per response:

| Field | Description |
| --- | --- |
| `input_tokens` | Tokens in the prompt (non-cached) |
| `output_tokens` | Tokens generated by the model |
| `cache_creation_input_tokens` | Tokens written to the prompt cache |
| `cache_read_input_tokens` | Tokens read from the prompt cache |

These are accumulated in NonNullableUsage:

src/services/api/logging.ts
export type NonNullableUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
};
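Accumulation over this type is plain field-wise addition. The sketch below is illustrative; the real `updateUsage`/`accumulateUsage` signatures may differ, and the `?? 0` guards assume the API can omit fields on some events:

```typescript
// Illustrative field-wise accumulation over the four usage counters.
type NonNullableUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
};

const EMPTY_USAGE: NonNullableUsage = {
  input_tokens: 0,
  output_tokens: 0,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 0,
};

function accumulateUsage(
  total: NonNullableUsage,
  delta: Partial<NonNullableUsage>,
): NonNullableUsage {
  return {
    input_tokens: total.input_tokens + (delta.input_tokens ?? 0),
    output_tokens: total.output_tokens + (delta.output_tokens ?? 0),
    cache_creation_input_tokens:
      total.cache_creation_input_tokens + (delta.cache_creation_input_tokens ?? 0),
    cache_read_input_tokens:
      total.cache_read_input_tokens + (delta.cache_read_input_tokens ?? 0),
  };
}
```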

The final phase determines whether the loop should continue or terminate. The decision tree is:

graph TD
A{needsFollowUp?} -- no --> B{Is API error?}
B -- yes --> C{Recoverable?}
C -- prompt_too_long --> D[Try collapse drain]
D -- success --> E[Continue: collapse_drain_retry]
D -- fail --> F[Try reactive compact]
F -- success --> G[Continue: reactive_compact_retry]
F -- fail --> H[Return error]
C -- max_output_tokens --> I[Try escalate 8K→64K]
I -- first time --> J[Continue: max_output_tokens_escalate]
I -- already escalated --> K{Recovery count < 3?}
K -- yes --> L[Inject resume message]
L --> M[Continue: max_output_tokens_recovery]
K -- no --> N[Surface error]
B -- no --> O{Stop hooks?}
O -- blocking --> P[Continue: stop_hook_blocking]
O -- prevent --> Q[Return stop_hook_prevented]
O -- pass --> R{Token budget?}
R -- continue --> S[Continue: token_budget_continuation]
R -- stop --> T[Return completed]
A -- yes --> U{maxTurns reached?}
U -- yes --> V[Return max_turns]
U -- no --> W[Continue: next_turn]
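The `max_output_tokens` branch of this tree can be written as a pure decision function. The names and return shape below are illustrative, but the logic follows the diagram: escalate once, then retry up to three recoveries, then surface the error:

```typescript
// Illustrative encoding of the max_output_tokens recovery branch above.
type Recovery =
  | { action: "continue"; reason: "max_output_tokens_escalate" | "max_output_tokens_recovery" }
  | { action: "stop"; reason: "max_output_tokens" };

function decideMaxOutputTokens(
  alreadyEscalated: boolean,
  recoveryCount: number,
): Recovery {
  if (!alreadyEscalated) {
    // First hit: raise the output cap (8K -> 64K per the diagram) and retry.
    return { action: "continue", reason: "max_output_tokens_escalate" };
  }
  if (recoveryCount < 3) {
    // Already at the higher cap: inject a resume message and retry.
    return { action: "continue", reason: "max_output_tokens_recovery" };
  }
  // Out of retries: surface the error to the caller.
  return { action: "stop", reason: "max_output_tokens" };
}
```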

The turnCount starts at 1 and increments each time tool results produce a follow-up:

// Each tool-result follow-up is a new turn
const nextTurnCount = turnCount + 1;
if (maxTurns && nextTurnCount > maxTurns) {
yield createAttachmentMessage({
type: 'max_turns_reached',
maxTurns,
turnCount: nextTurnCount,
});
return { reason: 'max_turns', turnCount: nextTurnCount };
}

In the outer QueryEngine, there’s a separate turnCount that increments on each user message yielded from the inner loop:

if (message.type === 'user') {
  turnCount++;
}

The difference: the inner turnCount counts API round-trips, while the outer one counts user-visible turn boundaries.