Turn Lifecycle

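A "turn" in Claude Code is one complete round trip: the model receives messages, produces a response (possibly containing tool calls), the tools execute, and their results are collected. This chapter traces the full lifecycle of a single turn, from message construction to the continue/stop decision.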

sequenceDiagram
    participant QE as QueryEngine
    participant Q as queryLoop
    participant API as claude.ts
    participant TE as Tool Executor
    participant ATT as Attachments

    QE->>Q: for await (query({messages, ...}))
    Q->>Q: Prepare messagesForQuery
    Q->>API: callModel(messages, systemPrompt, tools, ...)
    API-->>Q: Stream assistant content blocks
    Q->>TE: Feed tool_use blocks
    TE-->>Q: Yield completed tool results (during stream)
    Note over API: Stream ends
    Q->>TE: getRemainingResults()
    TE-->>Q: Remaining tool results
    Q->>ATT: getAttachmentMessages(...)
    ATT-->>Q: File changes, memory, skill discovery
    Q->>Q: Continue decision
    Q-->>QE: Yield messages

The system prompt is assembled in QueryEngine.submitMessage before the loop starts, and it stays unchanged across all iterations:

src/QueryEngine.ts
const { defaultSystemPrompt, userContext, systemContext } =
  await fetchSystemPromptParts({
    tools,
    mainLoopModel: initialMainLoopModel,
    additionalWorkingDirectories: Array.from(
      initialAppState.toolPermissionContext.additionalWorkingDirectories.keys(),
    ),
    mcpClients,
    customSystemPrompt: customPrompt,
  });
const systemPrompt = asSystemPrompt([
  ...(customPrompt !== undefined ? [customPrompt] : defaultSystemPrompt),
  ...(memoryMechanicsPrompt ? [memoryMechanicsPrompt] : []),
  ...(appendSystemPrompt ? [appendSystemPrompt] : []),
]);

The system prompt is a layered structure:

  1. Default system prompt, or a custom system prompt that replaces it (SDK callers)
  2. Memory mechanics prompt (when CLAUDE_COWORK_MEMORY_PATH_OVERRIDE is set)
  3. Appended system prompt (additional instructions)
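The layering order can be sketched as a small standalone function. This is a minimal illustration that mirrors the spread expressions in the excerpt above; `layerSystemPrompt` and its parameter list are hypothetical names, not part of the source.

```typescript
// Hypothetical helper: reproduces the layering order described above.
function layerSystemPrompt(
  defaultPrompt: string[],
  customPrompt?: string,
  memoryMechanicsPrompt?: string,
  appendSystemPrompt?: string,
): string[] {
  return [
    // A custom prompt replaces the default entirely; the two are never merged.
    ...(customPrompt !== undefined ? [customPrompt] : defaultPrompt),
    // Optional layers are added only when they are non-empty.
    ...(memoryMechanicsPrompt ? [memoryMechanicsPrompt] : []),
    ...(appendSystemPrompt ? [appendSystemPrompt] : []),
  ];
}

const withDefault = layerSystemPrompt(["base A", "base B"], undefined, undefined, "extra");
const withCustom = layerSystemPrompt(["base A", "base B"], "custom", "memory");
```

Note that the check on the custom prompt is `!== undefined`, so an empty custom string still suppresses the default, whereas the two optional layers are skipped when empty.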

At API call time, the user context and system context are injected:

// src/query.ts (inside the loop)
const fullSystemPrompt = asSystemPrompt(
  appendSystemContext(systemPrompt, systemContext)
);
// User context is prepended to the message array, not the system prompt
deps.callModel({
  messages: prependUserContext(messagesForQuery, userContext),
  systemPrompt: fullSystemPrompt,
  // ...
});

Messages sent to the API follow the strict structure defined by the Anthropic Messages API:

[system prompt]
[user context block] ← prepended via prependUserContext()
[user message] ← the original prompt
[assistant message] ← model's response with tool_use blocks
[user message] ← tool_result blocks
[assistant message] ← model's next response
... ← repeating pattern
[user message] ← latest tool results + attachments
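To make the layout above concrete, here is a minimal sketch of prepending a context block to the message array without touching the system prompt. The `ChatMessage` shape, the `prependContextBlock` name, and the `key: value` rendering of the context are all assumptions made for illustration; the source does not show how prependUserContext() formats its block.

```typescript
// Hypothetical message shape, much simpler than the real one.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Returns a new array with a context block in front; the input is not mutated.
function prependContextBlock(
  messages: readonly ChatMessage[],
  userContext: Record<string, string>,
): ChatMessage[] {
  const entries = Object.entries(userContext);
  if (entries.length === 0) return [...messages];
  // Assumed rendering: one "key: value" line per context entry.
  const body = entries.map(([key, value]) => `${key}: ${value}`).join("\n");
  return [{ role: "user", content: body }, ...messages];
}

const history: ChatMessage[] = [{ role: "user", content: "Fix the failing test" }];
const sent = prependContextBlock(history, { cwd: "/repo", date: "2025-01-01" });
```

Building a fresh array on every call keeps the stored history free of the injected block, so the block can be regenerated for each request.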

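Before messages are sent to the API, they are normalized by normalizeMessagesForAPI(), which:

  • Strips internal metadata fields
  • Ensures that user and assistant turns alternate
  • Removes system-only messages
  • Handles tool result pairing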
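One way to guarantee alternation is to merge consecutive messages that share a role. The sketch below shows that technique only; the source does not say how normalizeMessagesForAPI() enforces alternation, and `ApiMessage` and `mergeConsecutiveRoles` are hypothetical names.

```typescript
// Hypothetical, simplified message shape with content as a list of text blocks.
interface ApiMessage {
  role: "user" | "assistant";
  content: string[];
}

// Merges runs of same-role messages so that roles strictly alternate.
function mergeConsecutiveRoles(messages: readonly ApiMessage[]): ApiMessage[] {
  const out: ApiMessage[] = [];
  for (const msg of messages) {
    const last = out[out.length - 1];
    if (last !== undefined && last.role === msg.role) {
      // Same role as the previous message: fold the blocks into it.
      last.content.push(...msg.content);
    } else {
      // Copy the content array so the caller's message is never mutated.
      out.push({ role: msg.role, content: [...msg.content] });
    }
  }
  return out;
}

const raw: ApiMessage[] = [
  { role: "user", content: ["tool result A"] },
  { role: "user", content: ["attachment B"] },
  { role: "assistant", content: ["reply"] },
];
const alternating = mergeConsecutiveRoles(raw);
```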

If auto-compaction has occurred, only the messages after the last compact boundary are sent:

let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)];

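The compact boundary is a special system message that marks the point where compaction summarized older history. Everything before it has been replaced by the summary.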
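A minimal sketch of the boundary lookup follows. The `Msg` shape and the `compact_boundary` subtype label are assumptions, and so is the choice to drop the boundary marker itself; the source does not state whether getMessagesAfterCompactBoundary() keeps it.

```typescript
// Hypothetical message shape; the real one carries far more fields.
interface Msg {
  type: "user" | "assistant" | "system";
  subtype?: string;
  text: string;
}

// Returns a copy of everything after the LAST boundary, or all messages if none exists.
function messagesAfterLastBoundary(messages: readonly Msg[]): Msg[] {
  let boundaryIndex = -1;
  // Scan from the end so that repeated compactions resolve to the newest boundary.
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m !== undefined && m.type === "system" && m.subtype === "compact_boundary") {
      boundaryIndex = i;
      break;
    }
  }
  return messages.slice(boundaryIndex + 1);
}

const log: Msg[] = [
  { type: "user", text: "old question" },
  { type: "system", subtype: "compact_boundary", text: "" },
  { type: "user", text: "summary of earlier work" },
  { type: "assistant", text: "continuing" },
];
const tail = messagesAfterLastBoundary(log);
```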

The API call is configured with model-specific parameters:

src/query.ts
for await (const message of deps.callModel({
  messages: prependUserContext(messagesForQuery, userContext),
  systemPrompt: fullSystemPrompt,
  thinkingConfig: toolUseContext.options.thinkingConfig,
  tools: toolUseContext.options.tools,
  signal: toolUseContext.abortController.signal,
  options: {
    model: currentModel,
    fastMode: appState.fastMode,
    fallbackModel,
    querySource,
    maxOutputTokensOverride,
    agentId: toolUseContext.agentId,
    effortValue: appState.effortValue,
    taskBudget: params.taskBudget && {
      total: params.taskBudget.total,
      ...(taskBudgetRemaining !== undefined && {
        remaining: taskBudgetRemaining,
      }),
    },
  },
})) {
  // loop body omitted in this excerpt
}

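Key configuration parameters:

| Parameter | Source | Purpose |
| --- | --- | --- |
| model | getRuntimeMainLoopModel() | Which Claude model to use |
| thinkingConfig | { type: 'adaptive' } or { type: 'disabled' } | Extended thinking control |
| tools | toolUseContext.options.tools | Available tool definitions |
| maxOutputTokensOverride | Set during escalation recovery | Overrides the default 8K cap |
| taskBudget | SDK caller | Server-side token budget |
| effortValue | User /effort command | Controls reasoning depth |
| fallbackModel | Configuration | Model to try when the main model fails |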

Tools are converted to the API schema format by toolToAPISchema() in src/utils/api.ts. Each tool's Zod schema is converted to the JSON Schema that the API requires:

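// Simplified from src/utils/api.ts
// Note: `input` and `options` are not defined in this simplified excerpt.
// The function must be async because tool.description() is awaited.
async function toolToAPISchema(tool: Tool): Promise<BetaToolUnion> {
  return {
    name: tool.name,
    description: await tool.description(input, options),
    // Prefer a hand-written JSON Schema when the tool provides one.
    input_schema: tool.inputJSONSchema ?? zodToJsonSchema(tool.inputSchema),
  };
}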

The API response is streamed as a sequence of server-sent events. The claude.ts module processes these events into typed Message objects:

graph LR
    A[SSE Events] --> B[claude.ts]
    B --> C[message_start]
    B --> D[content_block_start]
    B --> E[content_block_delta]
    B --> F[content_block_stop]
    B --> G[message_delta]
    B --> H[message_stop]

    C --> I[Reset usage counters]
    D --> J[Create AssistantMessage]
    E --> K[Stream text/tool_use deltas]
    F --> L[Yield completed block]
    G --> M[Capture stop_reason, final usage]
    H --> N[Accumulate total usage]

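Stream handling in queryLoop distinguishes the following cases:

  • assistant messages: pushed onto assistantMessages[], with their tool_use blocks extracted
  • stream_event messages: usage tracking (message_start, message_delta, message_stop)
  • Withheld errors: prompt_too_long and max_output_tokens are captured but not yet yielded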
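The three cases can be sketched as a single dispatch function. Everything here is illustrative: the `StreamItem` variants, the `LoopState` fields, and the choice to model a recoverable error as its own `api_error` item are assumptions, since the source does not show how queryLoop represents them.

```typescript
// Hypothetical item shapes for the three cases listed above.
type AssistantItem = { type: "assistant"; toolUseIds: string[] };
type EventItem = { type: "stream_event"; eventType: "message_start" | "message_delta" | "message_stop" };
type ErrorItem = { type: "api_error"; code: "prompt_too_long" | "max_output_tokens" };
type StreamItem = AssistantItem | EventItem | ErrorItem;

interface LoopState {
  assistantMessages: AssistantItem[];
  pendingToolUseIds: string[];
  usageEvents: string[];
  withheldError?: ErrorItem;
}

// Updates the loop state and returns the items that should be yielded right away.
function handleStreamItem(state: LoopState, item: StreamItem): StreamItem[] {
  switch (item.type) {
    case "assistant":
      state.assistantMessages.push(item);
      state.pendingToolUseIds.push(...item.toolUseIds);
      return [item];
    case "stream_event":
      state.usageEvents.push(item.eventType);
      return [item];
    case "api_error":
      // Withheld: kept aside so that recovery can be attempted before surfacing it.
      state.withheldError = item;
      return [];
  }
}

const state: LoopState = { assistantMessages: [], pendingToolUseIds: [], usageEvents: [] };
const yieldedForAssistant = handleStreamItem(state, { type: "assistant", toolUseIds: ["toolu_1"] });
const yieldedForError = handleStreamItem(state, { type: "api_error", code: "prompt_too_long" });
```

Holding the error back is what lets the continue decision try a retry path first; the error only needs to reach the caller if every recovery attempt fails.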

Before an assistant message is yielded, its tool_use inputs are backfilled to improve observability:

if (block.type === 'tool_use' && tool?.backfillObservableInput) {
  const inputCopy = { ...originalInput };
  tool.backfillObservableInput(inputCopy);
  // Only clone when backfill ADDED fields (not overwrites)
  const addedFields = Object.keys(inputCopy).some(k => !(k in originalInput));
  if (addedFields) {
    clonedContent ??= [...message.message.content];
    clonedContent[i] = { ...block, input: inputCopy };
  }
}

This adds legacy or derived fields for hooks and SDK consumers without mutating the original API-bound message; mutating it would break the prompt cache.
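The clone-only-when-fields-were-added rule can be isolated in a small function. This is a sketch under assumptions: `ToolUseBlock`, `backfillForObservers`, and the `file_path`/`path` field names are hypothetical and chosen only to demonstrate the behavior.

```typescript
type ToolInput = Record<string, unknown>;

interface ToolUseBlock {
  type: "tool_use";
  name: string;
  input: ToolInput;
}

// Returns a cloned block when backfill added new keys, otherwise the original block.
function backfillForObservers(
  block: ToolUseBlock,
  backfill: (input: ToolInput) => void,
): ToolUseBlock {
  // Work on a shallow copy so the API-bound input is never touched.
  const inputCopy: ToolInput = { ...block.input };
  backfill(inputCopy);
  const addedFields = Object.keys(inputCopy).some(k => !(k in block.input));
  return addedFields ? { ...block, input: inputCopy } : block;
}

const original: ToolUseBlock = { type: "tool_use", name: "Read", input: { file_path: "/a.txt" } };
// Adds a derived field: observers receive a clone.
const observed = backfillForObservers(original, input => { input.path = input.file_path; });
// Only overwrites an existing field: the original block is returned unchanged.
const untouched = backfillForObservers(original, input => { input.file_path = "/b.txt"; });
```

Because overwrites alone never trigger a clone, a backfill that only rewrites existing values has no observable effect in this sketch.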

After the tools finish executing, their results are normalized for the API:

for await (const update of toolUpdates) {
  if (update.message) {
    yield update.message;
    toolResults.push(
      ...normalizeMessagesForAPI(
        [update.message],
        toolUseContext.options.tools,
      ).filter(_ => _.type === 'user'),
    );
  }
}

Each tool result becomes a user message containing a tool_result content block:

{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01abc...",
      "content": "File written successfully",
      "is_error": false
    }
  ]
}

Error results set is_error to true and wrap the error text in <tool_use_error> tags.
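Both the success and the error shape can be produced by one small builder. The block fields match the JSON above; the `ToolOutcome` type and the `toToolResultMessage` name are hypothetical, and the exact text placed inside the tags is an assumption.

```typescript
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
}

interface ToolResultMessage {
  role: "user";
  content: ToolResultBlock[];
}

// Hypothetical outcome type used only by this sketch.
type ToolOutcome = { ok: true; output: string } | { ok: false; error: string };

function toToolResultMessage(toolUseId: string, outcome: ToolOutcome): ToolResultMessage {
  const block: ToolResultBlock = outcome.ok
    ? { type: "tool_result", tool_use_id: toolUseId, content: outcome.output, is_error: false }
    : {
        type: "tool_result",
        tool_use_id: toolUseId,
        // Errors are wrapped in tags so the model can tell them apart from normal output.
        content: `<tool_use_error>${outcome.error}</tool_use_error>`,
        is_error: true,
      };
  return { role: "user", content: [block] };
}

const okMessage = toToolResultMessage("toolu_01", { ok: true, output: "File written successfully" });
const errorMessage = toToolResultMessage("toolu_02", { ok: false, error: "File not found" });
```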

Token tracking happens at two levels:

// src/QueryEngine.ts (inside the for-await loop)
case 'stream_event':
  if (message.event.type === 'message_start') {
    currentMessageUsage = EMPTY_USAGE;
    currentMessageUsage = updateUsage(currentMessageUsage, message.event.message.usage);
  }
  if (message.event.type === 'message_delta') {
    currentMessageUsage = updateUsage(currentMessageUsage, message.event.usage);
  }
  if (message.event.type === 'message_stop') {
    this.totalUsage = accumulateUsage(this.totalUsage, currentMessageUsage);
  }

The addToTotalSessionCost() function in src/cost-tracker.ts maintains the running totals:

export function addToTotalSessionCost(
  cost: number,
  usage: Usage,
  model: string,
): number {
  const modelUsage = addToTotalModelUsage(cost, usage, model);
  addToTotalCostState(cost, modelUsage, model);
  // Also tracks advisor model usage recursively
  // (the rest of the body, including the returned number, is omitted in this excerpt)
}

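Each API response returns the following token counts:

| Field | Description |
| --- | --- |
| input_tokens | Tokens in the prompt (not cached) |
| output_tokens | Tokens generated by the model |
| cache_creation_input_tokens | Tokens written to the prompt cache |
| cache_read_input_tokens | Tokens read from the prompt cache |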

These values are accumulated in NonNullableUsage:

src/services/api/logging.ts
export type NonNullableUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
};
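The two tracking levels can be sketched with two small functions over this shape: one that folds stream events into the current message's usage, and one that adds a finished message to the session total. The names `mergeLatest` and `addUsage` are hypothetical, and the "latest reported value wins" rule is an assumption about what updateUsage does, based on streamed usage counts being cumulative within a message.

```typescript
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
};

const ZERO_USAGE: Usage = {
  input_tokens: 0,
  output_tokens: 0,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 0,
};

// Per-message level: a later event replaces any field it reports (assumed semantics).
function mergeLatest(current: Usage, reported: Partial<Usage>): Usage {
  return {
    input_tokens: reported.input_tokens ?? current.input_tokens,
    output_tokens: reported.output_tokens ?? current.output_tokens,
    cache_creation_input_tokens:
      reported.cache_creation_input_tokens ?? current.cache_creation_input_tokens,
    cache_read_input_tokens:
      reported.cache_read_input_tokens ?? current.cache_read_input_tokens,
  };
}

// Session level: finished messages are summed field by field.
function addUsage(total: Usage, message: Usage): Usage {
  return {
    input_tokens: total.input_tokens + message.input_tokens,
    output_tokens: total.output_tokens + message.output_tokens,
    cache_creation_input_tokens:
      total.cache_creation_input_tokens + message.cache_creation_input_tokens,
    cache_read_input_tokens: total.cache_read_input_tokens + message.cache_read_input_tokens,
  };
}

// message_start reports the prompt-side counts, message_delta the final output count.
const started = mergeLatest(ZERO_USAGE, { input_tokens: 120, output_tokens: 1, cache_read_input_tokens: 900 });
const finished = mergeLatest(started, { output_tokens: 48 });
const sessionTotal = addUsage(addUsage(ZERO_USAGE, finished), finished);
```

Separating the two operations matters: summing within a message would double-count output tokens if every event carried a cumulative figure.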

The final phase decides whether the loop continues or terminates. The decision tree is as follows:

graph TD
    A{needsFollowUp?} -- no --> B{Is API error?}
    B -- yes --> C{Recoverable?}
    C -- prompt_too_long --> D[Try collapse drain]
    D -- success --> E[Continue: collapse_drain_retry]
    D -- fail --> F[Try reactive compact]
    F -- success --> G[Continue: reactive_compact_retry]
    F -- fail --> H[Return error]
    C -- max_output_tokens --> I[Try escalate 8K→64K]
    I -- first time --> J[Continue: max_output_tokens_escalate]
    I -- already escalated --> K{Recovery count < 3?}
    K -- yes --> L[Inject resume message]
    L --> M[Continue: max_output_tokens_recovery]
    K -- no --> N[Surface error]
    B -- no --> O{Stop hooks?}
    O -- blocking --> P[Continue: stop_hook_blocking]
    O -- prevent --> Q[Return stop_hook_prevented]
    O -- pass --> R{Token budget?}
    R -- continue --> S[Continue: token_budget_continuation]
    R -- stop --> T[Return completed]
    A -- yes --> U{maxTurns reached?}
    U -- yes --> V[Return max_turns]
    U -- no --> W[Continue: next_turn]
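The decision tree can be read as a pure function over a snapshot of the turn. In this sketch the `TurnSnapshot` fields are hypothetical, the recovery attempts are reduced to booleans, and the reason string `error` is a placeholder for the diagram's "Return error" and "Surface error" nodes; the other reason strings are taken from the diagram.

```typescript
// Hypothetical snapshot of everything the diagram branches on.
interface TurnSnapshot {
  needsFollowUp: boolean;
  apiError?: "prompt_too_long" | "max_output_tokens";
  collapseDrainSucceeds: boolean;
  reactiveCompactSucceeds: boolean;
  alreadyEscalated: boolean;
  recoveryCount: number;
  stopHook: "blocking" | "prevent" | "pass";
  budgetWantsContinue: boolean;
  turnCount: number;
  maxTurns?: number;
}

type Decision = { action: "continue" | "return"; reason: string };

const cont = (reason: string): Decision => ({ action: "continue", reason });
const ret = (reason: string): Decision => ({ action: "return", reason });

function decide(s: TurnSnapshot): Decision {
  if (s.needsFollowUp) {
    // Tool results are pending: continue unless the turn limit would be exceeded.
    const nextTurnCount = s.turnCount + 1;
    if (s.maxTurns && nextTurnCount > s.maxTurns) return ret("max_turns");
    return cont("next_turn");
  }
  if (s.apiError === "prompt_too_long") {
    if (s.collapseDrainSucceeds) return cont("collapse_drain_retry");
    if (s.reactiveCompactSucceeds) return cont("reactive_compact_retry");
    return ret("error");
  }
  if (s.apiError === "max_output_tokens") {
    if (!s.alreadyEscalated) return cont("max_output_tokens_escalate");
    if (s.recoveryCount < 3) return cont("max_output_tokens_recovery");
    return ret("error");
  }
  if (s.stopHook === "blocking") return cont("stop_hook_blocking");
  if (s.stopHook === "prevent") return ret("stop_hook_prevented");
  return s.budgetWantsContinue ? cont("token_budget_continuation") : ret("completed");
}

const base: TurnSnapshot = {
  needsFollowUp: false,
  collapseDrainSucceeds: false,
  reactiveCompactSucceeds: false,
  alreadyEscalated: false,
  recoveryCount: 0,
  stopHook: "pass",
  budgetWantsContinue: false,
  turnCount: 1,
};
const finishedTurn = decide(base);
const limited = decide({ ...base, needsFollowUp: true, turnCount: 5, maxTurns: 5 });
```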

turnCount starts at 1 and is incremented each time tool results trigger a follow-up:

// Each tool-result follow-up is a new turn
const nextTurnCount = turnCount + 1;
if (maxTurns && nextTurnCount > maxTurns) {
  yield createAttachmentMessage({
    type: 'max_turns_reached',
    maxTurns,
    turnCount: nextTurnCount,
  });
  return { reason: 'max_turns', turnCount: nextTurnCount };
}

The outer QueryEngine keeps a separate turnCount of its own, incremented each time the inner loop yields a user message:

if (message.type === 'user') {
  turnCount++;
}

The difference is that the inner turnCount counts API round trips, while the outer one counts user-visible turn boundaries.