Cost Tracking
Claude Code 追踪整个会话中的每一个 token 和每一分钱。本章介绍费用如何从 API 响应流经追踪系统,最终显示在 TUI 状态栏上。
graph TD
A[API Response] --> B[claude.ts - message_delta event]
B --> C[usage: input_tokens, output_tokens, cache_*]
C --> D[addToTotalSessionCost]
D --> E[addToTotalModelUsage]
D --> F[addToTotalCostState - global state]
D --> G[OpenTelemetry counters]
F --> H[getTotalCostUSD]
F --> I[getModelUsage]
H --> J[QueryEngine budget check]
H --> K[formatTotalCost - TUI display]
I --> K
每 Turn 的 Token 计数
Section titled “每 Turn 的 Token 计数”API 返回的 token 字段
Section titled “API 返回的 token 字段”每个 API 响应包含一个带有以下字段的 usage 对象:
// From @anthropic-ai/sdktype BetaUsage = { input_tokens: number; output_tokens: number; cache_creation_input_tokens?: number; cache_read_input_tokens?: number; server_tool_use?: { web_search_requests?: number; };};各字段含义:
| 字段 | 含义 | 计费方式 |
|---|---|---|
input_tokens | 未 cache 的 prompt token | 标准输入费率 |
output_tokens | 生成的 token | 标准输出费率 |
cache_creation_input_tokens | 写入 prompt cache 的 token | 1.25× 输入费率 |
cache_read_input_tokens | 从 prompt cache 读取的 token | 0.1× 输入费率 |
web_search_requests | 服务端 web 搜索次数 | 按请求计费 |
从 stream 事件中捕获 usage
Section titled “从 stream 事件中捕获 usage”usage 在 stream 的两个时间点被捕获:
// src/services/api/claude.ts — inside queryModelWithStreamingcase 'message_start': // Initial token count (just the input) usage = event.message.usage; break;
case 'message_delta': // Final token count (after generation) usage = updateUsage(usage, event.usage); // event.usage has output_tokens and possibly updated cache tokens break;QueryEngine 中的 message_stop 事件触发费用累积:
if (message.event.type === 'message_stop') { this.totalUsage = accumulateUsage(this.totalUsage, currentMessageUsage);}美元费用计算
Section titled “美元费用计算”费用使用 src/utils/modelCost.ts 中定义的每模型定价计算:
export type ModelCosts = { inputTokens: number; // per 1M tokens outputTokens: number; // per 1M tokens promptCacheWriteTokens: number; promptCacheReadTokens: number; webSearchRequests: number; // per request};
// Standard Sonnet pricing: $3/$15 per Mtokexport const COST_TIER_3_15 = { inputTokens: 3, outputTokens: 15, promptCacheWriteTokens: 3.75, promptCacheReadTokens: 0.3, webSearchRequests: 0.01,};
// Opus pricing: $15/$75 per Mtokexport const COST_TIER_15_75 = { inputTokens: 15, outputTokens: 75, promptCacheWriteTokens: 18.75, promptCacheReadTokens: 1.5, webSearchRequests: 0.01,};calculateUSDCost 函数计算美元费用:
export function calculateUSDCost(model: string, usage: Usage): number { const costs = getModelCosts(model); if (!costs) { setHasUnknownModelCost(true); return 0; // unknown model → track as $0 but flag it }
return ( (usage.input_tokens * costs.inputTokens) / 1_000_000 + (usage.output_tokens * costs.outputTokens) / 1_000_000 + ((usage.cache_creation_input_tokens ?? 0) * costs.promptCacheWriteTokens) / 1_000_000 + ((usage.cache_read_input_tokens ?? 0) * costs.promptCacheReadTokens) / 1_000_000 + ((usage.server_tool_use?.web_search_requests ?? 0) * costs.webSearchRequests) );}addToTotalSessionCost 函数
Section titled “addToTotalSessionCost 函数”这是 src/cost-tracker.ts 中的中央累积入口:
export function addToTotalSessionCost( cost: number, usage: Usage, model: string,): number { // 1. Accumulate per-model usage const modelUsage = addToTotalModelUsage(cost, usage, model);
// 2. Update global state addToTotalCostState(cost, modelUsage, model);
// 3. Emit to OpenTelemetry getCostCounter()?.add(cost, attrs); getTokenCounter()?.add(usage.input_tokens, { ...attrs, type: 'input' }); getTokenCounter()?.add(usage.output_tokens, { ...attrs, type: 'output' }); getTokenCounter()?.add(usage.cache_read_input_tokens ?? 0, { ...attrs, type: 'cacheRead' }); getTokenCounter()?.add(usage.cache_creation_input_tokens ?? 0, { ...attrs, type: 'cacheCreation' });
// 4. Recursively add advisor model usage let totalCost = cost; for (const advisorUsage of getAdvisorUsage(usage)) { totalCost += addToTotalSessionCost( calculateUSDCost(advisorUsage.model, advisorUsage), advisorUsage, advisorUsage.model, ); } return totalCost;}addToTotalModelUsage 函数维护每模型的累计总量:
function addToTotalModelUsage(cost: number, usage: Usage, model: string): ModelUsage { const modelUsage = getUsageForModel(model) ?? { inputTokens: 0, outputTokens: 0, cacheReadInputTokens: 0, cacheCreationInputTokens: 0, webSearchRequests: 0, costUSD: 0, contextWindow: 0, maxOutputTokens: 0, };
modelUsage.inputTokens += usage.input_tokens; modelUsage.outputTokens += usage.output_tokens; modelUsage.cacheReadInputTokens += usage.cache_read_input_tokens ?? 0; modelUsage.cacheCreationInputTokens += usage.cache_creation_input_tokens ?? 0; modelUsage.webSearchRequests += usage.server_tool_use?.web_search_requests ?? 0; modelUsage.costUSD += cost; modelUsage.contextWindow = getContextWindowForModel(model, getSdkBetas()); modelUsage.maxOutputTokens = getModelMaxOutputTokens(model).default;
return modelUsage;}全局状态(bootstrap/state.ts)
Section titled “全局状态(bootstrap/state.ts)”所有费用计数器以模块级变量形式存储在 src/bootstrap/state.ts 中,为整个进程集中管理状态:
// src/bootstrap/state.ts (conceptual)let totalCostUSD = 0;let totalAPIDuration = 0;let totalToolDuration = 0;let totalLinesAdded = 0;let totalLinesRemoved = 0;let modelUsage: Record<string, ModelUsage> = {};会话预算(maxBudgetUsd)
Section titled “会话预算(maxBudgetUsd)”在 QueryEngine.submitMessage 中每次 yield 消息后执行:
if (maxBudgetUsd !== undefined && getTotalCost() >= maxBudgetUsd) { yield { type: 'result', subtype: 'error_max_budget_usd', is_error: true, errors: [`Reached maximum budget ($${maxBudgetUsd})`], total_cost_usd: getTotalCost(), usage: this.totalUsage, }; return;}任务预算(服务端)
Section titled “任务预算(服务端)”task budget 传递给 API 并由服务端执行:
taskBudget: params.taskBudget && { total: params.taskBudget.total, ...(taskBudgetRemaining !== undefined && { remaining: taskBudgetRemaining, }),},remaining 字段随 token 消耗而递减。耗尽后,API 返回停止原因。
费用状态保存到项目配置中以支持恢复:
export function saveCurrentSessionCosts(fpsMetrics?: FpsMetrics): void { saveCurrentProjectConfig(current => ({ ...current, lastCost: getTotalCostUSD(), lastAPIDuration: getTotalAPIDuration(), lastToolDuration: getTotalToolDuration(), lastDuration: getTotalDuration(), lastLinesAdded: getTotalLinesAdded(), lastLinesRemoved: getTotalLinesRemoved(), lastTotalInputTokens: getTotalInputTokens(), lastTotalOutputTokens: getTotalOutputTokens(), lastTotalCacheCreationInputTokens: getTotalCacheCreationInputTokens(), lastTotalCacheReadInputTokens: getTotalCacheReadInputTokens(), lastTotalWebSearchRequests: getTotalWebSearchRequests(), lastModelUsage: Object.fromEntries( Object.entries(getModelUsage()).map(([model, usage]) => [model, { inputTokens: usage.inputTokens, outputTokens: usage.outputTokens, cacheReadInputTokens: usage.cacheReadInputTokens, cacheCreationInputTokens: usage.cacheCreationInputTokens, webSearchRequests: usage.webSearchRequests, costUSD: usage.costUSD, }]), ), lastSessionId: getSessionId(), }));}恢复会话时,费用会被加载回来:
export function restoreCostStateForSession(sessionId: string): boolean { const data = getStoredSessionCosts(sessionId); if (!data) return false; setCostStateForRestore(data); return true;}只有在 sessionId 匹配时才会恢复,防止跨会话的费用污染。
TUI 中的费用显示
Section titled “TUI 中的费用显示”formatTotalCost() 函数生成退出时显示的费用摘要:
export function formatTotalCost(): string { const costDisplay = formatCost(getTotalCostUSD()) + (hasUnknownModelCost() ? ' (costs may be inaccurate due to usage of unknown models)' : '');
return chalk.dim( `Total cost: ${costDisplay}\n` + `Total duration (API): ${formatDuration(getTotalAPIDuration())}\n` + `Total duration (wall): ${formatDuration(getTotalDuration())}\n` + `Total code changes: ${getTotalLinesAdded()} lines added, ${getTotalLinesRemoved()} lines removed\n` + `${formatModelUsage()}` );}每模型细分使用规范(简短)名称:
function formatModelUsage(): string { const usageByShortName = {}; for (const [model, usage] of Object.entries(getModelUsage())) { const shortName = getCanonicalName(model); // Accumulate by short name (e.g., "sonnet" combines all sonnet variants) usageByShortName[shortName].inputTokens += usage.inputTokens; // ... }
let result = 'Usage by model:'; for (const [shortName, usage] of Object.entries(usageByShortName)) { result += `\n${shortName}: ${formatNumber(usage.inputTokens)} input, ` + `${formatNumber(usage.outputTokens)} output, ` + `${formatNumber(usage.cacheReadInputTokens)} cache read, ` + `${formatNumber(usage.cacheCreationInputTokens)} cache write` + ` ($${usage.costUSD.toFixed(4)})`; } return result;}示例输出:
Total cost: $0.0847Total duration (API): 45.2sTotal duration (wall): 1m 12sTotal code changes: 42 lines added, 7 lines removedUsage by model: sonnet: 125,432 input, 8,291 output, 98,100 cache read, 27,332 cache write ($0.0847)费用格式化规则
Section titled “费用格式化规则”formatCost 函数使用自适应精度:
function formatCost(cost: number, maxDecimalPlaces: number = 4): string { return `$${cost > 0.5 ? round(cost, 100).toFixed(2) // > $0.50 → 2 decimal places : cost.toFixed(maxDecimalPlaces) // ≤ $0.50 → 4 decimal places }`;}$0.0012→ 显示 4 位小数(廉价 query)$3.47→ 显示 2 位小数(昂贵的会话)