跳转到内容

Cost Tracking

Claude Code 追踪整个会话中的每一个 token 和每一分钱。本章介绍费用如何从 API 响应流经追踪系统,最终显示在 TUI 状态栏上。

graph TD
    A[API Response] --> B[claude.ts - message_delta event]
    B --> C[usage: input_tokens, output_tokens, cache_*]
    C --> D[addToTotalSessionCost]
    D --> E[addToTotalModelUsage]
    D --> F[addToTotalCostState - global state]
    D --> G[OpenTelemetry counters]
    F --> H[getTotalCostUSD]
    F --> I[getModelUsage]
    H --> J[QueryEngine budget check]
    H --> K[formatTotalCost - TUI display]
    I --> K

每个 API 响应包含一个带有以下字段的 usage 对象:

// From @anthropic-ai/sdk
type BetaUsage = {
input_tokens: number;
output_tokens: number;
cache_creation_input_tokens?: number;
cache_read_input_tokens?: number;
server_tool_use?: {
web_search_requests?: number;
};
};

各字段含义:

字段含义计费方式
input_tokens未 cache 的 prompt token标准输入费率
output_tokens生成的 token标准输出费率
cache_creation_input_tokens写入 prompt cache 的 token1.25× 输入费率
cache_read_input_tokens从 prompt cache 读取的 token0.1× 输入费率
web_search_requests服务端 web 搜索次数按请求计费

usage 在 stream 的两个时间点被捕获:

// src/services/api/claude.ts — inside queryModelWithStreaming
case 'message_start':
// Initial token count (just the input)
usage = event.message.usage;
break;
case 'message_delta':
// Final token count (after generation)
usage = updateUsage(usage, event.usage);
// event.usage has output_tokens and possibly updated cache tokens
break;

QueryEngine 中的 message_stop 事件触发费用累积:

src/QueryEngine.ts
if (message.event.type === 'message_stop') {
this.totalUsage = accumulateUsage(this.totalUsage, currentMessageUsage);
}

费用使用 src/utils/modelCost.ts 中定义的每模型定价计算:

src/utils/modelCost.ts
export type ModelCosts = {
inputTokens: number; // per 1M tokens
outputTokens: number; // per 1M tokens
promptCacheWriteTokens: number;
promptCacheReadTokens: number;
webSearchRequests: number; // per request
};
// Standard Sonnet pricing: $3/$15 per Mtok
export const COST_TIER_3_15 = {
inputTokens: 3,
outputTokens: 15,
promptCacheWriteTokens: 3.75,
promptCacheReadTokens: 0.3,
webSearchRequests: 0.01,
};
// Opus pricing: $15/$75 per Mtok
export const COST_TIER_15_75 = {
inputTokens: 15,
outputTokens: 75,
promptCacheWriteTokens: 18.75,
promptCacheReadTokens: 1.5,
webSearchRequests: 0.01,
};

calculateUSDCost 函数计算美元费用:

export function calculateUSDCost(model: string, usage: Usage): number {
const costs = getModelCosts(model);
if (!costs) {
setHasUnknownModelCost(true);
return 0; // unknown model → track as $0 but flag it
}
return (
(usage.input_tokens * costs.inputTokens) / 1_000_000 +
(usage.output_tokens * costs.outputTokens) / 1_000_000 +
((usage.cache_creation_input_tokens ?? 0) * costs.promptCacheWriteTokens) / 1_000_000 +
((usage.cache_read_input_tokens ?? 0) * costs.promptCacheReadTokens) / 1_000_000 +
((usage.server_tool_use?.web_search_requests ?? 0) * costs.webSearchRequests)
);
}

这是 src/cost-tracker.ts 中的中央累积入口:

export function addToTotalSessionCost(
cost: number,
usage: Usage,
model: string,
): number {
// 1. Accumulate per-model usage
const modelUsage = addToTotalModelUsage(cost, usage, model);
// 2. Update global state
addToTotalCostState(cost, modelUsage, model);
// 3. Emit to OpenTelemetry
getCostCounter()?.add(cost, attrs);
getTokenCounter()?.add(usage.input_tokens, { ...attrs, type: 'input' });
getTokenCounter()?.add(usage.output_tokens, { ...attrs, type: 'output' });
getTokenCounter()?.add(usage.cache_read_input_tokens ?? 0, { ...attrs, type: 'cacheRead' });
getTokenCounter()?.add(usage.cache_creation_input_tokens ?? 0, { ...attrs, type: 'cacheCreation' });
// 4. Recursively add advisor model usage
let totalCost = cost;
for (const advisorUsage of getAdvisorUsage(usage)) {
totalCost += addToTotalSessionCost(
calculateUSDCost(advisorUsage.model, advisorUsage),
advisorUsage,
advisorUsage.model,
);
}
return totalCost;
}

addToTotalModelUsage 函数维护每模型的累计总量:

function addToTotalModelUsage(cost: number, usage: Usage, model: string): ModelUsage {
const modelUsage = getUsageForModel(model) ?? {
inputTokens: 0, outputTokens: 0,
cacheReadInputTokens: 0, cacheCreationInputTokens: 0,
webSearchRequests: 0, costUSD: 0,
contextWindow: 0, maxOutputTokens: 0,
};
modelUsage.inputTokens += usage.input_tokens;
modelUsage.outputTokens += usage.output_tokens;
modelUsage.cacheReadInputTokens += usage.cache_read_input_tokens ?? 0;
modelUsage.cacheCreationInputTokens += usage.cache_creation_input_tokens ?? 0;
modelUsage.webSearchRequests += usage.server_tool_use?.web_search_requests ?? 0;
modelUsage.costUSD += cost;
modelUsage.contextWindow = getContextWindowForModel(model, getSdkBetas());
modelUsage.maxOutputTokens = getModelMaxOutputTokens(model).default;
return modelUsage;
}

所有费用计数器以模块级变量形式存储在 src/bootstrap/state.ts 中,为整个进程集中管理状态:

// src/bootstrap/state.ts (conceptual)
let totalCostUSD = 0;
let totalAPIDuration = 0;
let totalToolDuration = 0;
let totalLinesAdded = 0;
let totalLinesRemoved = 0;
let modelUsage: Record<string, ModelUsage> = {};

QueryEngine.submitMessage 中每次 yield 消息后执行:

src/QueryEngine.ts
if (maxBudgetUsd !== undefined && getTotalCost() >= maxBudgetUsd) {
yield {
type: 'result',
subtype: 'error_max_budget_usd',
is_error: true,
errors: [`Reached maximum budget ($${maxBudgetUsd})`],
total_cost_usd: getTotalCost(),
usage: this.totalUsage,
};
return;
}

task budget 传递给 API 并由服务端执行:

src/query.ts
taskBudget: params.taskBudget && {
total: params.taskBudget.total,
...(taskBudgetRemaining !== undefined && {
remaining: taskBudgetRemaining,
}),
},

remaining 字段随 token 消耗而递减。耗尽后,API 返回停止原因。

费用状态保存到项目配置中以支持恢复:

src/cost-tracker.ts
export function saveCurrentSessionCosts(fpsMetrics?: FpsMetrics): void {
saveCurrentProjectConfig(current => ({
...current,
lastCost: getTotalCostUSD(),
lastAPIDuration: getTotalAPIDuration(),
lastToolDuration: getTotalToolDuration(),
lastDuration: getTotalDuration(),
lastLinesAdded: getTotalLinesAdded(),
lastLinesRemoved: getTotalLinesRemoved(),
lastTotalInputTokens: getTotalInputTokens(),
lastTotalOutputTokens: getTotalOutputTokens(),
lastTotalCacheCreationInputTokens: getTotalCacheCreationInputTokens(),
lastTotalCacheReadInputTokens: getTotalCacheReadInputTokens(),
lastTotalWebSearchRequests: getTotalWebSearchRequests(),
lastModelUsage: Object.fromEntries(
Object.entries(getModelUsage()).map(([model, usage]) => [model, {
inputTokens: usage.inputTokens,
outputTokens: usage.outputTokens,
cacheReadInputTokens: usage.cacheReadInputTokens,
cacheCreationInputTokens: usage.cacheCreationInputTokens,
webSearchRequests: usage.webSearchRequests,
costUSD: usage.costUSD,
}]),
),
lastSessionId: getSessionId(),
}));
}

恢复会话时,费用会被加载回来:

export function restoreCostStateForSession(sessionId: string): boolean {
const data = getStoredSessionCosts(sessionId);
if (!data) return false;
setCostStateForRestore(data);
return true;
}

只有在 sessionId 匹配时才会恢复,防止跨会话的费用污染。

formatTotalCost() 函数生成退出时显示的费用摘要:

export function formatTotalCost(): string {
const costDisplay = formatCost(getTotalCostUSD()) +
(hasUnknownModelCost()
? ' (costs may be inaccurate due to usage of unknown models)'
: '');
return chalk.dim(
`Total cost: ${costDisplay}\n` +
`Total duration (API): ${formatDuration(getTotalAPIDuration())}\n` +
`Total duration (wall): ${formatDuration(getTotalDuration())}\n` +
`Total code changes: ${getTotalLinesAdded()} lines added, ${getTotalLinesRemoved()} lines removed\n` +
`${formatModelUsage()}`
);
}

每模型细分使用规范(简短)名称:

function formatModelUsage(): string {
const usageByShortName = {};
for (const [model, usage] of Object.entries(getModelUsage())) {
const shortName = getCanonicalName(model);
// Accumulate by short name (e.g., "sonnet" combines all sonnet variants)
usageByShortName[shortName].inputTokens += usage.inputTokens;
// ...
}
let result = 'Usage by model:';
for (const [shortName, usage] of Object.entries(usageByShortName)) {
result += `\n${shortName}: ${formatNumber(usage.inputTokens)} input, ` +
`${formatNumber(usage.outputTokens)} output, ` +
`${formatNumber(usage.cacheReadInputTokens)} cache read, ` +
`${formatNumber(usage.cacheCreationInputTokens)} cache write` +
` ($${usage.costUSD.toFixed(4)})`;
}
return result;
}

示例输出:

Total cost: $0.0847
Total duration (API): 45.2s
Total duration (wall): 1m 12s
Total code changes: 42 lines added, 7 lines removed
Usage by model:
sonnet: 125,432 input, 8,291 output, 98,100 cache read, 27,332 cache write ($0.0847)

formatCost 函数使用自适应精度:

function formatCost(cost: number, maxDecimalPlaces: number = 4): string {
return `$${cost > 0.5
? round(cost, 100).toFixed(2) // > $0.50 → 2 decimal places
: cost.toFixed(maxDecimalPlaces) // ≤ $0.50 → 4 decimal places
}`;
}
  • $0.0012 → 显示 4 位小数(廉价 query)
  • $3.47 → 显示 2 位小数(昂贵的会话)