Cost Tracking
此内容尚不支持你的语言。
Claude Code tracks every token and every dollar across the entire session. This chapter covers how costs flow from API responses through the tracking system to the TUI status line.
Architecture Overview
Section titled “Architecture Overview”graph TD A[API Response] --> B[claude.ts - message_delta event] B --> C[usage: input_tokens, output_tokens, cache_*] C --> D[addToTotalSessionCost] D --> E[addToTotalModelUsage] D --> F[addToTotalCostState - global state] D --> G[OpenTelemetry counters] F --> H[getTotalCostUSD] F --> I[getModelUsage] H --> J[QueryEngine budget check] H --> K[formatTotalCost - TUI display] I --> KPer-Turn Token Counting
Section titled “Per-Turn Token Counting”Token Fields from the API
Section titled “Token Fields from the API”Each API response includes a usage object with these fields:
// From @anthropic-ai/sdktype BetaUsage = { input_tokens: number; output_tokens: number; cache_creation_input_tokens?: number; cache_read_input_tokens?: number; server_tool_use?: { web_search_requests?: number; };};The fields map to:
| Field | Meaning | Billing |
|---|---|---|
input_tokens | Non-cached prompt tokens | Standard input rate |
output_tokens | Generated tokens | Standard output rate |
cache_creation_input_tokens | Tokens written to prompt cache | 1.25× input rate |
cache_read_input_tokens | Tokens read from prompt cache | 0.1× input rate |
web_search_requests | Server-side web searches | Per-request fee |
Capturing Usage from Stream Events
Section titled “Capturing Usage from Stream Events”Usage is captured at two points in the stream:
// src/services/api/claude.ts — inside queryModelWithStreamingcase 'message_start': // Initial token count (just the input) usage = event.message.usage; break;
case 'message_delta': // Final token count (after generation) usage = updateUsage(usage, event.usage); // event.usage has output_tokens and possibly updated cache tokens break;The message_stop event in QueryEngine triggers cost accumulation:
if (message.event.type === 'message_stop') { this.totalUsage = accumulateUsage(this.totalUsage, currentMessageUsage);}Dollar Cost Calculation
Section titled “Dollar Cost Calculation”Model Pricing Tables
Section titled “Model Pricing Tables”Costs are calculated using per-model pricing defined in src/utils/modelCost.ts:
export type ModelCosts = { inputTokens: number; // per 1M tokens outputTokens: number; // per 1M tokens promptCacheWriteTokens: number; promptCacheReadTokens: number; webSearchRequests: number; // per request};
// Standard Sonnet pricing: $3/$15 per Mtokexport const COST_TIER_3_15 = { inputTokens: 3, outputTokens: 15, promptCacheWriteTokens: 3.75, promptCacheReadTokens: 0.3, webSearchRequests: 0.01,};
// Opus pricing: $15/$75 per Mtokexport const COST_TIER_15_75 = { inputTokens: 15, outputTokens: 75, promptCacheWriteTokens: 18.75, promptCacheReadTokens: 1.5, webSearchRequests: 0.01,};The Cost Formula
Section titled “The Cost Formula”The calculateUSDCost function computes the dollar cost:
export function calculateUSDCost(model: string, usage: Usage): number { const costs = getModelCosts(model); if (!costs) { setHasUnknownModelCost(true); return 0; // unknown model → track as $0 but flag it }
return ( (usage.input_tokens * costs.inputTokens) / 1_000_000 + (usage.output_tokens * costs.outputTokens) / 1_000_000 + ((usage.cache_creation_input_tokens ?? 0) * costs.promptCacheWriteTokens) / 1_000_000 + ((usage.cache_read_input_tokens ?? 0) * costs.promptCacheReadTokens) / 1_000_000 + ((usage.server_tool_use?.web_search_requests ?? 0) * costs.webSearchRequests) );}Session-Level Accumulation
Section titled “Session-Level Accumulation”The addToTotalSessionCost Function
Section titled “The addToTotalSessionCost Function”This is the central accumulation point in src/cost-tracker.ts:
export function addToTotalSessionCost( cost: number, usage: Usage, model: string,): number { // 1. Accumulate per-model usage const modelUsage = addToTotalModelUsage(cost, usage, model);
// 2. Update global state addToTotalCostState(cost, modelUsage, model);
// 3. Emit to OpenTelemetry getCostCounter()?.add(cost, attrs); getTokenCounter()?.add(usage.input_tokens, { ...attrs, type: 'input' }); getTokenCounter()?.add(usage.output_tokens, { ...attrs, type: 'output' }); getTokenCounter()?.add(usage.cache_read_input_tokens ?? 0, { ...attrs, type: 'cacheRead' }); getTokenCounter()?.add(usage.cache_creation_input_tokens ?? 0, { ...attrs, type: 'cacheCreation' });
// 4. Recursively add advisor model usage let totalCost = cost; for (const advisorUsage of getAdvisorUsage(usage)) { totalCost += addToTotalSessionCost( calculateUSDCost(advisorUsage.model, advisorUsage), advisorUsage, advisorUsage.model, ); } return totalCost;}The addToTotalModelUsage function maintains per-model running totals:
function addToTotalModelUsage(cost: number, usage: Usage, model: string): ModelUsage { const modelUsage = getUsageForModel(model) ?? { inputTokens: 0, outputTokens: 0, cacheReadInputTokens: 0, cacheCreationInputTokens: 0, webSearchRequests: 0, costUSD: 0, contextWindow: 0, maxOutputTokens: 0, };
modelUsage.inputTokens += usage.input_tokens; modelUsage.outputTokens += usage.output_tokens; modelUsage.cacheReadInputTokens += usage.cache_read_input_tokens ?? 0; modelUsage.cacheCreationInputTokens += usage.cache_creation_input_tokens ?? 0; modelUsage.webSearchRequests += usage.server_tool_use?.web_search_requests ?? 0; modelUsage.costUSD += cost; modelUsage.contextWindow = getContextWindowForModel(model, getSdkBetas()); modelUsage.maxOutputTokens = getModelMaxOutputTokens(model).default;
return modelUsage;}Global State (bootstrap/state.ts)
Section titled “Global State (bootstrap/state.ts)”All cost counters live in src/bootstrap/state.ts as module-level variables. This centralizes state for the entire process:
// src/bootstrap/state.ts (conceptual)let totalCostUSD = 0;let totalAPIDuration = 0;let totalToolDuration = 0;let totalLinesAdded = 0;let totalLinesRemoved = 0;let modelUsage: Record<string, ModelUsage> = {};Budget Enforcement
Section titled “Budget Enforcement”Session Budget (maxBudgetUsd)
Section titled “Session Budget (maxBudgetUsd)”Enforced in QueryEngine.submitMessage after each yielded message:
if (maxBudgetUsd !== undefined && getTotalCost() >= maxBudgetUsd) { yield { type: 'result', subtype: 'error_max_budget_usd', is_error: true, errors: [`Reached maximum budget ($${maxBudgetUsd})`], total_cost_usd: getTotalCost(), usage: this.totalUsage, }; return;}Task Budget (Server-Side)
Section titled “Task Budget (Server-Side)”The task budget is passed to the API and enforced server-side:
taskBudget: params.taskBudget && { total: params.taskBudget.total, ...(taskBudgetRemaining !== undefined && { remaining: taskBudgetRemaining, }),},The remaining field is decremented as tokens are consumed. When exhausted, the API returns a stop reason.
Session Persistence
Section titled “Session Persistence”Cost state is saved to project config for resume support:
export function saveCurrentSessionCosts(fpsMetrics?: FpsMetrics): void { saveCurrentProjectConfig(current => ({ ...current, lastCost: getTotalCostUSD(), lastAPIDuration: getTotalAPIDuration(), lastToolDuration: getTotalToolDuration(), lastDuration: getTotalDuration(), lastLinesAdded: getTotalLinesAdded(), lastLinesRemoved: getTotalLinesRemoved(), lastTotalInputTokens: getTotalInputTokens(), lastTotalOutputTokens: getTotalOutputTokens(), lastTotalCacheCreationInputTokens: getTotalCacheCreationInputTokens(), lastTotalCacheReadInputTokens: getTotalCacheReadInputTokens(), lastTotalWebSearchRequests: getTotalWebSearchRequests(), lastModelUsage: Object.fromEntries( Object.entries(getModelUsage()).map(([model, usage]) => [model, { inputTokens: usage.inputTokens, outputTokens: usage.outputTokens, cacheReadInputTokens: usage.cacheReadInputTokens, cacheCreationInputTokens: usage.cacheCreationInputTokens, webSearchRequests: usage.webSearchRequests, costUSD: usage.costUSD, }]), ), lastSessionId: getSessionId(), }));}When restoring a session, the costs are loaded back:
export function restoreCostStateForSession(sessionId: string): boolean { const data = getStoredSessionCosts(sessionId); if (!data) return false; setCostStateForRestore(data); return true;}This only restores if the sessionId matches — preventing cross-session cost contamination.
Cost Display in the TUI
Section titled “Cost Display in the TUI”The formatTotalCost() function produces the cost summary shown on exit:
export function formatTotalCost(): string { const costDisplay = formatCost(getTotalCostUSD()) + (hasUnknownModelCost() ? ' (costs may be inaccurate due to usage of unknown models)' : '');
return chalk.dim( `Total cost: ${costDisplay}\n` + `Total duration (API): ${formatDuration(getTotalAPIDuration())}\n` + `Total duration (wall): ${formatDuration(getTotalDuration())}\n` + `Total code changes: ${getTotalLinesAdded()} lines added, ${getTotalLinesRemoved()} lines removed\n` + `${formatModelUsage()}` );}Per-model breakdown uses canonical (short) names:
function formatModelUsage(): string { const usageByShortName = {}; for (const [model, usage] of Object.entries(getModelUsage())) { const shortName = getCanonicalName(model); // Accumulate by short name (e.g., "sonnet" combines all sonnet variants) usageByShortName[shortName].inputTokens += usage.inputTokens; // ... }
let result = 'Usage by model:'; for (const [shortName, usage] of Object.entries(usageByShortName)) { result += `\n${shortName}: ${formatNumber(usage.inputTokens)} input, ` + `${formatNumber(usage.outputTokens)} output, ` + `${formatNumber(usage.cacheReadInputTokens)} cache read, ` + `${formatNumber(usage.cacheCreationInputTokens)} cache write` + ` ($${usage.costUSD.toFixed(4)})`; } return result;}Example output:
Total cost: $0.0847Total duration (API): 45.2sTotal duration (wall): 1m 12sTotal code changes: 42 lines added, 7 lines removedUsage by model: sonnet: 125,432 input, 8,291 output, 98,100 cache read, 27,332 cache write ($0.0847)Cost Formatting Rules
Section titled “Cost Formatting Rules”The formatCost function uses adaptive precision:
function formatCost(cost: number, maxDecimalPlaces: number = 4): string { return `$${cost > 0.5 ? round(cost, 100).toFixed(2) // > $0.50 → 2 decimal places : cost.toFixed(maxDecimalPlaces) // ≤ $0.50 → 4 decimal places }`;}$0.0012→ shows 4 decimal places (cheap queries)$3.47→ shows 2 decimal places (expensive sessions)