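Four-Layer Compression
As a conversation grows, it eventually approaches the model's context window limit. Claude Code implements a multi-layer compression strategy, ranging from lightweight content trimming to full conversation summarization, so that it can keep running without losing critical context.
Compression Layers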
graph TD
A[Full Conversation Context] -->|approaching limit| B{Which layer?}
B -->|"lightweight"| C[Layer 1: Snip<br/>Truncate large tool results]
B -->|"targeted"| D[Layer 2: Microcompact<br/>Cache-aware inline editing]
B -->|"structural"| E[Layer 3: Context Collapse<br/>Drop or summarize old turns]
B -->|"full reset"| F[Layer 4: Auto Compact<br/>Summarize entire conversation]
style C fill:#e8f5e9
style D fill:#fff3e0
style E fill:#fce4ec
style F fill:#e3f2fd
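Each layer is more aggressive than the one before it. The system tries the lighter approaches first and escalates only when necessary.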
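The escalation rule can be illustrated with a toy model. This is only a sketch: the layer names follow the diagram, but the `CompressionLayer` interface, the `compressUntilFits` function, and every reduction figure are invented for illustration and do not come from the Claude Code source.

```typescript
// Toy model: each layer maps a token count to a smaller token count.
// All reduction figures below are made-up assumptions.
interface CompressionLayer {
  name: string
  apply: (tokens: number) => number
}

// Ordered from lightest to most aggressive.
const LAYERS: CompressionLayer[] = [
  { name: 'snip', apply: t => Math.round(t * 0.9) },
  { name: 'microcompact', apply: t => Math.round(t * 0.8) },
  { name: 'context-collapse', apply: t => Math.round(t * 0.6) },
  { name: 'auto-compact', apply: () => 10000 },
]

function compressUntilFits(
  tokens: number,
  budget: number,
  layers: CompressionLayer[] = LAYERS,
): { tokens: number; applied: string[] } {
  const applied: string[] = []
  let current = tokens
  for (const layer of layers) {
    if (current <= budget) break // already fits: do not escalate further
    current = layer.apply(current)
    applied.push(layer.name)
  }
  return { tokens: current, applied }
}
```

With these invented figures, a context that is only slightly over budget stops after the first layer, while a badly oversized one walks further down the list.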
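Layer 1: Snip (tool result truncation)
The lightest form of compression. Tool results that exceed a size limit (file reads, grep output, bash output) are truncated. This happens inline at message-construction time: the oversized result is cut off and a truncation notice is appended.
// Conceptual: tool results are trimmed before being sent to the API
// "Output truncated. Total: 45000 chars. Showing first 30000 chars."
Key insight: most tool output is far larger than the model needs. A 10,000-line grep result typically has only 5-10 relevant matches buried in the noise.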
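A minimal sketch of this truncation step follows. The function name `snipToolResult` and the 30000-character default are assumptions chosen to match the notice quoted above; the actual limit and helper used by Claude Code are not given here.

```typescript
// Hypothetical limit, chosen to match the example notice in the text.
const MAX_TOOL_RESULT_CHARS = 30000

// Returns the output unchanged if it fits, otherwise the first maxChars
// characters followed by a truncation notice.
function snipToolResult(output: string, maxChars: number = MAX_TOOL_RESULT_CHARS): string {
  if (output.length <= maxChars) return output
  const notice = `Output truncated. Total: ${output.length} chars. Showing first ${maxChars} chars.`
  return `${output.slice(0, maxChars)}\n\n${notice}`
}
```

Keeping the notice in the result tells the model that more output exists, so it can issue a narrower query if it needs the rest.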
Layer 2: Cached Microcompact
Microcompact operates at the API cache level: it uses cache edits to delete or shrink content that is already cached, without resending the entire context. The feature is gated by the CACHED_MICROCOMPACT feature flag.
// src/constants/prompts.ts: conditional import
const getCachedMCConfigForFRC = feature('CACHED_MICROCOMPACT')
  ? require('../services/compact/cachedMCConfig.js').getCachedMCConfig
  : null
The cacheDeletionsPending flag in the cache-break detection system tracks when microcompact has sent deletions, so that the resulting drop in cache-read tokens is not misreported as a cache break.
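The role of the flag can be sketched as follows. Only the name `cacheDeletionsPending` comes from the text; the `CacheBreakDetector` shape, the `checkForCacheBreak` function, and the "any drop counts as a break" rule are simplifying assumptions.

```typescript
// Hypothetical detector state; only cacheDeletionsPending is named in the text.
interface CacheBreakDetector {
  lastCacheReadTokens: number
  cacheDeletionsPending: boolean
}

// Returns true when a drop in cache-read tokens looks like an unexpected cache break.
function checkForCacheBreak(state: CacheBreakDetector, cacheReadTokens: number): boolean {
  const dropped = cacheReadTokens < state.lastCacheReadTokens
  const expected = state.cacheDeletionsPending
  state.lastCacheReadTokens = cacheReadTokens
  state.cacheDeletionsPending = false // a pending deletion explains at most one drop
  return dropped && !expected
}
```

In this sketch, a drop that immediately follows a microcompact deletion is treated as expected, while a later unexplained drop is still reported.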
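What microcompact processes:
- FILE_UNCHANGED_STUB: when a file is re-read and its content has not changed, the full content is replaced with a placeholder
- Old tool results that are no longer referenced
- Stale search results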
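The first item can be sketched like this. The name `FILE_UNCHANGED_STUB` appears in the text, but its placeholder string, the `FileReadResult` type, and the `stubUnchangedReads` function are assumptions made for illustration.

```typescript
interface FileReadResult {
  path: string
  content: string
}

// The placeholder wording is an assumption; only the constant's name is from the text.
const FILE_UNCHANGED_STUB = '[file unchanged since last read]'

// Replaces the content of a re-read with a stub when it matches the
// previous read of the same path.
function stubUnchangedReads(reads: FileReadResult[]): FileReadResult[] {
  const lastSeen = new Map<string, string>()
  return reads.map(read => {
    const unchanged = lastSeen.get(read.path) === read.content
    lastSeen.set(read.path, read.content)
    return unchanged ? { ...read, content: FILE_UNCHANGED_STUB } : read
  })
}
```

The first read of each file keeps its full content, so the model can still see the file once; only redundant repeats are collapsed.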
Layer 3: Partial Compact (context collapse)
Partial compaction summarizes one part of the conversation while keeping recent messages verbatim. There are two variants:
“from” direction (default)
Summarizes the oldest messages and keeps the most recent messages verbatim:
graph LR
subgraph "Before partial compact"
A[Old turns 1-50] --> B[Recent turns 51-80]
end
subgraph "After partial compact"
C[Summary of turns 1-50] --> D[Recent turns 51-80<br/>preserved verbatim]
end
const PARTIAL_COMPACT_PROMPT = `Your task is to create a detailed summary of the
RECENT portion of the conversation — the messages that follow earlier retained
context. The earlier messages are being kept intact and do NOT need to be summarized.
Focus your summary on what was discussed, learned, and accomplished in the recent
messages only.`
“up_to” direction
Summarizes everything up to a boundary point; the resulting summary is placed before the retained recent messages:
const PARTIAL_COMPACT_UP_TO_PROMPT = `Your task is to create a detailed summary of this
conversation. This summary will be placed at the start of a continuing session; newer
messages that build on this context will follow after your summary.`
The summary is inserted as a SystemCompactBoundaryMessage, which marks the transition from summarized content to verbatim content.
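The "summarize up to a boundary, keep the rest verbatim" idea can be sketched as below. The `Turn` and `CompactedEntry` types, the `partialCompact` function, and the injected `summarize` callback are all hypothetical; the real SystemCompactBoundaryMessage shape is not shown in the text.

```typescript
interface Turn {
  role: 'user' | 'assistant'
  text: string
}

// Hypothetical stand-in for the boundary message that carries the summary.
type CompactedEntry = Turn | { role: 'system'; kind: 'compact_boundary'; summary: string }

function partialCompact(
  turns: Turn[],
  keepRecent: number,
  summarize: (older: Turn[]) => string, // stands in for the model call
): CompactedEntry[] {
  const boundary = Math.max(0, turns.length - keepRecent)
  if (boundary === 0) return turns // nothing old enough to summarize
  const older = turns.slice(0, boundary)
  const recent = turns.slice(boundary)
  return [{ role: 'system', kind: 'compact_boundary', summary: summarize(older) }, ...recent]
}
```

The retained turns are passed through by reference, so whatever follows the boundary entry is exactly what was in the conversation before compaction.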
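Layer 4: Full Auto Compact
The most aggressive form of compression. When context usage reaches a critical threshold, the entire conversation is summarized into a single structured document. The prompt template in src/services/compact/prompt.ts spells out exactly what the summary must cover: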
// src/services/compact/prompt.ts: BASE_COMPACT_PROMPT (structure)
const BASE_COMPACT_PROMPT = `Your task is to create a detailed summary of the
conversation so far...
Your summary should include the following sections:
1. Primary Request and Intent
2. Key Technical Concepts
3. Files and Code Sections (with full code snippets)
4. Errors and fixes
5. Problem Solving
6. All user messages (non-tool-result)
7. Pending Tasks
8. Current Work
9. Optional Next Step`
Enforcing the no-tool-call rule
The compact prompt strongly discourages the summarizing model from calling tools:
const NO_TOOLS_PREAMBLE = `CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.
- Do NOT use Read, Bash, Grep, Glob, Edit, Write, or ANY other tool.
- You already have all the context you need in the conversation above.
- Tool calls will be REJECTED and will waste your only turn — you will fail the task.`
It then repeats the point at the end:
const NO_TOOLS_TRAILER =
  '\n\nREMINDER: Do NOT call any tools. Respond with plain text only — ' +
  'an <analysis> block followed by a <summary> block.'
Analyze-then-summarize pattern
The compact process uses a two-phase output format:
<analysis>[Detailed chronological analysis — a drafting scratchpad]</analysis>
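<summary>
1. Primary Request and Intent: ...
2. Key Technical Concepts: ...
...
</summary>
The <analysis> block improves summary quality by forcing the model to work through the conversation before summarizing. formatCompactSummary() then strips it before the summary enters the conversation:
export function formatCompactSummary(summary: string): string {
  // Assumed: the excerpt omits this declaration
  let formattedSummary = summary
  // Strip analysis scratchpad
  formattedSummary = formattedSummary.replace(/<analysis>[\s\S]*?<\/analysis>/, '')
  // Extract and format summary section
  const summaryMatch = formattedSummary.match(/<summary>([\s\S]*?)<\/summary>/)
  if (summaryMatch) {
    // Assumed: content is the captured body of the <summary> block
    const content = summaryMatch[1] || ''
    formattedSummary = formattedSummary.replace(
      /<summary>[\s\S]*?<\/summary>/,
      `Summary:\n${content.trim()}`,
    )
  }
  return formattedSummary.trim()
}
Post-compaction message
The compacted summary is injected as a user message, along with a note on where it came from: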
export function getCompactUserSummaryMessage(
  summary: string,
  suppressFollowUpQuestions?: boolean,
  transcriptPath?: string,
): string {
  // Assumed: the excerpt omits how formattedSummary is derived
  const formattedSummary = formatCompactSummary(summary)
  let baseSummary = `This session is being continued from a previous conversation
that ran out of context. The summary below covers the earlier portion.
${formattedSummary}`
  if (transcriptPath) {
    baseSummary += `\n\nIf you need specific details from before compaction,
read the full transcript at: ${transcriptPath}`
  }
  if (suppressFollowUpQuestions) {
    baseSummary += `\nContinue the conversation from where it left off without
asking the user any further questions.`
  }
  // Assumed: the excerpt omits the return statement
  return baseSummary
}
Custom compact instructions
Users can provide custom instructions to steer what the summary focuses on:
export function getCompactPrompt(customInstructions?: string): string {
  let prompt = NO_TOOLS_PREAMBLE + BASE_COMPACT_PROMPT
  // Append user guidance only when it is non-empty
  if (customInstructions && customInstructions.trim() !== '') {
    prompt += `\n\nAdditional Instructions:\n${customInstructions}`
  }
  prompt += NO_TOOLS_TRAILER
  return prompt
}
Examples from the prompt:
- “When summarizing focus on typescript code changes and mistakes”
- “Focus on test output and code changes. Include file reads verbatim.”
Compact lifecycle
sequenceDiagram
participant L as Agentic Loop
participant C as Compact System
participant A as API (Summarizer)
participant S as Session State
L->>L: Check context usage
L->>C: Context threshold exceeded
C->>C: Choose compact strategy<br/>(full vs partial)
C->>C: Build compact prompt<br/>(NO_TOOLS + template)
C->>A: Send conversation + compact prompt<br/>(maxTurns: 1)
A-->>C: <analysis>...</analysis><br/><summary>...</summary>
C->>C: formatCompactSummary()<br/>(strip analysis)
C->>S: Insert CompactBoundaryMessage
C->>S: Notify cache detection system
L->>L: Resume with compressed context
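The sequence above can be condensed into a small sketch. Everything here is hypothetical scaffolding: `Message`, `CompactDeps`, `stripAnalysis`, and `maybeCompact` are illustrative names, the summarizer is an injected synchronous callback (the real call is an API request limited to one turn), and the boundary entry is a plain placeholder message.

```typescript
interface Message {
  role: 'user' | 'assistant' | 'system'
  content: string
}

interface CompactDeps {
  countTokens: (messages: Message[]) => number
  summarize: (messages: Message[], prompt: string) => string // stands in for the single-turn API call
  onCompacted?: () => void // stands in for notifying the cache detection system
}

// Drops the <analysis> scratchpad and unwraps the <summary> block.
function stripAnalysis(raw: string): string {
  const withoutAnalysis = raw.replace(/<analysis>[\s\S]*?<\/analysis>/, '')
  const match = withoutAnalysis.match(/<summary>([\s\S]*?)<\/summary>/)
  return (match?.[1] ?? withoutAnalysis).trim()
}

function maybeCompact(messages: Message[], thresholdTokens: number, deps: CompactDeps): Message[] {
  // Step 1: check context usage; below the threshold nothing happens.
  if (deps.countTokens(messages) <= thresholdTokens) return messages
  // Steps 2-3: build the prompt and request a summary (placeholder prompt text).
  const prompt = 'Respond with TEXT ONLY. Summarize the conversation so far.'
  const summary = stripAnalysis(deps.summarize(messages, prompt))
  // Step 4: notify observers, then resume with the compressed context.
  deps.onCompacted?.()
  return [
    { role: 'system', content: 'compact boundary' },
    { role: 'user', content: `Summary:\n${summary}` },
  ]
}
```

Injecting the summarizer and token counter keeps the control flow testable without a model call.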
Hook integration
Pre-compact and post-compact hooks let external systems observe and respond to compaction events:
import { executePostCompactHooks, executePreCompactHooks } from '../../utils/hooks.js'
This enables integration scenarios such as:
- Logging compaction events for analytics
- Saving pre-compact state for debugging
- Triggering memory extraction before context is lost
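As a rough illustration of the pattern, the sketch below wraps a compaction step with observer callbacks. It does not reproduce executePreCompactHooks or executePostCompactHooks, whose signatures are not shown here; `CompactHookRegistry`, `CompactHook`, and the event shape are hypothetical.

```typescript
// Hypothetical event shape passed to observers.
type CompactHook = (event: { phase: 'pre' | 'post'; messageCount: number }) => void

class CompactHookRegistry {
  private preHooks: CompactHook[] = []
  private postHooks: CompactHook[] = []

  onPreCompact(hook: CompactHook): void {
    this.preHooks.push(hook)
  }

  onPostCompact(hook: CompactHook): void {
    this.postHooks.push(hook)
  }

  // Runs pre hooks, then the compaction itself, then post hooks.
  runCompaction(messages: string[], compact: (m: string[]) => string[]): string[] {
    for (const hook of this.preHooks) hook({ phase: 'pre', messageCount: messages.length })
    const result = compact(messages)
    for (const hook of this.postHooks) hook({ phase: 'post', messageCount: result.length })
    return result
  }
}
```

A pre hook is the natural place to snapshot state or extract memories, since it still sees the uncompressed message list.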