
Pattern: Graceful Degradation

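Systems that make heavy use of APIs will inevitably come under pressure: rate limits, server overload, unstable networks, or exhausted quotas. Graceful Degradation automatically switches from a fast, optimistic mode to a slow, conservative mode when pressure is detected, and switches back once the pressure subsides.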

The key difference from simple retry logic: degradation is modal. The whole system adjusts its behavior, not just the individual request.

stateDiagram-v2
    [*] --> Fast

    Fast --> Degraded: Error rate > threshold
    Fast --> Degraded: Rate limit hit
    Fast --> Degraded: Latency spike

    Degraded --> Cooldown: Consecutive errors
    Degraded --> Fast: Cooldown expires + success

    Cooldown --> Degraded: Cooldown timer expires
    Cooldown --> Cooldown: Still failing

    note right of Fast: Full parallelism\nAggressive prefetch\nOptimistic caching
    note right of Degraded: Sequential execution\nNo prefetch\nConservative timeouts
    note right of Cooldown: Pause new requests\nWait for recovery
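interface FastModeConfig {
  maxConcurrency: 5;     // parallel API calls
  prefetchEnabled: true; // speculatively prefetch data that may be needed
  retryCount: 1;         // one quick retry on transient failures
  retryDelay: 500;       // 500 ms between retries
  timeout: 30_000;       // 30 s timeout per request
  batchSize: 10;         // process 10 items at a time
}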

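In Fast mode the system is optimistic: it issues requests in parallel, prefetches speculatively, and uses short timeouts. This is the default state when everything is healthy.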

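interface DegradedModeConfig {
  maxConcurrency: 1;      // sequential execution only
  prefetchEnabled: false; // do not spend quota on speculation
  retryCount: 3;          // more retries (with backoff)
  retryDelay: 2_000;      // 2 s between retries
  timeout: 60_000;        // 60 s timeout per request (more patient)
  batchSize: 1;           // process one item at a time
}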

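Degraded mode conserves resources: sequential execution, no prefetching, longer timeouts, and more patient retries. The agent is slower but still usable.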
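The controller code in this pattern refers to `ModeConfig`, `FAST_CONFIG`, and `DEGRADED_CONFIG` without showing them. The following is a minimal sketch of what they might look like, assuming both modes share one shape and reuse the values listed above. The `diffConfigs` helper is hypothetical and only illustrates which settings change when the system degrades.

```typescript
// Assumed shared shape: the literal-typed interfaces widened to plain types.
interface ModeConfig {
  maxConcurrency: number;
  prefetchEnabled: boolean;
  retryCount: number;
  retryDelay: number; // milliseconds
  timeout: number;    // milliseconds
  batchSize: number;
}

const FAST_CONFIG: Readonly<ModeConfig> = {
  maxConcurrency: 5,
  prefetchEnabled: true,
  retryCount: 1,
  retryDelay: 500,
  timeout: 30_000,
  batchSize: 10,
};

const DEGRADED_CONFIG: Readonly<ModeConfig> = {
  maxConcurrency: 1,
  prefetchEnabled: false,
  retryCount: 3,
  retryDelay: 2_000,
  timeout: 60_000,
  batchSize: 1,
};

// Hypothetical helper: list every setting that differs between two modes.
function diffConfigs(a: ModeConfig, b: ModeConfig): string[] {
  return (Object.keys(a) as (keyof ModeConfig)[])
    .filter((key) => a[key] !== b[key])
    .map((key) => `${key}: ${a[key]} -> ${b[key]}`);
}

console.log(diffConfigs(FAST_CONFIG, DEGRADED_CONFIG).join('\n'));
```

Keeping both modes in one shape means the rest of the code can read `config.retryCount` or `config.timeout` without caring which mode is active.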

interface CooldownConfig {
  pauseDuration: 30_000;        // pause 30 s before probing
  probeInterval: 10_000;        // health check every 10 s
  requiredSuccesses: 3;         // 3 successful probes needed to recover
  maxCooldownDuration: 300_000; // cooldown lasts at most 5 minutes
}

Cooldown mode pauses new work and periodically probes the API with lightweight requests to detect recovery.
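How the four cooldown settings interact can be easier to see as a pure decision function. This is only a sketch: `CooldownSettings`, `CooldownAction`, and `nextCooldownAction` are hypothetical names, and what happens on `'expire'` is left to the caller (the state diagram suggests falling back to Degraded).

```typescript
// Same fields as CooldownConfig, widened to plain numbers (assumption).
interface CooldownSettings {
  pauseDuration: number;
  probeInterval: number;
  requiredSuccesses: number;
  maxCooldownDuration: number;
}

type CooldownAction = 'wait' | 'probe' | 'recover' | 'expire';

// Decide what to do next, given how long cooldown has lasted, how long ago
// the last probe ran (null if none yet), and the current success streak.
function nextCooldownAction(
  elapsedMs: number,
  msSinceLastProbe: number | null,
  consecutiveSuccesses: number,
  s: CooldownSettings,
): CooldownAction {
  if (consecutiveSuccesses >= s.requiredSuccesses) return 'recover';
  if (elapsedMs >= s.maxCooldownDuration) return 'expire';
  if (elapsedMs < s.pauseDuration) return 'wait'; // initial full pause
  if (msSinceLastProbe !== null && msSinceLastProbe < s.probeInterval) return 'wait';
  return 'probe';
}

const SETTINGS: CooldownSettings = {
  pauseDuration: 30_000,
  probeInterval: 10_000,
  requiredSuccesses: 3,
  maxCooldownDuration: 300_000,
};

console.log(nextCooldownAction(10_000, null, 0, SETTINGS)); // still inside the initial pause
console.log(nextCooldownAction(30_000, null, 0, SETTINGS)); // pause over, no probe sent yet
```

Separating the decision from the timers keeps the probing loop itself small and makes the thresholds testable without waiting in real time.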

class DegradationController {
  private mode: 'fast' | 'degraded' | 'cooldown' = 'fast';
  private errorWindow: number[] = []; // timestamps of recent errors
  private successCount = 0;           // successes since the last error
  private cooldownStart = 0;
  private readonly ERROR_WINDOW_MS = 60_000; // 1-minute sliding window
  private readonly ERROR_THRESHOLD = 3;      // 3 errors in the window → degraded
  private readonly COOLDOWN_THRESHOLD = 5;   // 5 errors in the window → cooldown
  private readonly RECOVERY_SUCCESSES = 3;   // 3 successes in a row → recover

  recordSuccess() {
    this.successCount++;
    if (this.mode === 'cooldown' && this.successCount >= this.RECOVERY_SUCCESSES) {
      this.transitionTo('fast');
    } else if (this.mode === 'degraded') {
      // In degraded mode, require a longer run of successes before recovering
      if (this.successCount >= this.RECOVERY_SUCCESSES * 2) {
        this.transitionTo('fast');
      }
    }
  }

  recordError(error: Error) {
    const now = Date.now();
    this.successCount = 0;
    // Add to the sliding window and drop entries older than the window
    this.errorWindow.push(now);
    this.errorWindow = this.errorWindow.filter(t => now - t < this.ERROR_WINDOW_MS);
    if (this.mode === 'fast' && this.errorWindow.length >= this.ERROR_THRESHOLD) {
      this.transitionTo('degraded');
    } else if (this.mode === 'degraded' && this.errorWindow.length >= this.COOLDOWN_THRESHOLD) {
      this.transitionTo('cooldown');
    }
  }

  private transitionTo(newMode: 'fast' | 'degraded' | 'cooldown') {
    const oldMode = this.mode;
    this.mode = newMode;
    if (newMode === 'cooldown') {
      this.cooldownStart = Date.now();
    }
    if (newMode === 'fast') {
      this.errorWindow = [];
      this.successCount = 0;
    }
    console.log(`[Degradation] ${oldMode} → ${newMode}`);
  }

  getMode() { return this.mode; }

  // ModeConfig and the *_CONFIG constants are assumed to be defined elsewhere,
  // holding the values from the config interfaces above.
  getConfig(): ModeConfig {
    switch (this.mode) {
      case 'fast': return FAST_CONFIG;
      case 'degraded': return DEGRADED_CONFIG;
      case 'cooldown': return COOLDOWN_CONFIG;
    }
  }
}
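To see the thresholds in action without waiting for real time to pass, the same sliding-window idea can be replayed against an injectable clock. This is a simplified, self-contained sketch, not the controller above: `WindowedModeTracker` is a hypothetical name, and it uses a single recovery threshold for both degraded and cooldown.

```typescript
type Mode = 'fast' | 'degraded' | 'cooldown';

class WindowedModeTracker {
  private mode: Mode = 'fast';
  private errors: number[] = []; // timestamps inside the sliding window
  private successes = 0;         // consecutive successes
  private readonly now: () => number;
  private readonly windowMs = 60_000;
  private readonly degradeAt = 3;
  private readonly cooldownAt = 5;
  private readonly recoverAfter = 3;

  constructor(now: () => number) {
    this.now = now; // injected clock, so tests control time
  }

  error(): Mode {
    const t = this.now();
    this.successes = 0;
    this.errors = [...this.errors, t].filter((e) => t - e < this.windowMs);
    if (this.mode === 'fast' && this.errors.length >= this.degradeAt) {
      this.mode = 'degraded';
    } else if (this.mode === 'degraded' && this.errors.length >= this.cooldownAt) {
      this.mode = 'cooldown';
    }
    return this.mode;
  }

  success(): Mode {
    this.successes++;
    if (this.mode !== 'fast' && this.successes >= this.recoverAfter) {
      this.mode = 'fast';
      this.errors = [];
      this.successes = 0;
    }
    return this.mode;
  }
}

// Replay five failures followed by three successes, one event per second.
let clock = 0;
const tracker = new WindowedModeTracker(() => clock);
const trace: Mode[] = [];
for (const ok of [false, false, false, false, false, true, true, true]) {
  clock += 1_000;
  trace.push(ok ? tracker.success() : tracker.error());
}
console.log(trace.join(' '));
// fast fast degraded degraded cooldown cooldown cooldown fast
```

Errors that fall outside the 60-second window stop counting, so a slow trickle of failures never trips the mode change.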

Not every error deserves the same retry behavior:

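interface RetryStrategy {
  shouldRetry: boolean;
  delay: number;        // milliseconds
  degradeMode: boolean; // whether this error should push the system toward degraded mode
}

// Assumed error shape; adapt it to your client's actual error type.
interface APIError extends Error {
  status: number;
  headers: Record<string, string | undefined>;
}

function classifyError(error: APIError): RetryStrategy {
  switch (error.status) {
    // Transient errors: retry quickly
    case 500: // Internal Server Error
    case 502: // Bad Gateway
    case 503: // Service Unavailable
      return { shouldRetry: true, delay: 1_000, degradeMode: false };
    // Rate limited: retry with backoff and degrade
    case 429: {
      // Retry-After may be missing or an HTTP date; fall back to 30 s
      // whenever it is not a plain number of seconds.
      const seconds = Number.parseInt(error.headers['retry-after'] ?? '', 10);
      const retryAfter = Number.isNaN(seconds) ? 30_000 : seconds * 1000;
      return { shouldRetry: true, delay: retryAfter, degradeMode: true };
    }
    // Overloaded: long backoff, must degrade
    case 529:
      return { shouldRetry: true, delay: 60_000, degradeMode: true };
    // Client errors: do not retry
    case 400: // Bad Request
    case 401: // Unauthorized
    case 403: // Forbidden
      return { shouldRetry: false, delay: 0, degradeMode: false };
    // Unknown: retry conservatively
    default:
      return { shouldRetry: true, delay: 5_000, degradeMode: false };
  }
}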
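The `Retry-After` header is allowed to carry either a number of seconds or an HTTP date, so a plain `parseInt` covers only one of the two forms. A possible helper that accepts both is sketched below; `parseRetryAfterMs` and its 30-second fallback are assumptions for illustration, and the date branch relies on `Date.parse` accepting the usual HTTP date format.

```typescript
// Convert a Retry-After header value into a delay in milliseconds.
// Accepts delay-seconds ("120") or an HTTP date ("Wed, 21 Oct 2015 07:28:00 GMT").
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number,
  fallbackMs = 30_000,
): number {
  if (value === undefined) return fallbackMs;
  const trimmed = value.trim();

  // Form 1: a non-negative integer number of seconds
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: an absolute date; wait until then, never a negative time
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs);
}

const now = Date.UTC(2015, 9, 21, 7, 27, 30); // 2015-10-21 07:27:30 UTC
console.log(parseRetryAfterMs('120', now));                           // 120000
console.log(parseRetryAfterMs('Wed, 21 Oct 2015 07:28:00 GMT', now)); // 30000
console.log(parseRetryAfterMs(undefined, now));                       // 30000
```

Passing the current time in as a parameter keeps the function deterministic, which makes the date branch easy to test.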
function calculateBackoff(attempt: number, baseDelay: number): number {
  // Exponential: baseDelay × 1, 2, 4, 8, 16...
  const exponential = baseDelay * Math.pow(2, attempt);
  // Cap the pre-jitter delay at 60 seconds
  const capped = Math.min(exponential, 60_000);
  // Add jitter (±25%) to avoid the thundering herd effect
  const jitter = capped * (0.75 + Math.random() * 0.5);
  return Math.floor(jitter);
}
// Example progression (baseDelay = 1000 ms):
// attempt 0: 1000 ms (±250 ms jitter)
// attempt 1: 2000 ms (±500 ms jitter)
// attempt 2: 4000 ms (±1000 ms jitter)
// attempt 3: 8000 ms (±2000 ms jitter)
// attempt 4: 16000 ms (±4000 ms jitter)
// attempt 5: 32000 ms (±8000 ms jitter)
// attempt 6 and above: 60000 ms (cap, ±15000 ms jitter)
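Because the jitter is random, the progression above is easiest to verify by injecting the random source. The sketch below restates the same formula with an optional `random` parameter (an addition for testability, not part of the original function) and derives the lowest and highest delay for each attempt.

```typescript
// Same formula as calculateBackoff, with the random source injectable.
function backoffWithJitter(
  attempt: number,
  baseDelay: number,
  random: () => number = Math.random,
  capMs = 60_000,
): number {
  const capped = Math.min(baseDelay * Math.pow(2, attempt), capMs);
  return Math.floor(capped * (0.75 + random() * 0.5));
}

// Lower and upper bound of the delay for a given attempt.
// Math.random() never returns exactly 1, so the upper bound is a limit.
function backoffRange(attempt: number, baseDelay: number): [number, number] {
  return [
    backoffWithJitter(attempt, baseDelay, () => 0),
    backoffWithJitter(attempt, baseDelay, () => 1),
  ];
}

for (let attempt = 0; attempt <= 6; attempt++) {
  const [low, high] = backoffRange(attempt, 1_000);
  console.log(`attempt ${attempt}: ${low}..${high} ms`);
}
// attempt 0: 750..1250 ms
// attempt 6: 45000..75000 ms
```

Note that the cap is applied before the jitter, so a capped delay can still reach 75 seconds. If 60 seconds must be a hard ceiling, apply `Math.min` again after the jitter.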
sequenceDiagram
    participant Agent
    participant Controller
    participant API

    Note over Agent,API: Fast Mode
    Agent->>API: Request 1 ✅
    Agent->>API: Request 2 ✅
    Agent->>API: Request 3 ❌ 429
    Agent->>Controller: recordError()
    Agent->>API: Request 4 ❌ 429
    Agent->>Controller: recordError()
    Agent->>API: Request 5 ❌ 529
    Agent->>Controller: recordError()
    Controller-->>Agent: Mode → Degraded

    Note over Agent,API: Degraded Mode (Sequential)
    Agent->>API: Request 6 ❌ 529
    Agent->>API: Request 7 ❌ 529
    Controller-->>Agent: Mode → Cooldown

    Note over Agent,API: Cooldown (30s pause)
    Note over Agent: Waiting...
    Agent->>API: Health probe ❌
    Note over Agent: Wait 10s...
    Agent->>API: Health probe ✅
    Agent->>API: Health probe ✅
    Agent->>API: Health probe ✅
    Controller-->>Agent: Mode → Fast

    Note over Agent,API: Fast Mode (Recovered)
    Agent->>API: Request 8 ✅
// delay(ms) is assumed to be a promise-based sleep helper.
async function cooldownProbe(
  apiClient: APIClient,
  controller: DegradationController,
  config: CooldownConfig,
): Promise<void> {
  const start = Date.now();
  let consecutiveSuccesses = 0;
  // Pause completely before the first probe
  await delay(config.pauseDuration);
  while (
    controller.getMode() === 'cooldown' &&
    Date.now() - start < config.maxCooldownDuration
  ) {
    try {
      // Lightweight probe: keep token usage minimal
      await apiClient.complete({
        messages: [{ role: 'user', content: 'ping' }],
        maxTokens: 1,
      });
      consecutiveSuccesses++;
      controller.recordSuccess();
      if (consecutiveSuccesses >= config.requiredSuccesses) {
        return; // the controller will have switched to fast mode
      }
    } catch (error) {
      consecutiveSuccesses = 0;
      controller.recordError(error as Error);
    }
    // Wait before the next health check
    await delay(config.probeInterval);
  }
}
// ============================================
// Reusable Graceful Degradation wrapper
// ============================================
interface DegradableClient<T> {
  execute(request: T): Promise<unknown>;
  getMode(): 'fast' | 'degraded' | 'cooldown';
  getStats(): DegradationStats;
}

interface DegradationStats {
  mode: string;
  totalRequests: number;
  totalErrors: number;
  modeTransitions: number;
  averageLatency: number;
}

function withGracefulDegradation<T>(
  client: { execute: (req: T) => Promise<unknown> },
  options?: Partial<DegradationOptions>,
): DegradableClient<T> {
  const controller = new DegradationController();
  // Note: nothing in this sketch increments modeTransitions.
  const stats = { totalRequests: 0, totalErrors: 0, modeTransitions: 0, latencies: [] as number[] };
  return {
    async execute(request: T) {
      // Honor cooldown: block until probing finishes.
      // cooldownProbe calls complete(), so this cast assumes the client exposes it.
      if (controller.getMode() === 'cooldown') {
        await cooldownProbe(client as any, controller, COOLDOWN_CONFIG);
      }
      // Pick the config after cooldown so it reflects the current mode.
      // If cooldown expired without recovery, stay on the degraded settings.
      const config = controller.getMode() === 'fast' ? FAST_CONFIG : DEGRADED_CONFIG;
      let lastError: Error | null = null;
      for (let attempt = 0; attempt <= config.retryCount; attempt++) {
        if (attempt > 0) {
          // attempt - 1 so the first retry waits roughly retryDelay
          await delay(calculateBackoff(attempt - 1, config.retryDelay));
        }
        const start = performance.now();
        stats.totalRequests++;
        try {
          // timeout(ms) is assumed to return a promise that rejects after ms
          const result = await Promise.race([
            client.execute(request),
            timeout(config.timeout),
          ]);
          const latency = performance.now() - start;
          stats.latencies.push(latency);
          controller.recordSuccess();
          return result;
        } catch (error) {
          lastError = error as Error;
          stats.totalErrors++;
          const strategy = classifyError(error as APIError);
          controller.recordError(error as Error);
          if (!strategy.shouldRetry) throw error;
        }
      }
      throw lastError;
    },
    getMode() { return controller.getMode(); },
    getStats() {
      const count = stats.latencies.length;
      return {
        mode: controller.getMode(),
        totalRequests: stats.totalRequests,
        totalErrors: stats.totalErrors,
        modeTransitions: stats.modeTransitions,
        // Avoid dividing by zero before the first successful request
        averageLatency: count > 0 ? stats.latencies.reduce((a, b) => a + b, 0) / count : 0,
      };
    },
  };
}
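The wrapper relies on `delay` and `timeout` helpers that are not shown. One possible shape is sketched below. `withTimeout` and `TimeoutError` are hypothetical names; unlike a bare `Promise.race` against a timer, this variant also clears the timer once the work settles, so no stray timers pile up under load.

```typescript
// Promise-based sleep.
function delay(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Timed out after ${ms} ms`);
    this.name = 'TimeoutError';
  }
}

// Reject with TimeoutError if `work` has not settled within `ms`.
// The underlying work is not cancelled; only the wait is abandoned.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Usage: a 5 ms task finishes well inside a 200 ms budget.
withTimeout(delay(5).then(() => 'ok'), 200).then((value) => console.log(value));
```

If the client supports cancellation (for example through an `AbortSignal`), wiring the timeout to it would also free the connection rather than just abandoning the result.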
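| Dimension | Short-term (transient) | Long-term (sustained) |
| --- | --- | --- |
| Trigger | 1-3 errors within 1 minute | 5 or more errors within 5 minutes |
| Action | Retry with backoff | Switch to degraded mode |
| Recovery | Automatic on the next success | Requires N consecutive successes |
| Duration | Seconds | Minutes to hours |
| Impact | Barely noticeable to users | Users see a slower but usable system |
| Examples | Network blip, 502 | Rate limit exhausted, service outage |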
async function* agentLoopWithDegradation(
  messages: Message[],
  tools: Tool[],
): AsyncGenerator<AgentEvent> {
  const apiClient = withGracefulDegradation(rawApiClient);
  while (true) {
    const mode = apiClient.getMode();
    // Adjust behavior to the current mode
    if (mode === 'degraded') {
      yield { type: 'status', message: '⚠️ Operating in degraded mode (slower but functional)' };
    }
    if (mode === 'cooldown') {
      yield { type: 'status', message: '⏸️ API cooling down, will resume shortly...' };
    }
    try {
      // AgentResponse is an assumed type that carries a toolCalls array
      const response = (await apiClient.execute({
        system: systemPrompt,
        messages,
        tools: mode === 'fast' ? tools : essentialToolsOnly(tools),
      })) as AgentResponse;
      yield { type: 'response', data: response };
      // Assumed stop condition: no tool calls means the turn is finished
      if (response.toolCalls.length === 0) return;
      if (mode !== 'fast') {
        // Outside fast mode, run tools one at a time
        for (const call of response.toolCalls) {
          const result = await executeTool(call);
          yield { type: 'tool_result', data: result };
        }
      } else {
        // Fast mode: run tools in parallel
        const results = await Promise.all(
          response.toolCalls.map(call => executeTool(call))
        );
        for (const result of results) {
          yield { type: 'tool_result', data: result };
        }
      }
    } catch (error) {
      yield { type: 'error', data: error };
      if (isUnrecoverable(error)) return;
    }
  }
}
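The loop above switches between strictly sequential execution and an unbounded `Promise.all`, while the mode configs describe a `maxConcurrency` of 5 and 1. A bounded worker pool is one way to honor that setting with a single code path. This is a sketch under that assumption, and `mapWithConcurrency` is a hypothetical helper.

```typescript
// Run fn over items with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker repeatedly claims the next index until none are left.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage: limit 5 behaves like fast mode, limit 1 like degraded mode.
const double = async (n: number): Promise<number> => n * 2;
mapWithConcurrency([1, 2, 3, 4], 2, double).then((out) => console.log(out.join(',')));
```

With this helper the tool-execution step could call `mapWithConcurrency(response.toolCalls, config.maxConcurrency, executeTool)` in every mode. As with `Promise.all`, the first rejection propagates to the caller, although other workers keep running until they finish their current item.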

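LLM API clients

Any application that calls OpenAI, Anthropic, or another LLM API with rate limits and occasional outages.

Microservice systems

Service-mesh degradation when a downstream dependency slows down or fails.

Real-time data pipelines

Stream-processing systems that must handle backpressure from slow sinks.

Mobile apps

Apps that must keep working on unreliable networks, adapting by automatically reducing data usage and feature richness.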

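  1. Modes, not just retries: degradation changes the system's overall behavior, not just the retry count of a single request
  2. Automatic transitions: mode changes are driven by measured error rates, not manual intervention
  3. Hysteresis: recovery requires sustained success (3 or more consecutive successes), which keeps the mode from flapping back and forth
  4. Transparency: the system tells users when it is degraded ("Operating in degraded mode")
  5. Continuity: even in the worst case (cooldown), the system eventually recovers instead of failing permanently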