提示快取

提示快取意味著模型提供商可以跨回合重複使用未變更的提示前綴（通常是系統／開發人員指示和其他穩定的上下文），而不是每次都重新處理。第一個相符的請求寫入快取令牌（cacheWrite），稍後相符的請求可以將其讀回（cacheRead）。為什麼這很重要：降低令牌成本、更快的回應和長期執行工作階段的更可預測性能。沒有快取，重複提示在每個回合都要支付完整的提示成本，即使大部分輸入沒有改變。本頁涵蓋影響提示重複使用和令牌成本的所有快取相關控制項。如需 Anthropic 定價詳細資訊，請參閱： https://docs.anthropic.com/docs/build-with-claude/prompt-caching

Primary knobs

`cacheRetention` (model and per-agent)

Set cache retention on model params:

agents:
  defaults:
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "short" # none | short | long

Per-agent override:

agents:
  list:
    - id: "alerts"
      params:
        cacheRetention: "none"

Config merge order:

agents.defaults.models["provider/model"].params
agents.list[].params (matching agent id; overrides by key)

Legacy `cacheControlTtl`

Legacy values are still accepted and mapped:

5m -> short
1h -> long

Prefer cacheRetention for new config.

`contextPruning.mode: "cache-ttl"`

Prunes old tool-result context after cache TTL windows so post-idle requests do not re-cache oversized history.

agents:
  defaults:
    contextPruning:
      mode: "cache-ttl"
      ttl: "1h"

See Session Pruning for full behavior.

Heartbeat keep-warm

Heartbeat can keep cache windows warm and reduce repeated cache writes after idle gaps.

agents:
  defaults:
    heartbeat:
      every: "55m"

Per-agent heartbeat is supported at agents.list[].heartbeat.

Provider behavior

Anthropic (direct API)

cacheRetention is supported.
With Anthropic API-key auth profiles, OpenClaw seeds cacheRetention: "short" for Anthropic model refs when unset.

Amazon Bedrock

Anthropic Claude model refs (amazon-bedrock/*anthropic.claude*) support explicit cacheRetention pass-through.
Non-Anthropic Bedrock models are forced to cacheRetention: "none" at runtime.

OpenRouter Anthropic models

For openrouter/anthropic/* model refs, OpenClaw injects Anthropic cache_control on system/developer prompt blocks to improve prompt-cache reuse.

Other providers

If the provider does not support this cache mode, cacheRetention has no effect.

Tuning patterns

Mixed traffic (recommended default)

Keep a long-lived baseline on your main agent, disable caching on bursty notifier agents:

agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long"
  list:
    - id: "research"
      default: true
      heartbeat:
        every: "55m"
    - id: "alerts"
      params:
        cacheRetention: "none"

Cost-first baseline

Set baseline cacheRetention: "short".
Enable contextPruning.mode: "cache-ttl".
Keep heartbeat below your TTL only for agents that benefit from warm caches.

Cache diagnostics

OpenClaw exposes dedicated cache-trace diagnostics for embedded agent runs.

`diagnostics.cacheTrace` config

diagnostics:
  cacheTrace:
    enabled: true
    filePath: "~/.openclaw/logs/cache-trace.jsonl" # optional
    includeMessages: false # default true
    includePrompt: false # default true
    includeSystem: false # default true

Defaults:

filePath: $OPENCLAW_STATE_DIR/logs/cache-trace.jsonl
includeMessages: true
includePrompt: true
includeSystem: true

Env toggles (one-off debugging)

OPENCLAW_CACHE_TRACE=1 enables cache tracing.
OPENCLAW_CACHE_TRACE_FILE=/path/to/cache-trace.jsonl overrides output path.
OPENCLAW_CACHE_TRACE_MESSAGES=0|1 toggles full message payload capture.
OPENCLAW_CACHE_TRACE_PROMPT=0|1 toggles prompt text capture.
OPENCLAW_CACHE_TRACE_SYSTEM=0|1 toggles system prompt capture.

What to inspect

Cache trace events are JSONL and include staged snapshots like session:loaded, prompt:before, stream:context, and session:after.
Per-turn cache token impact is visible in normal usage surfaces via cacheRead and cacheWrite (for example /usage full and session usage summaries).

Quick troubleshooting

High cacheWrite on most turns: check for volatile system-prompt inputs and verify model/provider supports your cache settings.
No effect from cacheRetention: confirm model key matches agents.defaults.models["provider/model"].
Bedrock Nova/Mistral requests with cache settings: expected runtime force to none.

Related docs:

CLI 指令

RPC 與 API

範本

技術參考

概念內部機制

專案

發布政策

Prompt Caching（提示快取）

提示快取

Primary knobs

`cacheRetention` (model and per-agent)

Legacy `cacheControlTtl`

`contextPruning.mode: "cache-ttl"`

Heartbeat keep-warm

Provider behavior

Anthropic (direct API)

Amazon Bedrock

OpenRouter Anthropic models

Other providers

Tuning patterns

Mixed traffic (recommended default)

Cost-first baseline

Cache diagnostics

`diagnostics.cacheTrace` config

Env toggles (one-off debugging)

What to inspect

Quick troubleshooting

CLI 指令

RPC 與 API

範本

技術參考

概念內部機制

專案

發布政策

​提示快取

​Primary knobs

​cacheRetention (model and per-agent)

​Legacy cacheControlTtl

​contextPruning.mode: "cache-ttl"

​Heartbeat keep-warm

​Provider behavior

​Anthropic (direct API)

​Amazon Bedrock

​OpenRouter Anthropic models

​Other providers

​Tuning patterns

​Mixed traffic (recommended default)

​Cost-first baseline

​Cache diagnostics

​diagnostics.cacheTrace config

​Env toggles (one-off debugging)

​What to inspect

​Quick troubleshooting

提示快取

Primary knobs

`cacheRetention` (model and per-agent)

Legacy `cacheControlTtl`

`contextPruning.mode: "cache-ttl"`

Heartbeat keep-warm

Provider behavior

Anthropic (direct API)

Amazon Bedrock

OpenRouter Anthropic models

Other providers

Tuning patterns

Mixed traffic (recommended default)

Cost-first baseline

Cache diagnostics

`diagnostics.cacheTrace` config

Env toggles (one-off debugging)

What to inspect

Quick troubleshooting