Claude Prompt Caching
Claude prompt caching is represented in v2 as request processing on
provider-native Claude Messages bodies. The current runtime supports
cache_breakpoint rules and channel request shaping for Claude-compatible
traffic.
Cache rules run after protocol transform. A Gemini or OpenAI client routed to a
Claude target can still receive Claude cache markers because the body has
already been transformed to claude_messages.
Cache Breakpoint Rule
Section titled “Cache Breakpoint Rule”A cache_breakpoint rule config looks like:
{ "target": "system", "index": 1, "ttl": "5m"}Supported fields:
| Field | Meaning |
|---|---|
target | top_level, system, tools, or last_message. |
index | Console-facing signed index: >0 means Nth from the start, <0 means Nth from the end, and 0 is invalid. If omitted, the runtime uses the last block. Ignored for top_level. |
ttl | Optional TTL string such as 5m or 1h. |
position | Reserved for compatibility; currently unused. |
The rule only applies when the target operation kind is claude_messages.
Non-Claude targets warn and skip.
Targets
Section titled “Targets”| Target | Where the marker is inserted |
|---|---|
top_level | On the request root (global), enabling Anthropic’s automatic prompt caching. index/position are ignored. |
system | An item in a Claude system array. A string-form system cannot carry block metadata and is skipped. |
tools | An item in the top-level tools array. |
last_message | A content block in the last message’s content array. |
The runtime writes:
{ "cache_control": { "type": "ephemeral" } }If ttl is set, it is added to that object.
Beta Headers
Section titled “Beta Headers”Anthropic’s one-hour cache TTL requires a beta header. In v2, add that with a
header rule attached to the same provider:
{ "name": "anthropic-beta", "value": "extended-cache-ttl-2025-04-11", "mode": "merge"}Use merge so existing client beta tokens are preserved.
Rule Ordering
Section titled “Rule Ordering”The process layer applies:
system_text -> cache_breakpoint -> rewrite -> transform -> headerThat means server-managed system text is inserted before cache breakpoints are placed. Later rewrite or transform rules can still change cached content, so be careful when combining prompt rewriting with prompt caching.
Design Direction
Section titled “Design Direction”The current cache_breakpoint kind is specialized. The planned generic
transform model would express it as a structural locator plus a merge action,
with provider-specific path selection generated by console presets. The backend
should remain permissive: it should apply the configured mutation and let the
upstream enforce provider policy such as breakpoint limits.