Claude Code Cache Bug Breaks Session Resume
What It Is
A recently discovered bug in Claude Code’s session management system has been silently destroying prompt cache efficiency whenever developers resume work. The issue stems from a minified function called db8 in the cli.js file that manages session persistence in ~/.claude/projects/*.jsonl files.
The function was designed to clean up attachment-type messages from saved sessions, but it inadvertently deletes deferred_tools_delta records as well. These records serve a critical purpose: they track which tools have already been announced to the language model during previous interactions. When Claude Code resumes a session and can’t locate these records, it assumes the tools were never introduced and re-announces the entire toolset from scratch.
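The failure mode described above can be sketched in a few lines. This is a toy model, not the code in cli.js: the record shape (a `type` field plus a hypothetical `kind` field), the tool names, and every function name here are assumptions made for illustration.

```python
# Toy session records. The field names "type", "kind", and "tools" are
# hypothetical; the real session schema is not described in this article.
records = [
    {"type": "user", "text": "fix the failing test"},
    {"type": "attachment", "kind": "file", "name": "notes.txt"},
    {"type": "attachment", "kind": "deferred_tools_delta", "tools": ["Read", "Edit"]},
    {"type": "assistant", "text": "Done."},
]


def cleanup_too_broad(records):
    """Drop every attachment-type record, tool announcements included."""
    return [r for r in records if r["type"] != "attachment"]


def announced_tools(records):
    """Collect the tool names the saved records say were already announced."""
    seen = set()
    for r in records:
        if r.get("kind") == "deferred_tools_delta":
            seen.update(r["tools"])
    return seen


def tools_to_announce(all_tools, records):
    """On resume, announce whatever the saved records do not cover."""
    return sorted(set(all_tools) - announced_tools(records))


saved = cleanup_too_broad(records)
print(tools_to_announce(["Read", "Edit"], records))  # intact history
print(tools_to_announce(["Read", "Edit"], saved))    # history after cleanup
```

With the intact history nothing needs re-announcing; after the broad cleanup the resume logic has no evidence of earlier announcements and treats every tool as new.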
This re-announcement shifts the cache breakpoint, invalidating previously cached prompt prefixes. Instead of hitting cached content from earlier in the session, the system treats each resumed session as essentially new, forcing fresh API calls and token processing for content that should have been cached.
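Why a shifted announcement is so costly follows from how prefix caching works: only the leading part of a request that is identical to an earlier request can be reused. The sketch below models a request as a list of opaque blocks; the block labels and the position where the re-announcement lands are assumptions chosen to illustrate the effect, not a trace of real traffic.

```python
def shared_prefix_len(previous, resumed):
    """Count the leading blocks that are identical in both requests."""
    n = 0
    for a, b in zip(previous, resumed):
        if a != b:
            break
        n += 1
    return n


# Request sent before the session was closed (labels are illustrative).
before = ["system", "tools:announce", "user:1", "assistant:1", "user:2"]

# Healthy resume: the same history, followed by one new turn.
healthy = before + ["user:3"]

# Buggy resume: the announcement record was lost, so the announcement is
# rebuilt and ends up at a different position in the request.
buggy = ["system", "user:1", "assistant:1", "user:2", "tools:announce", "user:3"]

print(shared_prefix_len(before, healthy))
print(shared_prefix_len(before, buggy))
```

In this model the healthy resume shares the whole earlier request as its prefix, while the buggy resume diverges right after the first block, so almost nothing cached earlier can be reused.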
Why It Matters
Prompt caching represents one of the most significant cost and performance optimizations available in modern LLM applications. When working properly, it allows subsequent API calls to reuse previously processed prompt segments, dramatically reducing both latency and token consumption.
The impact goes beyond inconvenience. Developers in extended coding sessions reported usage spiking to unsustainable levels, burning through limits that should have lasted hours in a fraction of that time. After applying the fix, reported usage settled at approximately 6% consumption per five-hour session, compared with the much higher rates seen while the bug was active.
For teams relying on Claude Code for daily development work, this bug translated directly into either increased API costs or workflow interruptions when rate limits kicked in. The silent nature of the failure made it particularly insidious - sessions appeared to function normally while quietly consuming far more resources than necessary.
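A rough calculation shows how a lost cache prefix inflates the cost of a single resumed turn. The multipliers below are illustrative assumptions (a cheap rate for reading cached tokens, a surcharge for writing new ones), expressed relative to one uncached input token; check current pricing before relying on them, and treat the token counts as made-up examples.

```python
def relative_input_cost(prefix_tokens, new_tokens, cache_hit,
                        read_mult=0.1, write_mult=1.25):
    """Input cost of one request, in units of one uncached input token.

    read_mult and write_mult are assumed multipliers, not quoted prices.
    """
    if cache_hit:
        # The prefix is read from cache; only the new tokens are written.
        return prefix_tokens * read_mult + new_tokens * write_mult
    # Nothing matches, so the whole request is processed and cached afresh.
    return (prefix_tokens + new_tokens) * write_mult


hit = relative_input_cost(100_000, 2_000, cache_hit=True)
miss = relative_input_cost(100_000, 2_000, cache_hit=False)
print(f"hit: {hit:.0f}  miss: {miss:.0f}  ratio: {miss / hit:.1f}x")
```

Under these assumptions a resumed turn that misses the cache costs roughly ten times as much input as one that hits it, which is consistent with limits draining far faster than expected.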
Getting Started
The fix is available as a patch that preserves tool announcement records during session cleanup. Developers can find it at https://github.com/Rangizingo/cc-cache-fix.
To apply the patch, clone the repository, point Claude Code itself at the clone, and ask it to apply the patch. This method works on Linux, macOS, and Windows.
The patch modifies the db8 function to distinguish between attachment messages that should be removed and tool announcement records that must persist. This ensures cache prefixes remain consistent when sessions resume, allowing the caching mechanism to function as designed.
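The distinction the patch draws can be sketched as a narrower filter. This is an illustration of the idea only, not the patch itself: the record shape, the `kind` field, and the function name are hypothetical.

```python
# Hypothetical record shape; the real session schema may differ.
records = [
    {"type": "user", "text": "fix the failing test"},
    {"type": "attachment", "kind": "file", "name": "notes.txt"},
    {"type": "attachment", "kind": "deferred_tools_delta", "tools": ["Read", "Edit"]},
    {"type": "assistant", "text": "Done."},
]


def cleanup_patched(records):
    """Drop attachment records, but keep tool announcement records."""
    kept = []
    for r in records:
        is_attachment = r["type"] == "attachment"
        is_tool_delta = r.get("kind") == "deferred_tools_delta"
        if is_attachment and not is_tool_delta:
            continue  # ordinary attachment: safe to remove
        kept.append(r)
    return kept


for r in cleanup_patched(records):
    print(r["type"], r.get("kind", ""))
```

The ordinary attachment is still removed, while the announcement record survives, so a resumed session can rebuild the same request prefix it sent before.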
Context
This bug highlights a broader challenge in LLM application development: the complexity of state management across session boundaries. While many developers focus on prompt engineering and model selection, the infrastructure surrounding API calls - caching strategies, session persistence, and state tracking - often determines real-world performance and cost efficiency.
Alternative approaches to session management exist, including storing complete conversation histories or implementing custom caching layers. However, these typically involve trade-offs between storage overhead and cache hit rates. The Claude Code approach of selective message persistence aims to balance these concerns, though this bug revealed an edge case where the cleanup logic was too aggressive.
The minified nature of the affected code (cli.js) also raises questions about debugging and maintenance in production tools. Minification improves load times and reduces distribution size, but it can obscure issues like this one, where function names like db8 provide little semantic meaning about their purpose or side effects.
For developers experiencing unexpectedly high API usage with Claude Code, examining session resume behavior should now be part of the diagnostic checklist. The fix demonstrates that even well-designed caching systems require careful attention to state preservation across the full lifecycle of user interactions.
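As a starting point for that diagnostic step, a small script can report how many lines in each saved session mention `deferred_tools_delta`. It assumes only that session files are line-oriented `.jsonl` files under `~/.claude/projects` and that the record name appears literally in the text; it does not parse the schema. A count of zero is a hint worth investigating, not proof of the bug, since a session may never have had such records.

```python
from pathlib import Path

MARKER = "deferred_tools_delta"


def count_marker_lines(path, marker=MARKER):
    """Count the lines in one session file that contain the marker string."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if marker in line)


def scan_sessions(root):
    """Map each .jsonl file under root to its marker line count."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return {}
    return {str(p): count_marker_lines(p) for p in sorted(root.rglob("*.jsonl"))}


if __name__ == "__main__":
    for path, n in scan_sessions("~/.claude/projects").items():
        print(f"{n:6d}  {path}")
```

Running it before and after resuming a session shows whether the records survive the cleanup on a given installation.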