Cut Claude API Costs 94% With HTML Comment Tiers
Someone built a tiering system that cut their Claude API bills by 94% using simple HTML comments.
The trick: tag each doc with an HTML comment marker, <!-- @cortex-tms-tier HOT -->, or the WARM or COLD equivalent. By default Claude reads only HOT files (active tasks) and skips the old changelogs and archived material that would otherwise bloat the context window.
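To make the idea concrete, here is a minimal Python sketch of tier-based filtering. It is not the tool's actual implementation: only the marker format comes from the post, while the function names, the example file names, and the choice to treat untagged files as WARM are all assumptions made for illustration.

```python
import re

# Marker format as quoted in the post; everything else here is a
# hypothetical illustration, not cortex-tms's real code.
TIER_PATTERN = re.compile(r"<!--\s*@cortex-tms-tier\s+(HOT|WARM|COLD)\s*-->")


def read_tier(text: str, default: str = "WARM") -> str:
    """Return the tier named by the first marker in text, or default if none.

    Falling back to WARM for untagged files is an assumption of this sketch.
    """
    match = TIER_PATTERN.search(text)
    return match.group(1) if match else default


def select_for_context(docs: dict[str, str],
                       wanted: frozenset[str] = frozenset({"HOT"})) -> list[str]:
    """Return the names of docs whose tier is in wanted, sorted by name."""
    return sorted(name for name, text in docs.items() if read_tier(text) in wanted)


# Hypothetical project docs, keyed by file name.
docs = {
    "NEXT-TASKS.md": "<!-- @cortex-tms-tier HOT -->\n# Current sprint\n",
    "ARCHITECTURE.md": "<!-- @cortex-tms-tier WARM -->\n# Design notes\n",
    "CHANGELOG-2023.md": "<!-- @cortex-tms-tier COLD -->\n# Old releases\n",
}

print(select_for_context(docs))  # ['NEXT-TASKS.md']
```

Only the file tagged HOT would be sent as context; passing a larger `wanted` set (for example HOT plus WARM) widens the selection when a task needs more background.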
Real numbers from their own project:
- Before: 66,834 tokens/session ($0.11 per Claude call)
- After: 3,647 tokens/session ($0.01 per call)
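The 94% headline can be checked against the figures above. The short Python calculation below uses only the numbers quoted in the post; the helper name is made up for this example. The token counts give roughly 94.5%, while the per-call dollar amounts, which appear to be rounded to the cent, give about 91%, so the headline figure most likely tracks tokens rather than the rounded prices.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


# Figures quoted in the post.
tokens_before, tokens_after = 66_834, 3_647
cost_before, cost_after = 0.11, 0.01  # USD per call, as quoted

print(f"token reduction: {percent_reduction(tokens_before, tokens_after):.1f}%")  # 94.5%
print(f"cost reduction:  {percent_reduction(cost_before, cost_after):.1f}%")      # 90.9%
```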
The tool also includes a command that shows the breakdown by tier.
The tool is open source at https://github.com/cortex-tms/cortex-tms and already has 1,000+ npm downloads. Pretty smart approach: most codebases have tons of rarely needed docs that eat tokens every single session. Tagging files takes minutes but saves real money over time, especially on projects with heavy API usage.
Bonus: responses come back faster since Claude processes less context.