How to Calculate Your Claude Code Context Usage
04/11/2025 • Melvynx
Ever wondered how Claude Code tracks your context usage? That percentage showing up in /context isn't magic—it's a specific calculation.
I reverse-engineered the code that powers this feature, and I'm going to show you exactly how it works.
What I Found
Claude Code stores conversation data in a transcript file (basically a JSONL file—one JSON object per line). Each line represents an event: messages, API calls, tool usage, everything.
The key insight? Not all lines count toward your context.
Here's the code that does the heavy lifting:
```typescript
export async function getContextLength(
  transcriptPath: string,
): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");

  if (lines.length === 0) return 0;

  let mostRecentMainChainEntry: TranscriptLine | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    try {
      const data = JSON.parse(line) as TranscriptLine;

      if (!data.message?.usage) continue;
      if (data.isSidechain === true) continue;
      if (data.isApiErrorMessage === true) continue;
      if (!data.timestamp) continue;

      const entryTime = new Date(data.timestamp);
      if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
        mostRecentTimestamp = entryTime;
        mostRecentMainChainEntry = data;
      }
    } catch {
      // Skip lines that aren't valid JSON
    }
  }

  if (!mostRecentMainChainEntry?.message?.usage) {
    return 0;
  }

  const usage = mostRecentMainChainEntry.message.usage;
  return (
    (usage.input_tokens || 0) +
    (usage.cache_read_input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0)
  );
}
```

The Logic Breakdown
Let me walk you through what's happening here.
Step 1: Find the Most Recent Main Chain Entry
The function reads through the transcript file line by line, but it's filtering aggressively:
```typescript
if (!data.message?.usage) continue;            // Skip if no usage data
if (data.isSidechain === true) continue;       // Skip sidechain (agent) calls
if (data.isApiErrorMessage === true) continue; // Skip errors
if (!data.timestamp) continue;                 // Skip if no timestamp
```

Why skip sidechains? When Claude Code spawns agents (like when you use the Task tool), those run in parallel "sidechains." They have their own context that doesn't count toward your main conversation.
Why only the most recent entry? Because the Anthropic API returns cumulative token usage. Each API response includes the total tokens used in that conversation turn—you don't need to sum them up.
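To make the filter concrete, here's a made-up example of one transcript line that would pass all four checks. The field names match the code above; the values are purely illustrative, not real Claude Code output.

```typescript
// A hypothetical main-chain transcript line (values are invented).
const line = JSON.stringify({
  timestamp: "2025-11-04T10:32:00Z",
  isSidechain: false,
  message: {
    usage: {
      input_tokens: 1200,
      output_tokens: 340,
      cache_read_input_tokens: 48000,
      cache_creation_input_tokens: 2500,
    },
  },
});

// Apply the same four checks the function uses.
const data = JSON.parse(line);
const qualifies =
  Boolean(data.message?.usage) &&
  data.isSidechain !== true &&
  data.isApiErrorMessage !== true &&
  Boolean(data.timestamp);
// qualifies === true, so this entry is a candidate for "most recent"
```

A line with `isSidechain: true`, or one representing a tool result with no usage data, would fail these checks and never be selected.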
Step 2: Calculate Total Input Tokens
Once we have the most recent valid entry, we extract three token types:
```typescript
return (
  (usage.input_tokens || 0) +
  (usage.cache_read_input_tokens ?? 0) +
  (usage.cache_creation_input_tokens ?? 0)
);
```

Here's what each token type means:
- input_tokens: Regular tokens sent to Claude (after the last cache breakpoint)
- cache_read_input_tokens: Tokens retrieved from cache (90% cheaper!)
- cache_creation_input_tokens: Tokens written to cache on first use (25% more expensive)
Important: All three count toward your context usage, even though cache reads are way cheaper.
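The pricing-versus-context distinction is easy to see side by side. This sketch compares an entry's context footprint with a rough price-weighted equivalent, using the approximate multipliers mentioned above (0.1x for cache reads, 1.25x for cache writes); exact rates vary by model, so treat the numbers as illustrative.

```typescript
interface Usage {
  input_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Context footprint: every input token type counts equally.
function contextTokens(u: Usage): number {
  return (
    (u.input_tokens || 0) +
    (u.cache_read_input_tokens ?? 0) +
    (u.cache_creation_input_tokens ?? 0)
  );
}

// Rough billing weight: cache reads ~0.1x, cache writes ~1.25x (assumed rates).
function billedTokenEquivalent(u: Usage): number {
  return (
    (u.input_tokens || 0) +
    0.1 * (u.cache_read_input_tokens ?? 0) +
    1.25 * (u.cache_creation_input_tokens ?? 0)
  );
}

const u: Usage = {
  input_tokens: 1000,
  cache_read_input_tokens: 50000,
  cache_creation_input_tokens: 2000,
};

const ctx = contextTokens(u);          // 53000 tokens of context
const billed = billedTokenEquivalent(u); // 8500 token-equivalents of cost
```

A heavily cached conversation can be cheap while still being nearly out of context, which is exactly why the two numbers shouldn't be conflated.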
The Percentage Calculation
Now that we have the token count, here's how Claude Code calculates the percentage:
```typescript
export async function getContextData({
  transcriptPath,
  maxContextTokens,
  autocompactBufferTokens,
  useUsableContextOnly = false,
  overheadTokens = 0,
}: ContextDataParams): Promise<ContextResult> {
  if (!transcriptPath || !existsSync(transcriptPath)) {
    return { tokens: 0, percentage: 0 };
  }

  const contextLength = await getContextLength(transcriptPath);
  let totalTokens = contextLength + overheadTokens;

  // If useUsableContextOnly is true, add the autocompact buffer to displayed tokens
  if (useUsableContextOnly) {
    totalTokens += autocompactBufferTokens;
  }

  // Always calculate percentage based on max context window
  const percentage = Math.min(100, (totalTokens / maxContextTokens) * 100);

  return {
    tokens: totalTokens,
    percentage: Math.round(percentage),
  };
}
```

The key variables:
- maxContextTokens: 200,000 for Claude Sonnet 4.5
- autocompactBufferTokens: The buffer Claude Code reserves before auto-compacting (roughly 40-45K tokens)
- overheadTokens: Additional system overhead
- useUsableContextOnly: If true, includes the autocompact buffer in the calculation
The formula is simple:
percentage = (totalTokens / maxContextTokens) × 100
But there's a catch—if useUsableContextOnly is enabled, it adds the autocompact buffer to the total. This shows you how much usable context remains before auto-compact kicks in.
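A quick worked example makes the difference visible. The numbers here are assumptions based on the figures above (200K window, 45K buffer, a hypothetical 120K of context used):

```typescript
const maxContextTokens = 200000;       // Claude Sonnet 4.5 window (per the article)
const autocompactBufferTokens = 45000; // assumed buffer size from the ~40-45K range
const contextLength = 120000;          // hypothetical current context

// Without the buffer: plain share of the raw window.
const rawPercent = Math.round(
  Math.min(100, (contextLength / maxContextTokens) * 100),
); // 60

// With useUsableContextOnly: the buffer is added to the displayed total.
const usablePercent = Math.round(
  Math.min(100, ((contextLength + autocompactBufferTokens) / maxContextTokens) * 100),
); // 83
```

Same conversation, two readings: 60% of the raw window, but 83% once the reserved buffer is counted against you. The second number is the one that predicts when auto-compact will fire.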
What This Means for Your Workflow
Understanding this calculation reveals some important insights:
1. Sidechain agents don't count against your main context
When you launch agents with the Task tool, they run in parallel. Their token usage doesn't eat into your main conversation context. This is huge for complex workflows.
2. Cache tokens still consume context
Even though cache_read_input_tokens cost 90% less, they still occupy space in your context window. Don't confuse pricing with context limits.
3. The autocompact buffer is real
Claude Code reserves ~40-45K tokens as a buffer. Once you hit that threshold, auto-compact triggers. If you're at 95% context usage, you're probably closer to the limit than you think.
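The arithmetic behind that warning is simple. Assuming the same figures as above (200K window, ~45K buffer), the effective auto-compact threshold sits well below 100%:

```typescript
const maxContextTokens = 200000;       // raw window (per the article)
const autocompactBufferTokens = 45000; // assumed buffer size

// Auto-compact effectively triggers once context approaches window minus buffer.
const compactThreshold = maxContextTokens - autocompactBufferTokens; // 155000
const thresholdPercent = Math.round(
  (compactThreshold / maxContextTokens) * 100,
); // 78
```

So on a raw-window reading, auto-compact kicks in around 78% usage, not 100%.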
Using This in Your Own Tools
Want to build custom monitoring tools? Here's the minimal code you need:
```typescript
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

async function calculateContext(transcriptPath: string): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");

  let mostRecentUsage: TokenUsage | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    let data;
    try {
      data = JSON.parse(line);
    } catch {
      continue; // Skip malformed lines instead of crashing
    }

    if (
      !data.message?.usage ||
      data.isSidechain ||
      data.isApiErrorMessage ||
      !data.timestamp
    ) {
      continue;
    }

    const entryTime = new Date(data.timestamp);
    if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
      mostRecentTimestamp = entryTime;
      mostRecentUsage = data.message.usage;
    }
  }

  if (!mostRecentUsage) return 0;

  return (
    (mostRecentUsage.input_tokens || 0) +
    (mostRecentUsage.cache_read_input_tokens ?? 0) +
    (mostRecentUsage.cache_creation_input_tokens ?? 0)
  );
}
```

That's it. Read the transcript, find the most recent valid entry, sum the token types.
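To sanity-check the logic without touching the filesystem, the same selection can be run over an in-memory JSONL string. The sample entries below are invented; note how the sidechain line is ignored even though it has the largest token count.

```typescript
// Same selection logic as calculateContext, applied to a JSONL string in memory.
function calculateContextFromJsonl(jsonl: string): number {
  let mostRecentUsage: {
    input_tokens?: number;
    cache_read_input_tokens?: number;
    cache_creation_input_tokens?: number;
  } | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of jsonl.trim().split("\n")) {
    let data;
    try {
      data = JSON.parse(line);
    } catch {
      continue; // Skip malformed lines
    }
    if (!data.message?.usage || data.isSidechain || data.isApiErrorMessage || !data.timestamp) {
      continue;
    }
    const entryTime = new Date(data.timestamp);
    if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
      mostRecentTimestamp = entryTime;
      mostRecentUsage = data.message.usage;
    }
  }

  if (!mostRecentUsage) return 0;
  return (
    (mostRecentUsage.input_tokens || 0) +
    (mostRecentUsage.cache_read_input_tokens ?? 0) +
    (mostRecentUsage.cache_creation_input_tokens ?? 0)
  );
}

// Three invented transcript lines: main chain, sidechain, main chain.
const sample = [
  JSON.stringify({ timestamp: "2025-01-01T00:00:00Z", message: { usage: { input_tokens: 100, cache_read_input_tokens: 50 } } }),
  JSON.stringify({ timestamp: "2025-01-01T00:01:00Z", isSidechain: true, message: { usage: { input_tokens: 9999 } } }),
  JSON.stringify({ timestamp: "2025-01-01T00:02:00Z", message: { usage: { input_tokens: 200, cache_read_input_tokens: 300, cache_creation_input_tokens: 25 } } }),
].join("\n");

const total = calculateContextFromJsonl(sample); // 525: only the last main-chain entry counts
```

The 9999-token sidechain entry never influences the result, and the earlier main-chain entry is superseded by the more recent one: 200 + 300 + 25 = 525.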
Conclusion
Claude Code's context calculation is straightforward once you understand the structure:
- Parse the transcript file (JSONL format)
- Filter out sidechains, errors, and entries without usage data
- Find the most recent valid entry (by timestamp)
- Sum all input token types
- Calculate percentage against the max context window
The key insight? Only main chain entries count. Sidechain agents, API errors, and entries without usage data are filtered out before the most recent entry is selected.
Now you know exactly how Claude Code tracks your context. Use this knowledge to build better monitoring tools, optimize your workflows, and avoid hitting context limits unexpectedly.
Want to see this in action? Check out my statusline script that displays usage limits in real-time.
What will you build with this? Let me know on Twitter!