How to Calculate Your Claude Code Context Usage
04/11/2025 • Melvynx
Ever wondered how Claude Code tracks your context usage? That percentage showing up in /context isn't magic—it's a specific calculation.
I reverse-engineered the code that powers this feature, and I'm going to show you exactly how it works.
Claude Code stores conversation data in a transcript file (basically a JSONL file—one JSON object per line). Each line represents an event: messages, API calls, tool usage, everything.
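To make that concrete, here's a sketch of what two transcript lines might look like. The field names (`timestamp`, `isSidechain`, `message.usage`) mirror the ones the parsing code below actually reads; the values are made up:

```typescript
// Two hypothetical transcript lines: a main-chain API response and a
// sidechain (agent) response. One JSON object per line -- that's JSONL.
const sampleTranscript = [
  JSON.stringify({
    timestamp: "2025-11-04T10:00:00Z",
    isSidechain: false,
    message: { usage: { input_tokens: 1200, output_tokens: 300 } },
  }),
  JSON.stringify({
    timestamp: "2025-11-04T10:00:05Z",
    isSidechain: true, // agent call: excluded from the context calculation
    message: { usage: { input_tokens: 5000, output_tokens: 800 } },
  }),
].join("\n");

// Each line parses independently, which is what lets the reader below
// skip malformed lines without losing the rest of the file.
const events = sampleTranscript.split("\n").map((line) => JSON.parse(line));
```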
The key insight? Not all lines count toward your context.
Here's the code that does the heavy lifting:
// Shape inferred from the fields the function reads (not the full type).
interface TranscriptLine {
  timestamp?: string;
  isSidechain?: boolean;
  isApiErrorMessage?: boolean;
  message?: {
    usage?: {
      input_tokens?: number;
      cache_read_input_tokens?: number;
      cache_creation_input_tokens?: number;
    };
  };
}

export async function getContextLength(
  transcriptPath: string,
): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");
  if (lines.length === 0) return 0;

  let mostRecentMainChainEntry: TranscriptLine | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    try {
      const data = JSON.parse(line) as TranscriptLine;
      if (!data.message?.usage) continue;
      if (data.isSidechain === true) continue;
      if (data.isApiErrorMessage === true) continue;
      if (!data.timestamp) continue;

      const entryTime = new Date(data.timestamp);
      if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
        mostRecentTimestamp = entryTime;
        mostRecentMainChainEntry = data;
      }
    } catch {
      // Ignore lines that aren't valid JSON.
    }
  }

  if (!mostRecentMainChainEntry?.message?.usage) {
    return 0;
  }

  const usage = mostRecentMainChainEntry.message.usage;
  return (
    (usage.input_tokens || 0) +
    (usage.cache_read_input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0)
  );
}

Let me walk you through what's happening here.
The function reads through the transcript file line by line, but it's filtering aggressively:
if (!data.message?.usage) continue;       // Skip if no usage data
if (data.isSidechain === true) continue;  // Skip sidechain (agent) calls
if (data.isApiErrorMessage === true) continue; // Skip errors
if (!data.timestamp) continue;            // Skip if no timestamp

Why skip sidechains? When Claude Code spawns agents (like when you use the Task tool), those run in parallel "sidechains." They have their own context that doesn't count toward your main conversation.
Why only the most recent entry? Because the Anthropic API returns cumulative token usage. Each API response includes the total tokens used in that conversation turn—you don't need to sum them up.
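A tiny sketch makes the difference obvious. With illustrative numbers, summing across turns double-counts, while the latest entry already holds the running total:

```typescript
// Each API response reports the cumulative input-token count for the
// conversation so far. Numbers below are made up for illustration.
const turns = [
  { timestamp: "2025-11-04T10:00:00Z", usage: { input_tokens: 1000 } },
  { timestamp: "2025-11-04T10:01:00Z", usage: { input_tokens: 2500 } },
  { timestamp: "2025-11-04T10:02:00Z", usage: { input_tokens: 4200 } },
];

// Wrong: summing every turn inflates the count.
const summed = turns.reduce((acc, t) => acc + t.usage.input_tokens, 0);

// Right: the most recent entry already contains the total.
const latest = turns[turns.length - 1].usage.input_tokens;
```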
Once we have the most recent valid entry, we extract three token types:
return (
  (usage.input_tokens || 0) +
  (usage.cache_read_input_tokens ?? 0) +
  (usage.cache_creation_input_tokens ?? 0)
);

Here's what each token type means:

- input_tokens: Regular tokens sent to Claude (after the last cache breakpoint)
- cache_read_input_tokens: Tokens retrieved from cache (90% cheaper!)
- cache_creation_input_tokens: Tokens written to cache on first use (25% more expensive)

Important: All three count toward your context usage, even though cache reads are way cheaper.
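The pricing/context distinction is easy to see with a hypothetical usage payload. The 0.1 and 1.25 multipliers below just restate the "90% cheaper" and "25% more expensive" rates above; the token counts are invented:

```typescript
// A hypothetical usage payload: cache reads are billed at ~10% of the
// base rate but still fill the context window one-for-one.
const usage = {
  input_tokens: 2_000,
  cache_read_input_tokens: 50_000,
  cache_creation_input_tokens: 8_000,
};

// Context occupancy: all three types count equally.
const contextTokens =
  usage.input_tokens +
  usage.cache_read_input_tokens +
  usage.cache_creation_input_tokens;

// Billing weight (illustrative multipliers from the rates above).
const billedEquivalent =
  usage.input_tokens * 1.0 +
  usage.cache_read_input_tokens * 0.1 +
  usage.cache_creation_input_tokens * 1.25;
```

So this conversation occupies 60K tokens of context while being billed like only 17K regular input tokens: cheap to run, but no extra room in the window.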
Now that we have the token count, here's how Claude Code calculates the percentage:
import { existsSync } from "node:fs";

// Parameter and result types inferred from how the function uses them.
interface ContextDataParams {
  transcriptPath: string;
  maxContextTokens: number;
  autocompactBufferTokens: number;
  useUsableContextOnly?: boolean;
  overheadTokens?: number;
}

interface ContextResult {
  tokens: number;
  percentage: number;
}

export async function getContextData({
  transcriptPath,
  maxContextTokens,
  autocompactBufferTokens,
  useUsableContextOnly = false,
  overheadTokens = 0,
}: ContextDataParams): Promise<ContextResult> {
  if (!transcriptPath || !existsSync(transcriptPath)) {
    return { tokens: 0, percentage: 0 };
  }

  const contextLength = await getContextLength(transcriptPath);
  let totalTokens = contextLength + overheadTokens;

  // If useUsableContextOnly is true, add the autocompact buffer to displayed tokens
  if (useUsableContextOnly) {
    totalTokens += autocompactBufferTokens;
  }

  // Always calculate percentage based on max context window
  const percentage = Math.min(100, (totalTokens / maxContextTokens) * 100);

  return {
    tokens: totalTokens,
    percentage: Math.round(percentage),
  };
}

The key variables:

- maxContextTokens: 200,000 for Claude Sonnet 4.5
- autocompactBufferTokens: The buffer Claude Code reserves before auto-compacting (roughly 40-45K tokens)
- overheadTokens: Additional system overhead
- useUsableContextOnly: If true, includes the autocompact buffer in the calculation

The formula is simple:
percentage = (totalTokens / maxContextTokens) × 100
But there's a catch—if useUsableContextOnly is enabled, it adds the autocompact buffer to the total. This shows you how much usable context remains before auto-compact kicks in.
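Here's a worked example with the numbers from the article: a 200K-token window and an assumed 40K autocompact buffer (the current usage figure is hypothetical):

```typescript
const maxContextTokens = 200_000;      // Claude Sonnet 4.5 window
const autocompactBufferTokens = 40_000; // approximate, per the article
const contextLength = 120_000;          // hypothetical current usage
const overheadTokens = 0;

// Same formula as getContextData: clamp to 100, then round.
function percentageOf(totalTokens: number): number {
  return Math.round(Math.min(100, (totalTokens / maxContextTokens) * 100));
}

// Default display: raw usage against the full window.
const rawPct = percentageOf(contextLength + overheadTokens);

// With useUsableContextOnly: the buffer is added, so the same conversation
// reads as closer to auto-compact.
const usablePct = percentageOf(
  contextLength + overheadTokens + autocompactBufferTokens,
);
```

The same 120K tokens display as 60% by the raw measure but 80% once the reserved buffer is counted, which is exactly the "closer to the limit than you think" effect described below.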
Understanding this calculation reveals some important insights:
1. Sidechain agents don't count against your main context
When you launch agents with the Task tool, they run in parallel. Their token usage doesn't eat into your main conversation context. This is huge for complex workflows.
2. Cache tokens still consume context
Even though cache_read_input_tokens cost 90% less, they still occupy space in your context window. Don't confuse pricing with context limits.
3. The autocompact buffer is real
Claude Code reserves ~40-45K tokens as a buffer. Once you hit that threshold, auto-compact triggers. If you're at 95% context usage, you're probably closer to the limit than you think.
Want to build custom monitoring tools? Here's the minimal code you need:
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

async function calculateContext(transcriptPath: string): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");

  let mostRecentUsage: TokenUsage | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    let data;
    try {
      data = JSON.parse(line);
    } catch {
      continue; // Skip lines that aren't valid JSON
    }

    if (
      !data.message?.usage ||
      data.isSidechain ||
      data.isApiErrorMessage ||
      !data.timestamp
    ) {
      continue;
    }

    const entryTime = new Date(data.timestamp);
    if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
      mostRecentTimestamp = entryTime;
      mostRecentUsage = data.message.usage;
    }
  }

  if (!mostRecentUsage) return 0;

  return (
    (mostRecentUsage.input_tokens || 0) +
    (mostRecentUsage.cache_read_input_tokens ?? 0) +
    (mostRecentUsage.cache_creation_input_tokens ?? 0)
  );
}

That's it. Read the transcript, find the most recent valid entry, sum the token types.
Claude Code's context calculation is straightforward once you understand the transcript structure.
The key insight? Only main chain entries count. Sidechain agents, errors, and tool calls don't affect your primary context.
Now you know exactly how Claude Code tracks your context. Use this knowledge to build better monitoring tools, optimize your workflows, and avoid hitting context limits unexpectedly.
Want to see this in action? Check out my statusline script that displays usage limits in real-time.
What will you build with this? Let me know on Twitter!