How to Calculate Your Claude Code Context Usage
04/11/2025 • Melvynx
Ever wondered how Claude Code tracks your context usage? That percentage showing up in /context isn't magic—it's a specific calculation.
I reverse-engineered the code that powers this feature, and I'm going to show you exactly how it works.
What I Found
Claude Code stores conversation data in a transcript file (basically a JSONL file—one JSON object per line). Each line represents an event: messages, API calls, tool usage, everything.
The key insight? Not all lines count toward your context.
Here's the code that does the heavy lifting:
```typescript
export async function getContextLength(
  transcriptPath: string,
): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");

  if (lines.length === 0) return 0;

  let mostRecentMainChainEntry: TranscriptLine | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    try {
      const data = JSON.parse(line) as TranscriptLine;

      if (!data.message?.usage) continue;
      if (data.isSidechain === true) continue;
      if (data.isApiErrorMessage === true) continue;
      if (!data.timestamp) continue;

      const entryTime = new Date(data.timestamp);
      if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
        mostRecentTimestamp = entryTime;
        mostRecentMainChainEntry = data;
      }
    } catch {
      // Skip lines that aren't valid JSON
    }
  }

  if (!mostRecentMainChainEntry?.message?.usage) {
    return 0;
  }

  const usage = mostRecentMainChainEntry.message.usage;
  return (
    (usage.input_tokens || 0) +
    (usage.cache_read_input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0)
  );
}
```

The Logic Breakdown
Let me walk you through what's happening here.
Step 1: Find the Most Recent Main Chain Entry
The function reads through the transcript file line by line, but it's filtering aggressively:
```typescript
if (!data.message?.usage) continue;            // Skip if no usage data
if (data.isSidechain === true) continue;       // Skip sidechain (agent) calls
if (data.isApiErrorMessage === true) continue; // Skip errors
if (!data.timestamp) continue;                 // Skip if no timestamp
```

Why skip sidechains? When Claude Code spawns agents (like when you use the Task tool), those run in parallel "sidechains." They have their own context that doesn't count toward your main conversation.
Why only the most recent entry? Because the Anthropic API returns cumulative token usage. Each API response includes the total tokens used in that conversation turn—you don't need to sum them up.
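To make the filter concrete, here's a made-up example of one transcript line that would pass all four checks. The field names match the code above; the values are purely illustrative, not real Claude Code output.

```typescript
// A hypothetical main-chain transcript line (values are invented).
const line = JSON.stringify({
  timestamp: "2025-11-04T10:32:00Z",
  isSidechain: false,
  message: {
    usage: {
      input_tokens: 1200,
      output_tokens: 340,
      cache_read_input_tokens: 48000,
      cache_creation_input_tokens: 2500,
    },
  },
});

// Apply the same four checks the function uses.
const data = JSON.parse(line);
const qualifies =
  Boolean(data.message?.usage) &&
  data.isSidechain !== true &&
  data.isApiErrorMessage !== true &&
  Boolean(data.timestamp);
// qualifies === true, so this entry is a candidate for "most recent"
```

A line with `isSidechain: true`, or one representing a tool result with no usage data, would fail these checks and never be selected.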
Step 2: Calculate Total Input Tokens
Once we have the most recent valid entry, we extract three token types:
```typescript
return (
  (usage.input_tokens || 0) +
  (usage.cache_read_input_tokens ?? 0) +
  (usage.cache_creation_input_tokens ?? 0)
);
```

Here's what each token type means:
- input_tokens: Regular tokens sent to Claude (after the last cache breakpoint)
- cache_read_input_tokens: Tokens retrieved from cache (90% cheaper!)
- cache_creation_input_tokens: Tokens written to cache on first use (25% more expensive)
Important: All three count toward your context usage, even though cache reads are way cheaper.
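The pricing-versus-context distinction is easy to see side by side. This sketch compares an entry's context footprint with a rough price-weighted equivalent, using the approximate multipliers mentioned above (0.1x for cache reads, 1.25x for cache writes); exact rates vary by model, so treat the numbers as illustrative.

```typescript
interface Usage {
  input_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Context footprint: every input token type counts equally.
function contextTokens(u: Usage): number {
  return (
    (u.input_tokens || 0) +
    (u.cache_read_input_tokens ?? 0) +
    (u.cache_creation_input_tokens ?? 0)
  );
}

// Rough billing weight: cache reads ~0.1x, cache writes ~1.25x (assumed rates).
function billedTokenEquivalent(u: Usage): number {
  return (
    (u.input_tokens || 0) +
    0.1 * (u.cache_read_input_tokens ?? 0) +
    1.25 * (u.cache_creation_input_tokens ?? 0)
  );
}

const u: Usage = {
  input_tokens: 1000,
  cache_read_input_tokens: 50000,
  cache_creation_input_tokens: 2000,
};

const ctx = contextTokens(u);          // 53000 tokens of context
const billed = billedTokenEquivalent(u); // 8500 token-equivalents of cost
```

A heavily cached conversation can be cheap while still being nearly out of context, which is exactly why the two numbers shouldn't be conflated.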
The Percentage Calculation
Now that we have the token count, here's how Claude Code calculates the percentage:
```typescript
export async function getContextData({
  transcriptPath,
  maxContextTokens,
  autocompactBufferTokens,
  useUsableContextOnly = false,
  overheadTokens = 0,
}: ContextDataParams): Promise<ContextResult> {
  if (!transcriptPath || !existsSync(transcriptPath)) {
    return { tokens: 0, percentage: 0 };
  }

  const contextLength = await getContextLength(transcriptPath);
  let totalTokens = contextLength + overheadTokens;

  // If useUsableContextOnly is true, add the autocompact buffer to displayed tokens
  if (useUsableContextOnly) {
    totalTokens += autocompactBufferTokens;
  }

  // Always calculate percentage based on max context window
  const percentage = Math.min(100, (totalTokens / maxContextTokens) * 100);

  return {
    tokens: totalTokens,
    percentage: Math.round(percentage),
  };
}
```

The key variables:
- maxContextTokens: 200,000 for Claude Sonnet 4.5
- autocompactBufferTokens: The buffer Claude Code reserves before auto-compacting (roughly 40-45K tokens)
- overheadTokens: Additional system overhead
- useUsableContextOnly: If true, includes the autocompact buffer in the calculation
The formula is simple:
percentage = (totalTokens / maxContextTokens) × 100
But there's a catch—if useUsableContextOnly is enabled, it adds the autocompact buffer to the total. This shows you how much usable context remains before auto-compact kicks in.
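A quick worked example makes the difference visible. The numbers here are assumptions based on the figures above (200K window, 45K buffer, a hypothetical 120K of context used):

```typescript
const maxContextTokens = 200000;       // Claude Sonnet 4.5 window (per the article)
const autocompactBufferTokens = 45000; // assumed buffer size from the ~40-45K range
const contextLength = 120000;          // hypothetical current context

// Without the buffer: plain share of the raw window.
const rawPercent = Math.round(
  Math.min(100, (contextLength / maxContextTokens) * 100),
); // 60

// With useUsableContextOnly: the buffer is added to the displayed total.
const usablePercent = Math.round(
  Math.min(100, ((contextLength + autocompactBufferTokens) / maxContextTokens) * 100),
); // 83
```

Same conversation, two readings: 60% of the raw window, but 83% once the reserved buffer is counted against you. The second number is the one that predicts when auto-compact will fire.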
What This Means for Your Workflow
Understanding this calculation reveals some important insights:
1. Sidechain agents don't count against your main context
When you launch agents with the Task tool, they run in parallel. Their token usage doesn't eat into your main conversation context. This is huge for complex workflows.
2. Cache tokens still consume context
Even though cache_read_input_tokens cost 90% less, they still occupy space in your context window. Don't confuse pricing with context limits.
3. The autocompact buffer is real
Claude Code reserves ~40-45K tokens as a buffer. Once you hit that threshold, auto-compact triggers. If you're at 95% context usage, you're probably closer to the limit than you think.
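The arithmetic behind that warning is simple. Assuming the same figures as above (200K window, ~45K buffer), the effective auto-compact threshold sits well below 100%:

```typescript
const maxContextTokens = 200000;       // raw window (per the article)
const autocompactBufferTokens = 45000; // assumed buffer size

// Auto-compact effectively triggers once context approaches window minus buffer.
const compactThreshold = maxContextTokens - autocompactBufferTokens; // 155000
const thresholdPercent = Math.round(
  (compactThreshold / maxContextTokens) * 100,
); // 78
```

So on a raw-window reading, auto-compact kicks in around 78% usage, not 100%.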
Using This in Your Own Tools
Want to build custom monitoring tools? Here's the minimal code you need:
```typescript
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

async function calculateContext(transcriptPath: string): Promise<number> {
  const content = await Bun.file(transcriptPath).text();
  const lines = content.trim().split("\n");

  let mostRecentUsage: TokenUsage | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of lines) {
    let data;
    try {
      data = JSON.parse(line);
    } catch {
      continue; // Skip malformed lines instead of crashing
    }

    if (
      !data.message?.usage ||
      data.isSidechain ||
      data.isApiErrorMessage ||
      !data.timestamp
    ) {
      continue;
    }

    const entryTime = new Date(data.timestamp);
    if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
      mostRecentTimestamp = entryTime;
      mostRecentUsage = data.message.usage;
    }
  }

  if (!mostRecentUsage) return 0;

  return (
    (mostRecentUsage.input_tokens || 0) +
    (mostRecentUsage.cache_read_input_tokens ?? 0) +
    (mostRecentUsage.cache_creation_input_tokens ?? 0)
  );
}
```

That's it. Read the transcript, find the most recent valid entry, sum the token types.
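To sanity-check the logic without touching the filesystem, the same selection can be run over an in-memory JSONL string. The sample entries below are invented; note how the sidechain line is ignored even though it has the largest token count.

```typescript
// Same selection logic as calculateContext, applied to a JSONL string in memory.
function calculateContextFromJsonl(jsonl: string): number {
  let mostRecentUsage: {
    input_tokens?: number;
    cache_read_input_tokens?: number;
    cache_creation_input_tokens?: number;
  } | null = null;
  let mostRecentTimestamp: Date | null = null;

  for (const line of jsonl.trim().split("\n")) {
    let data;
    try {
      data = JSON.parse(line);
    } catch {
      continue; // Skip malformed lines
    }
    if (!data.message?.usage || data.isSidechain || data.isApiErrorMessage || !data.timestamp) {
      continue;
    }
    const entryTime = new Date(data.timestamp);
    if (!mostRecentTimestamp || entryTime > mostRecentTimestamp) {
      mostRecentTimestamp = entryTime;
      mostRecentUsage = data.message.usage;
    }
  }

  if (!mostRecentUsage) return 0;
  return (
    (mostRecentUsage.input_tokens || 0) +
    (mostRecentUsage.cache_read_input_tokens ?? 0) +
    (mostRecentUsage.cache_creation_input_tokens ?? 0)
  );
}

// Three invented transcript lines: main chain, sidechain, main chain.
const sample = [
  JSON.stringify({ timestamp: "2025-01-01T00:00:00Z", message: { usage: { input_tokens: 100, cache_read_input_tokens: 50 } } }),
  JSON.stringify({ timestamp: "2025-01-01T00:01:00Z", isSidechain: true, message: { usage: { input_tokens: 9999 } } }),
  JSON.stringify({ timestamp: "2025-01-01T00:02:00Z", message: { usage: { input_tokens: 200, cache_read_input_tokens: 300, cache_creation_input_tokens: 25 } } }),
].join("\n");

const total = calculateContextFromJsonl(sample); // 525: only the last main-chain entry counts
```

The 9999-token sidechain entry never influences the result, and the earlier main-chain entry is superseded by the more recent one: 200 + 300 + 25 = 525.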
Conclusion
Claude Code's context calculation is straightforward once you understand the structure:
- Parse the transcript file (JSONL format)
- Filter out sidechains, errors, and entries without usage data
- Find the most recent valid entry (by timestamp)
- Sum all input token types
- Calculate percentage against the max context window
The key insight? Only main chain entries count. Sidechain agents, API errors, and entries without usage data are filtered out before the most recent entry is selected.
Now you know exactly how Claude Code tracks your context. Use this knowledge to build better monitoring tools, optimize your workflows, and avoid hitting context limits unexpectedly.
Want to see this in action? Check out my statusline script that displays usage limits in real-time.
What will you build with this? Let me know on Twitter!