FIX Apr 27, 2026 9 min read

Claude Code Slow Response: 7 Fixes That Actually Work

by Bugi
TL;DR

  • Most slow responses come from bloated conversation context — use /clear or /compact to reset.
  • Model choice matters: Opus is slower than Sonnet; use /model to switch mid-session.
  • MCP servers, large file reads, and network proxies add latency that looks like a Claude Code problem but isn’t.

Overview

Claude Code responses slow to a crawl mid-session. You type a command, wait 15–30 seconds, and get back a partial answer or a timeout. This affects developers on all platforms — macOS, Linux, and Windows — and typically worsens the longer a session runs.

The root cause is rarely Claude Code itself. It’s almost always one of: context window bloat, model selection overhead, network configuration, or external tool latency. This article covers 7 tested fixes, ordered from most common to most niche.

What Causes Slow Responses

Claude Code sends your conversation history, file contents, and tool results to Claude’s API on every request. The response time depends on:

  • Input size. More tokens in = longer processing time. A fresh session responds in 2–5 seconds. A session with 50+ tool calls and multiple large file reads can take 20–40 seconds.
  • Output length. Longer generated responses take proportionally longer to stream.
  • Model. Opus is slower than Sonnet, and Sonnet is slower than Haiku.
  • Network path. Proxies, VPNs, and DNS resolution add round-trip latency.
  • External tools. MCP servers, shell commands, and file operations block the response pipeline.

There’s no single error message. The symptom is consistent: responses that were fast at session start degrade over time, or specific operations (large file reads, complex refactors) reliably take longer than expected.

Note
Claude Code automatically compresses prior messages as context approaches the limit. If you’re hitting slowdowns, compression is likely already happening — which itself costs time.

Solution 1: Clear or Compact the Conversation

This fixes the majority of slow response issues. Every message, tool call, and file read accumulates in context. After 20–30 interactions, you’re sending tens of thousands of tokens per request.

Option A — Full reset:

/clear

Wipes conversation history. You lose all context but get a fresh, fast session. Best when switching tasks.

Option B — Compress in place:

/compact

Summarizes the conversation into a shorter representation, preserving key context while reducing token count. Use this when you need continuity but responses have slowed.

Option C — Targeted compaction with instructions:

/compact keep the current file paths and error messages

Passing a prompt to /compact tells the summarizer what to prioritize. Useful when you’re mid-debug and need specific details preserved.

Tip
Run /compact proactively every 15–20 interactions, not just when things get slow. Prevention beats cure.

Monitor your context usage with /cost — it shows token consumption for the current session. If input tokens are climbing past 100k, it’s time to compact or clear.
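
The rules of thumb in this section can be written down as a small helper so the decision becomes mechanical. This is only a sketch: context_advice is a hypothetical shell function, not a Claude Code command, and it assumes you read the input-token and interaction counts yourself (for example from /cost) and pass them in.

```bash
# Hypothetical helper: turns this section's rules of thumb into a verdict.
# Arguments: input tokens so far, interactions so far (defaults to 0).
context_advice() {
  local tokens="$1" interactions="${2:-0}"
  if [ "$tokens" -gt 100000 ]; then
    echo "compact or clear now"        # past the 100k input-token mark
  elif [ "$interactions" -ge 15 ]; then
    echo "compact proactively"         # the 15-20 interaction window
  else
    echo "ok"
  fi
}

# Example: context_advice 120000 8
```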

Solution 2: Switch to a Faster Model

Claude Code defaults to the model configured in your settings, but you can switch mid-session. Opus delivers the highest quality but responds noticeably more slowly than Sonnet.

/model sonnet

Or toggle fast mode, which optimizes for speed on the same model:

/fast

For routine tasks — file searches, simple edits, test runs — Sonnet handles them at a fraction of the latency. Reserve Opus for complex architectural decisions, large refactors, or multi-file reasoning where quality matters more than speed.

You can also set the default model in your project’s .claude/settings.json:

{
  "model": "sonnet"
}

Tip
Use /model with no arguments to see which model is currently active. Unexpected Opus usage is a common cause of “it was fast yesterday.”
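
To check what a project's settings file asks for without starting a session, the value can be pulled out with sed. This is a minimal sketch that assumes the flat "model": "..." shape shown in this section. configured_model is a hypothetical helper name, and a JSON-aware tool such as jq would be more robust for anything nested.

```bash
# Hypothetical helper: print the "model" value from a settings file.
# Assumes a simple entry like the example in this section.
configured_model() {
  local file="${1:-.claude/settings.json}"
  [ -f "$file" ] || { echo "no settings file at $file" >&2; return 1; }
  # Capture the quoted value that follows the "model" key; keep the first hit.
  sed -n 's/.*"model"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1
}

# Example: configured_model .claude/settings.json
```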

Solution 3: Reduce File Read Size

When Claude Code reads a file, the entire content goes into context. A single 2000-line file adds thousands of tokens. Read three large files and you’ve consumed a significant chunk of the context window — every subsequent request now carries that weight.

Limit what you read. Instead of asking Claude Code to “read the whole file,” point it to specific sections:

Read lines 50-120 of src/server.ts
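
A quick line count shows which files are worth slicing this way. The sketch below uses only find, wc, and awk. large_files is a hypothetical helper, and the 500-line default is an arbitrary assumption, not a Claude Code limit.

```bash
# Hypothetical helper: list files under DIR with more than MIN lines.
# Skips node_modules and .git; prints "<lines> <path>" for each match.
large_files() {
  local dir="${1:-.}" min="${2:-500}"
  find "$dir" -type f ! -path '*/node_modules/*' ! -path '*/.git/*' \
    -exec wc -l {} + |
    awk -v min="$min" '{
      n = $1 + 0                           # line count reported by wc
      sub(/^[ \t]*[0-9]+[ \t]+/, "")       # strip the count, keep the path
      if ($0 != "total" && n > min + 0) print n, $0
    }'
}

# Example: large_files src 500
```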

Use .claudeignore. Create a .claudeignore file in your project root to exclude directories that Claude Code doesn’t need to index:

node_modules/
dist/
build/
*.min.js
coverage/
.git/

This reduces the initial codebase scan time and prevents accidental reads of large generated files.

Redirect large outputs. When running shell commands that produce verbose output, redirect to a file and read selectively:

npm test > /tmp/test-output.txt 2>&1
# Then ask Claude to grep for failures instead of reading the full log

Takeaway
Every byte in context costs latency on every subsequent request. Be surgical about what enters the conversation.
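
The same idea applies to a redirected test log: extract only the lines that matter before anything reaches the conversation. This is a minimal sketch, assuming failures are marked with words like FAIL or Error. That pattern is an assumption, so adjust it to your test runner's output. failing_lines is a hypothetical helper.

```bash
# Hypothetical helper: print up to MAX matching lines from a log, with line numbers.
failing_lines() {
  local log="$1" max="${2:-20}"
  grep -n -i -E 'fail|error' "$log" | head -n "$max"
}

# Example: failing_lines /tmp/test-output.txt 20
```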

Solution 4: Audit MCP Server Latency

MCP (Model Context Protocol) servers extend Claude Code’s capabilities, but each one adds latency. When Claude Code starts, it initializes all configured MCP servers. Slow servers block the startup and can delay tool calls during the session.

Check your MCP configuration:

grep -A 20 '"mcpServers"' ~/.claude/settings.json

Diagnose slow servers. Temporarily disable all MCP servers and see if response times improve:

  1. Rename your settings file: mv ~/.claude/settings.json ~/.claude/settings.json.bak
  2. Start Claude Code and test response speed
  3. Re-enable servers one at a time to identify the bottleneck
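
The rename-and-restore steps above can be wrapped in two small functions so the backup is hard to lose. This is a sketch under the article's assumption that the configuration lives in ~/.claude/settings.json. settings_disable and settings_restore are hypothetical names, and you can pass a different path if your setup differs.

```bash
# Hypothetical helpers for the diagnosis steps above.
# Move the settings file aside, refusing to clobber an existing backup.
settings_disable() {
  local settings="${1:-$HOME/.claude/settings.json}"
  [ -f "$settings" ] || { echo "no settings file at $settings" >&2; return 1; }
  [ ! -e "$settings.bak" ] || { echo "backup already exists: $settings.bak" >&2; return 1; }
  mv "$settings" "$settings.bak"
}

# Put the backup back. This overwrites any settings file created in the meantime.
settings_restore() {
  local settings="${1:-$HOME/.claude/settings.json}"
  [ -f "$settings.bak" ] || { echo "no backup at $settings.bak" >&2; return 1; }
  mv "$settings.bak" "$settings"
}
```

Run settings_disable, test response speed in a new session, then run settings_restore before re-enabling servers one at a time.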

Common culprits:

  • Database MCP servers that run queries on every tool call
  • Web-fetching servers blocked by DNS or proxy issues
  • Servers with heavy initialization (loading large indexes, downloading models)

Warning
A single unresponsive MCP server can make Claude Code appear frozen during startup. Check server health before blaming Claude Code itself.

If you identify a slow server, either optimize its implementation or move it to on-demand initialization rather than startup.

Solution 5: Fix Network and Proxy Issues

Claude Code communicates with Anthropic’s API over HTTPS. Anything between your machine and api.anthropic.com adds latency.

Check basic connectivity:

$ curl -o /dev/null -s -w "time_total: %{time_total}s\n" https://api.anthropic.com
time_total: 0.245s

Under 500ms is normal. Over 1 second suggests a network issue.
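
For repeated checks, the thresholds can be applied automatically. A minimal sketch: classify_latency is a hypothetical helper that takes a time in seconds, and the "borderline" label for the 0.5 to 1 second range is an assumption, since the guidance above only covers the two ends.

```bash
# Hypothetical helper: label a round-trip time given in seconds.
classify_latency() {
  awk -v t="$1" 'BEGIN {
    if (t + 0 < 0.5)      print "normal"
    else if (t + 0 <= 1)  print "borderline"
    else                  print "network issue likely"
  }'
}

# Example (makes a real request):
# classify_latency "$(curl -o /dev/null -s -w '%{time_total}' https://api.anthropic.com)"
```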

Corporate proxy or VPN. If you’re behind a proxy, ensure Claude Code can reach the API:

export HTTPS_PROXY=http://your-proxy:8080
export HTTP_PROXY=http://your-proxy:8080

Some VPNs route API traffic through distant servers. Test with VPN disconnected to confirm.

DNS resolution. Slow DNS adds latency to every API call:

dig api.anthropic.com +stats | grep "Query time"

If query time exceeds 100ms, switch to a faster DNS resolver (1.1.1.1, 8.8.8.8) or add a static entry to /etc/hosts.
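
The 100ms check can be scripted by reading the number out of dig's statistics line. This is a sketch that assumes the usual ";; Query time: N msec" format. query_time_ms and dns_is_slow are hypothetical helper names.

```bash
# Read dig output on stdin and print the query time in milliseconds.
query_time_ms() {
  awk '/Query time:/ { print $4; exit }'
}

# Succeeds (exit status 0) when the given time exceeds the 100ms threshold.
dns_is_slow() {
  [ "$1" -gt 100 ]
}

# Example (performs a real lookup):
# ms=$(dig api.anthropic.com +stats | query_time_ms)
# dns_is_slow "$ms" && echo "DNS took ${ms}ms, consider a faster resolver"
```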

Solution 6: Manage System Resources

Claude Code runs Node.js locally. While it doesn’t consume massive resources itself, it competes with your editor, build tools, and other processes for CPU and memory.

Check resource usage during a slow response:

# CPU and memory snapshot
top -l 1 | head -20    # macOS
top -bn1 | head -20    # Linux

Specific scenarios that cause local slowdowns:

  • TypeScript language server consuming 2–4 GB alongside VS Code
  • Docker containers running resource-intensive builds
  • Webpack/Vite dev server in watch mode during large rebuilds
  • Multiple Claude Code sessions running in parallel

Each Claude Code instance maintains its own conversation context in memory. Running three sessions simultaneously triples the local memory footprint.

  1. Check running sessions. Run ps aux | grep claude to see active instances.
  2. Close idle sessions. Use /exit in sessions you’re not actively using.
  3. Monitor during operations. Watch memory with htop while running large refactors.
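
One caveat with ps aux | grep claude is that the grep process itself usually appears in the results. The sketch below counts matches without that false positive. count_claude is a hypothetical helper, and it still counts any process whose command line merely contains "claude", so treat the number as an upper bound.

```bash
# Count lines of ps output on stdin that mention "claude".
# The [c] bracket keeps the pattern from matching grep's own command line.
count_claude() {
  grep -c '[c]laude' || true   # grep exits non-zero on zero matches; still prints 0
}

# Example: ps aux | count_claude
```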

Solution 7: Update Claude Code

Older versions may have performance regressions or miss optimizations shipped in recent releases. Claude Code updates frequently.

# Check current version
claude --version

# Update to latest
npm update -g @anthropic-ai/claude-code

If you installed via Homebrew:

brew upgrade claude-code

Release notes are published on the official changelog. Performance improvements — especially around context management and streaming — ship regularly.
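
To see whether an update is actually pending, the installed version can be compared with the latest one published on npm. This is a sketch: version_lt is a hypothetical helper that compares dotted numeric versions, and the usage lines assume that claude --version prints the version number as its first word.

```bash
# Hypothetical helper: succeeds when version A is older than version B.
# Compares dot-separated numeric parts left to right; missing parts count as 0.
version_lt() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi < yi) exit 0
      if (xi > yi) exit 1
    }
    exit 1   # versions are equal
  }'
}

# Example (assumes the first word of the output is the version):
# installed=$(claude --version | awk '{print $1}')
# latest=$(npm view @anthropic-ai/claude-code version)
# version_lt "$installed" "$latest" && echo "update available: $installed -> $latest"
```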

Note
After updating, start a fresh session with /clear. Some updates change the context format, and old session data can cause unexpected behavior.

Still Not Working?

If none of these fixes resolve the slowdown:

  • Check Anthropic’s status page for ongoing API incidents or degraded performance.
  • File an issue on the Claude Code GitHub repository with your OS, Claude Code version (claude --version), and a description of when slowdowns occur.
  • Try the web interface. If claude.ai responds normally but Claude Code is slow, the issue is local (network, MCP, resources). If both are slow, it’s API-side.
  • Review the Claude Code logs. Enable verbose logging with claude --verbose to identify which specific operation is taking the longest.

FAQ

Why does Claude Code get slower the longer I use it?
Every message, tool result, and file read accumulates in the conversation context. More input tokens means longer processing time per request. Use /compact periodically or /clear to reset.

Does the model I choose affect response speed?
Yes. Opus is the most capable but slowest model. Sonnet is faster for most tasks. Use /model to switch mid-session or /fast to toggle fast mode on the current model.

Can MCP servers cause Claude Code to freeze on startup?
Yes. Claude Code initializes all configured MCP servers at startup. An unresponsive server can block the entire startup sequence. Disable servers temporarily to diagnose.

How do I check my Claude Code context usage?
Run /cost in your session. It shows token consumption including input and output tokens. High input token counts correlate directly with slower responses.

Is there a way to prevent Claude Code from reading large files?
Create a .claudeignore file in your project root. It works like .gitignore: list directories and file patterns that Claude Code should skip during codebase indexing and file reads.

Will running multiple Claude Code sessions slow things down?
Locally, yes: each session consumes memory for its conversation context. API-side, concurrent sessions share your rate limit, which can cause queuing under heavy use. Close idle sessions with /exit.

How often should I update Claude Code?
Check weekly. Claude Code ships performance and stability improvements frequently. Run npm update -g @anthropic-ai/claude-code to get the latest version.