Best MCP Servers for Claude Code

The best MCP servers for Claude Code, ranked by usefulness, real token cost and setup effort — including which ones quietly inflate every turn, and one that cuts that cost.

By Paul Irolla

Founder · AI & developer tools · Tokenade

Ph.D. in AI · builds token-optimization tooling for AI coding agents

Which MCP servers are actually worth connecting to Claude Code?

The short answer: fewer than most lists suggest, and only if they earn their keep on every session. Every MCP server you connect injects its full tool manifest (every tool name, description, and parameter schema) into the context of every turn you send, whether or not you call a tool. That overhead stacks. Measurements on the GitHub MCP server, which currently ships around 93 tools, put its manifest at roughly 55,000 tokens: about 27% of Claude's 200K context window, paid before your first message.

This is the trade-off almost every "best MCP servers" round-up ignores. You want capability without burning the context you actually code in. So this ranking is built around one question: does the value a server adds outweigh the token overhead it costs?

TL;DR, the picks:
  1. GitHub — keeps the PR/issue loop inside the agent; worth the overhead if you live in GitHub.
  2. Context7 — pulls version-correct library docs on demand; situational but high-value.
  3. Playwright — the only way to give Claude Code a real browser.
  4. Tokenade — the one server on this list that reduces token cost instead of adding to it.
  5. Fetch — a lightweight web-content lookup; low overhead, low setup.
  6. Sequential Thinking — a structured reasoning scratchpad; earns it on genuinely complex problems.
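
To make the manifest arithmetic concrete, here is a minimal sketch using only the two figures cited above (a 55,000-token manifest and a 200K window). The 40-turn session length is a hypothetical value chosen for illustration, and the calculation counts tokens sent while ignoring any prompt caching.

```python
CONTEXT_WINDOW = 200_000   # tokens, the window size cited above
GITHUB_MANIFEST = 55_000   # tokens, the measured manifest size cited above


def manifest_share(manifest_tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Fraction of the context window taken up by tool manifests alone."""
    return manifest_tokens / window


def session_overhead(manifest_tokens: int, turns: int) -> int:
    """Manifest tokens sent across a session, since the manifest rides along on every turn."""
    return manifest_tokens * turns


share = manifest_share(GITHUB_MANIFEST)
print(f"{share:.1%} of the window is used before the first message")

# Hypothetical 40-turn session: the same manifest is sent 40 times.
print(f"{session_overhead(GITHUB_MANIFEST, turns=40):,} manifest tokens over the session")
```

The second number is the one that matters in practice: the per-turn share looks tolerable, but it is multiplied by every turn in the session.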

How we ranked these

Every server is scored on four criteria:
  • Usefulness — how often it's the right tool during a real Claude Code session, not a demo.
  • Token cost — does it add per-turn manifest and output overhead, stay roughly neutral, or actively reduce tokens? This is the criterion most lists omit entirely.
  • Setup effort — how much friction to install, authenticate, and keep running.
  • Maintenance — how stable it is in practice; does it break when upstream APIs change?
A server only makes the list if its usefulness clearly outweighs the token overhead it adds. Servers that add overhead comparable to their value (or where the same job is better done another way) are called out rather than included to fill a round number.

1. GitHub — repo context without leaving the agent

GitHub MCP is the best pick for anyone whose Claude Code workflow revolves around pull requests, code review, and issue triage — it collapses "find that PR, read that diff, open that issue" into a single agent turn. The official server from GitHub covers repositories, PRs, issues, code search, and comments, which means the read-act-verify loop for review work stays entirely inside the agent.
  • What it is: the official GitHub MCP server (github/github-mcp-server), maintained by GitHub.
  • Strengths: covers the full PR/issue lifecycle; deep repo navigation; keeps context-switching out of your workflow.
  • Limitations: its manifest is large — around 55,000 tokens for ~93 tools at full load. GitHub cut that roughly in half with a January 2026 consolidation of the Projects toolset, but it remains one of the heavier servers available. Unfiltered issue and PR body dumps can further inflate output tokens. Use scope filtering (only enable the OAuth scopes and toolsets you actually need) to trim it.
  • Best for: developers living in GitHub — PR review, issue triage, code search, and repo context lookups.
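
The toolset filtering mentioned under Limitations can be sketched as a project-level `.mcp.json` entry, generated here from Python. This is a sketch under assumptions, not a verified configuration: the Docker image name, the `GITHUB_TOOLSETS` and `GITHUB_PERSONAL_ACCESS_TOKEN` variables, the toolset names, and the `${VAR}` expansion syntax should all be checked against the current github/github-mcp-server README and the Claude Code documentation before use.

```python
import json


def github_mcp_entry(toolsets: list[str]) -> dict:
    """Build one MCP server entry that enables only the listed toolsets (names are assumptions)."""
    return {
        "command": "docker",
        "args": [
            "run", "-i", "--rm",
            "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
            "-e", "GITHUB_TOOLSETS",
            "ghcr.io/github/github-mcp-server",  # assumed image name
        ],
        "env": {
            # Read the token from the environment rather than hard-coding it.
            "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}",
            # Fewer toolsets means fewer tool definitions in the manifest.
            "GITHUB_TOOLSETS": ",".join(toolsets),
        },
    }


config = {"mcpServers": {"github": github_mcp_entry(["repos", "issues", "pull_requests"])}}
print(json.dumps(config, indent=2))
```

The design point is that trimming happens at server start: tools that are never registered never reach the manifest, so they cost nothing per turn.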

2. Context7 — current library docs on demand

Context7 is the right add when Claude keeps reaching for outdated APIs or hallucinating method signatures — it pulls version-correct documentation for the library you're using directly into context, on demand. Built by Upstash, it exposes two tools: resolve-library-id (map a library name to a Context7 ID) and get-library-docs (fetch current, version-specific docs). You can use it without an API key at basic rate limits, or grab a free key at context7.com for higher throughput.
  • What it is: a docs-retrieval MCP server by Upstash that indexes thousands of popular libraries (Next.js, React, MongoDB, Supabase, and many more) and serves clean, version-aware documentation snippets on request.
  • Strengths: directly attacks the "outdated API" hallucination — the model gets current docs, not training-data-vintage examples. Doc payloads are pulled per-request, not pre-loaded, which keeps the base manifest small.
  • Limitations: only earns its place on sessions where you're actively working against fast-moving libraries. The doc payloads it injects do add output tokens when called; a full library doc fetch can run several thousand tokens depending on the library section. For a stable internal codebase the value is low.
  • Best for: working with actively-maintained external libraries where the API surface changes frequently.
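
The two-step flow (resolve a library name, then fetch its docs) can be sketched as the pair of MCP `tools/call` requests a client would send. The tool names come from the description above; the argument names (`libraryName`, `context7CompatibleLibraryID`, `topic`, `tokens`) and the library ID are assumptions for illustration, and in a real session the ID would come from the first call's response.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: map a human-readable library name to a Context7 ID.
resolve = tool_call("resolve-library-id", {"libraryName": "next.js"})

# Step 2: fetch docs for that ID. The ID below is a hypothetical placeholder.
docs = tool_call(
    "get-library-docs",
    {
        "context7CompatibleLibraryID": "/vercel/next.js",
        "topic": "routing",   # assumed optional filter
        "tokens": 3000,       # assumed cap on the returned payload
    },
)

print(json.dumps([resolve, docs], indent=2))
```

Capping the payload size, if the server supports it, is the main lever for keeping the per-call token cost described above under control.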

3. Playwright — a real browser inside Claude Code

Playwright is the top choice for front-end and end-to-end testing from inside Claude Code — it gives the agent genuine browser control to navigate, click, assert, and verify its own UI changes. The official server from Microsoft (@playwright/mcp) drives a real browser via accessibility snapshots, which means the agent can verify that a component actually renders, not just that the code compiles.
  • What it is: Microsoft's official Playwright MCP server, exposing 25–40+ browser control tools (navigation, forms, network inspection, snapshots, and more).
  • Strengths: genuine browser automation; accessibility-snapshot mode avoids the token cost of raw screenshots; persistent sessions preserve login state between runs; supports Chrome, Firefox, WebKit, and Edge.
  • Limitations: the tool manifest is large (25–40+ tools), contributing meaningful per-turn overhead. Microsoft has also released @playwright/cli as a companion that performs the same operations via shell commands at roughly 4× fewer tokens than the MCP path; if you're only doing test-assertion work, the CLI route may be cheaper. The server is also heavier to set up than most: it requires a browser binary and benefits from a kept-alive session.
  • Best for: front-end developers and anyone doing E2E test work who needs the agent to verify visual and interactive behaviour, not just code correctness.

4. Tokenade — the server that reduces tokens instead of adding them

Tokenade is the only server on this list that actively shrinks what Claude Code spends per turn, by adding semantic code search, structured output compaction, skeleton-first file reads, and lazy tool loading — instead of adding to the manifest tax. Where every other server here makes the overhead problem slightly worse, Tokenade addresses it directly. The mechanism: instead of loading all MCP tool definitions into the manifest on every turn, Tokenade exposes a compact search interface — the agent asks for a tool by intent, and only the matching definition is loaded. Compound that with output compaction (noisy build logs, git output, and CLI dumps are filtered down to the signal before entering context) and structure-first file reads (signatures and exports rather than whole-file bodies), and the stacked effect across a real session is substantial.
  • What it is: a single binary that installs into Claude Code in one command and exposes a suite of token-saving MCP tools (semantic code search, output filtering, skeleton compression, and lazy tool loading), plus a token-savings dashboard so you can see the actual numbers. Freemium: free up to ~20M tokens (no card required), then Pro at €9.90/month including tax (France) or $9.90/month excluding tax (US).
  • Strengths: stacks with the other servers on this list — it makes a multi-server setup more affordable by trimming the overhead they add. Zero per-prompt effort; the savings are automatic. The dashboard makes the impact visible rather than theoretical.
  • Limitations: Tokenade doesn't add an external capability (no GitHub integration, no browser, no docs retrieval). It optimises what the agent already does; if your sessions are short and your bill isn't a concern yet, the payoff is smaller. Like all output-compaction tools, it trades some verbosity for efficiency — if you need a raw unfiltered transcript for debugging, you can bypass it with tokenade raw <cmd> for a one-shot escape.
  • Best for: any Claude Code user whose token bill or rate limits are starting to hurt, especially after adding multiple MCP servers. The value scales with session length and MCP surface area.
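
The lazy-loading idea described above (search by intent, load only the matching definition) can be illustrated with a toy registry. This is not Tokenade's implementation: the tool names, the token costs, and the naive word-overlap ranking are all hypothetical stand-ins chosen to show the shape of the technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    name: str
    description: str
    schema_tokens: int  # hypothetical cost of sending the full definition


REGISTRY = [
    ToolDef("create_pull_request", "open a pull request on a repository", 600),
    ToolDef("browser_click", "click an element in the browser page", 450),
    ToolDef("fetch", "fetch a url and return markdown", 300),
]


def search_tools(intent: str, registry=REGISTRY, limit: int = 1) -> list[ToolDef]:
    """Rank tools by word overlap between the intent and each tool's name and description."""
    words = set(intent.lower().split())
    scored = []
    for tool in registry:
        vocabulary = set((tool.name.replace("_", " ") + " " + tool.description).lower().split())
        score = len(words & vocabulary)
        if score > 0:
            scored.append((score, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


# Eager loading sends every definition on every turn.
eager_cost = sum(tool.schema_tokens for tool in REGISTRY)

# Lazy loading sends only what matches the current intent.
loaded = search_tools("open a pull request")
lazy_cost = sum(tool.schema_tokens for tool in loaded)

print(f"eager: {eager_cost} tokens, lazy: {lazy_cost} tokens")
```

A production version would presumably rank with embeddings rather than word overlap, but the saving comes from the same place: definitions that are not needed this turn never enter the context.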

5. Fetch — clean web content without the HTML noise

Fetch is the simplest way to pull readable web content into Claude Code — it converts a page to markdown instead of raw HTML, making reference lookups cheap and noise-free. The official server from Anthropic's MCP project is a single-tool server: it fetches a URL, strips HTML structure, and returns clean markdown the model can read without wading through tags and scripts.
  • What it is: the official mcp-server-fetch (from modelcontextprotocol/servers), a Python-based single-tool server that converts web pages to markdown on request.
  • Strengths: minimal manifest (one tool, low token overhead); markdown output is far cheaper than raw HTML for the same content; chunked retrieval via start_index means it can page through long documents rather than dumping everything at once. Respects robots.txt by default.
  • Limitations: single-page fetch only — it's a lookup tool, not a crawler. The content it returns still costs tokens proportional to the page length; a dense documentation page can be several thousand tokens even as markdown. A few sites actively block simple fetch requests.
  • Best for: quick one-off reference lookups, pulling a changelog or README into context, fetching API docs that aren't indexed by Context7.
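
The chunked retrieval mentioned under Strengths works by paging with `start_index`. The sketch below simulates that paging locally on a string, with no network access, so the helper functions are illustrative rather than part of the server. It assumes the tool's `start_index` and `max_length` parameters count characters.

```python
from typing import Optional


def fetch_chunk(document: str, start_index: int = 0, max_length: int = 5000) -> tuple[str, Optional[int]]:
    """Return one chunk plus the next start_index, or None when the document is exhausted."""
    chunk = document[start_index:start_index + max_length]
    next_index = start_index + len(chunk)
    return chunk, (next_index if next_index < len(document) else None)


def read_all(document: str, max_length: int = 5000) -> list[str]:
    """Page through a document chunk by chunk, as an agent would with repeated calls."""
    chunks: list[str] = []
    index: Optional[int] = 0
    while index is not None:
        chunk, index = fetch_chunk(document, index, max_length)
        chunks.append(chunk)
    return chunks


page = "x" * 12_000  # stand-in for a long page already converted to markdown
print([len(chunk) for chunk in read_all(page)])
```

In practice the agent rarely needs `read_all`: the benefit of paging is that it can stop after the first chunk that answers the question, paying only for what it reads.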

6. Sequential Thinking — a scratchpad for hard problems

Sequential Thinking earns a place when the task is genuinely complex and multi-step — it gives Claude Code a structured scratchpad to plan, revise, and branch its reasoning before committing to an action. The official server from the MCP project exposes a single sequential_thinking tool that lets the model work through a problem in explicit, revisable steps.
  • What it is: the official sequentialthinking server from modelcontextprotocol/servers, providing structured step-by-step reasoning as an MCP tool.
  • Strengths: improves correctness on complex, branching problems where a flat answer is likely to miss a case — architecture decisions, debugging non-obvious failures, multi-constraint planning.
  • Limitations: the tool adds tokens via the reasoning steps themselves; on simple tasks it's overkill and will inflate the session for no quality gain. It also doesn't do anything Claude can't do with good prompting — it's a structural nudge, not a qualitative upgrade. Use it selectively.
  • Best for: complex, multi-step planning tasks — system design, subtle debugging, refactoring decisions with many constraints. Leave it off for routine coding work.
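
The "explicit, revisable steps" can be pictured as a list of thought records, where a later step may overwrite an earlier one. The field names below follow the parameters the server is understood to accept (treat them as assumptions and check the server's README); the `current_plan` helper is purely illustrative and is not something the server provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thought:
    thought: str
    thoughtNumber: int
    totalThoughts: int
    nextThoughtNeeded: bool
    isRevision: bool = False
    revisesThought: Optional[int] = None


def current_plan(history: list[Thought]) -> dict[int, str]:
    """Collapse a thought history into the latest text per step, applying revisions in order."""
    plan: dict[int, str] = {}
    for step in history:
        revising = step.isRevision and step.revisesThought is not None
        target = step.revisesThought if revising else step.thoughtNumber
        plan[target] = step.thought
    return plan


history = [
    Thought("Reproduce the failing request", 1, 3, True),
    Thought("Suspect a stale cache entry", 2, 3, True),
    Thought("Cache is fine; suspect a race in the worker", 3, 3, False,
            isRevision=True, revisesThought=2),
]

for number, text in sorted(current_plan(history).items()):
    print(number, text)
```

Every record in `history` is a tool call with its own input and output tokens, which is the cost described under Limitations.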

At a glance

Server              | Usefulness  | Token cost      | Setup  | Best for
--------------------|-------------|-----------------|--------|------------------------------
GitHub              | High        | Adds (large)    | Medium | PR/issue/code-review work
Context7            | Situational | Adds (per call) | Easy   | Fast-moving library docs
Playwright          | Situational | Adds (large)    | Harder | E2E testing + UI verification
Tokenade            | High        | Reduces         | Easy   | Cutting the MCP overhead bill
Fetch               | Handy       | Low             | Easy   | One-off web lookups
Sequential Thinking | Situational | Adds (per use)  | Easy   | Complex multi-step planning

How to choose

Start with at most two or three servers that match your actual workflow: the ones you'll use in most sessions, not the ones you might need occasionally. The manifest overhead is paid on every turn whether the tools fire or not, so a server you use once a week costs you tokens every day.

A practical starting point for most solo developers: add GitHub if you review PRs in Claude Code, add Context7 if you're working against libraries that change often, and add Playwright if you're doing front-end or E2E work. Pick up Fetch for reference lookups; it's cheap enough to leave on. Hold off on Sequential Thinking unless you have a specific hard-reasoning use case in mind.

Then, because each server you add inflates your per-turn cost, add the layer that pushes it back down. That's what Tokenade does: it intercepts the manifest on every turn and loads tools lazily, compacts noisy output before it enters context, and lets the agent search the codebase by meaning rather than by dumping files. The result is that a four- or five-server setup costs roughly what a one-server setup did before. See "How to reduce AI coding agent token usage" for a full breakdown of every lever, or "Reduce Claude Code token usage" for the Claude Code-specific version.