mcp-context-budget
Measure and enforce MCP tool-surface budgets before your coding agent starts.
Open source (MIT) · Local-first · Dependency-free core · CI-enforceable.
The problem
MCP servers quietly inflate an agent's tool surface. Many tools and thousands of tokens of schemas bloat context, slow runs, and raise cost before the agent does any real work, and there is no static way to see or cap that growth.
mcp-context-budget is a local-first CLI that scans your MCP config, measures the token cost each server and tool adds, selects a lean, task-relevant tool set, and enforces it in CI, failing the build when the surface exceeds budget. No runtime service, no proxy, and nothing leaves your machine.
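To make "measuring the tool surface" concrete, here is a minimal sketch of estimating schema cost from a tools/list style fixture. It is an illustration only: the 4-characters-per-token heuristic, the function names, and the sample fixture are assumptions, not the estimator or data that mcp-context-budget actually uses.

```python
import json
import math

# Assumption: roughly 4 characters per token. This is a common rule of thumb,
# not necessarily the estimator the real CLI applies.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tool_schema_tokens(tool: dict) -> int:
    """Estimate the cost of one tool entry (name, description, input schema)."""
    # Canonical serialization keeps the estimate stable across runs.
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return estimate_tokens(canonical)


def surface_report(fixture: dict) -> dict:
    """Map each tool name to its estimated cost and add a total."""
    per_tool = {t["name"]: tool_schema_tokens(t) for t in fixture["tools"]}
    return {"per_tool": per_tool, "total": sum(per_tool.values())}


# Hypothetical fixture shaped like an MCP tools/list result.
FIXTURE = {
    "tools": [
        {
            "name": "read_file",
            "description": "Read a file from disk",
            "inputSchema": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
        {
            "name": "send_email",
            "description": "Send an email message",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "to": {"type": "string"},
                    "subject": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["to", "body"],
            },
        },
    ]
}

report = surface_report(FIXTURE)
print(json.dumps(report, indent=2))
```

A character-based estimate is deterministic and needs only the standard library, which fits a dependency-free core; a real tokenizer would give tighter numbers at the cost of an extra dependency.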
Quickstart
The package is not on PyPI; install from source or run the Docker image. The core CLI has no external runtime dependencies.
Install from source (requires Python 3.11+):
git clone https://github.com/OrionArchitekton/mcp-context-budget
cd mcp-context-budget
python3.11 -m venv .venv && . .venv/bin/activate
pip install -e '.[dev]'
Command surface
- scan: Estimate schema and response-token cost from an MCP config or tools/list fixture; emit a report and a lockfile.
- select: Pick a smaller task-relevant tool set with deterministic SQLite FTS5/BM25 ranking, under max-tools and max-schema-tokens.
- semantic-select: Rank tools by embedding similarity (deterministic fixture mode, or optional local Ollama) before applying the budget caps.
- check: Re-validate a lockfile against schema and response budgets — the CI gate that fails the build on a regression.
- compress-responses: Deterministically compress recorded response fixtures under a response budget, with before/after proof.
- config-apply: Turn a selected-tool lock into a safe local MCP config patch. Dry-run by default; writing requires --write and makes a backup.
- config-audit: Read-only hygiene check that flags plaintext secrets in MCP config files without ever printing the values.
- export: Export the budget result (e.g. SARIF) for code-scanning and CI surfacing.
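The idea behind select (BM25 ranking followed by budget caps) can be sketched with Python's built-in sqlite3 module. This is a hedged illustration, not the project's implementation: the function names, the tool records, the per-tool `schema_tokens` values, and the greedy "skip what does not fit" policy are all assumptions, and it needs an SQLite build with FTS5 enabled.

```python
import re
import sqlite3


def rank_tools(tools: list[dict], task: str) -> list[str]:
    """Return names of tools matching the task, best BM25 score first."""
    terms = re.findall(r"[a-z0-9]+", task.lower())
    if not terms:
        return []
    # Quote each term so it is treated as a literal, then OR them together.
    query = " OR ".join(f'"{term}"' for term in terms)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE tools USING fts5(name, description)")
        conn.executemany(
            "INSERT INTO tools (name, description) VALUES (?, ?)",
            [(t["name"], t["description"]) for t in tools],
        )
        # bm25() returns lower values for better matches; the name column
        # breaks ties so the order is deterministic.
        rows = conn.execute(
            "SELECT name FROM tools WHERE tools MATCH ? "
            "ORDER BY bm25(tools), name",
            (query,),
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def select_tools(tools, task, max_tools, max_schema_tokens):
    """Greedily keep ranked tools while both caps hold (assumed policy)."""
    cost = {t["name"]: t["schema_tokens"] for t in tools}
    selected, used = [], 0
    for name in rank_tools(tools, task):
        if len(selected) >= max_tools:
            break
        if used + cost[name] > max_schema_tokens:
            continue  # this tool would break the token cap, try the next one
        selected.append(name)
        used += cost[name]
    return selected


# Hypothetical tool records; schema_tokens would come from a prior scan.
TOOLS = [
    {"name": "read_file", "description": "Read a file from disk", "schema_tokens": 120},
    {"name": "write_file", "description": "Write a file to disk", "schema_tokens": 150},
    {"name": "send_email", "description": "Send an email message", "schema_tokens": 300},
    {"name": "list_directory", "description": "List entries in a directory", "schema_tokens": 90},
]

print(select_tools(TOOLS, "read file", max_tools=5, max_schema_tokens=200))
```

Ranking inside an in-memory SQLite table keeps the result reproducible for the same inputs, which is what makes a selection safe to pin in a lockfile.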
Why it is different
- Local-first: No runtime service, no proxy, no hosted dashboard. The CLI runs on your machine and nothing leaves it. Semantic ranking can optionally call a local Ollama — only when you explicitly ask for it.
- Dependency-free core: The core package ships with zero Python dependencies. It scans, measures, selects, and enforces using the standard library — easy to vendor, audit, and trust in a build pipeline.
- CI-enforceable: A lockfile plus a check command turns "our context is bloated" into a build gate. The check exits non-zero when the schema or response budget regresses, so the surface stops growing silently.
- Honest, never a false PASS: config-apply binds each lock to its config by fingerprint and reports PARTIAL (not a fake PASS) when a command-discovered server cannot be statically enforced. Secret audits redact values; reports never print literal secrets.
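The CI-enforceable point above boils down to comparing recorded numbers against a budget and returning a non-zero exit code on regression. The sketch below shows that shape under stated assumptions: the lock and budget dictionaries, their key names, and the function names are hypothetical and do not reflect the real lockfile format or the check command's internals.

```python
import sys


def find_violations(lock: dict, budget: dict) -> list[str]:
    """List every budget key whose recorded value exceeds its cap."""
    violations = []
    for key in ("schema_tokens", "response_tokens"):
        if lock[key] > budget[key]:
            violations.append(f"{key}: {lock[key]} exceeds budget {budget[key]}")
    return violations


def run_gate(lock: dict, budget: dict) -> int:
    """Print the outcome and return a process exit code (0 = pass, 1 = fail)."""
    violations = find_violations(lock, budget)
    for line in violations:
        print(f"FAIL {line}", file=sys.stderr)
    if not violations:
        print("PASS")
    return 1 if violations else 0


# Hypothetical budget and lock values for illustration.
BUDGET = {"schema_tokens": 1000, "response_tokens": 500}
LOCK = {"schema_tokens": 900, "response_tokens": 400}

exit_code = run_gate(LOCK, BUDGET)
print(f"exit code: {exit_code}")
```

In a pipeline, a wrapper script would pass the returned value to `sys.exit` so the CI step fails whenever a budget is exceeded.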