Context Tax

The single biggest agent design problem: every tool definition the model sees costs tokens, not when it is called but simply by being available. The GitHub MCP server loads ~50K tokens of definitions before the user types a word. Connect three or four full-spec MCP servers and you’ve burned 30–40% of the model’s context window before any work begins.

This is the context tax, and it caps how many MCPs an agent can usefully carry.
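To make the arithmetic concrete, here is a minimal sketch that estimates the share of a context window consumed by tool definitions alone. The ~50K figure comes from the text above; the 500K-token window and the function name are assumptions for illustration, so substitute measurements from your own servers and model.

```typescript
// Rough context-tax estimate: share of the window spent on tool definitions.
// All numbers passed in are illustrative; measure your own servers.
function contextTaxFraction(toolDefTokens: number[], windowTokens: number): number {
  if (windowTokens <= 0) {
    throw new RangeError("windowTokens must be positive");
  }
  const total = toolDefTokens.reduce((sum, t) => sum + t, 0);
  return total / windowTokens;
}

// Three hypothetical servers at ~50K tokens each, in an assumed 500K-token window.
const tax = contextTaxFraction([50000, 50000, 50000], 500000);
console.log(`${(tax * 100).toFixed(0)}% of the window is spent before the first message`);
```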

The naïve approach: fewer tools

The obvious response — “use fewer MCPs” — is the wrong tradeoff. You give up capability to save tokens. The agent that could have answered your question can’t, because the relevant tool wasn’t loaded.

How Pipeworx avoids the tax

A single Pipeworx connection wraps 1,200+ tools across 350+ packs, but the agent never carries all of them. Three mechanisms:

1. Pre-filter at the gateway URL

gateway.pipeworx.io/mcp?task=housing+market

The ?task= parameter does semantic search over the catalog at connect time. The agent’s tools/list returns only the ~20 most relevant tools. Cuts cold-start context by ~95%.
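As a small sketch, a client could build the task-scoped URL like this rather than concatenating strings by hand. The `https://` scheme and the helper name `gatewayUrl` are assumptions; the host, path, and `?task=` parameter are taken from the text above.

```typescript
// Build a task-scoped gateway URL. The https:// scheme is an assumption;
// the host, path, and ?task= parameter come from the text above.
function gatewayUrl(task?: string): string {
  const url = new URL("https://gateway.pipeworx.io/mcp");
  if (task !== undefined && task.trim() !== "") {
    // URLSearchParams encodes spaces as "+", matching ?task=housing+market.
    url.searchParams.set("task", task.trim());
  }
  return url.toString();
}

console.log(gatewayUrl("housing market"));
// prints "https://gateway.pipeworx.io/mcp?task=housing+market"
```

Calling `gatewayUrl()` with no argument yields the default, unscoped endpoint.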

2. On-demand discovery via `discover_tools`

When the prefilter is too narrow or the task shifts mid-session, the agent calls:

discover_tools({ query: "find federal contracts for cybersecurity" })

It returns the top 20 matches with names + descriptions. The agent can then call them by name without re-listing.
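On the wire, an MCP tool call is a JSON-RPC 2.0 `tools/call` request. The sketch below builds that envelope for `discover_tools` and remembers the returned names so later calls can go straight to a tool. The match shape, the helper names, and the tool `example_contract_search` are hypothetical; real names come back from the gateway.

```typescript
// JSON-RPC 2.0 envelope for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Assumed match shape: the text says results carry names and descriptions.
interface ToolMatch {
  name: string;
  description: string;
}

let nextId = 1;
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

// Remember discovered tool names so later calls can skip another listing.
const known = new Set<string>();
function remember(matches: ToolMatch[]): void {
  for (const m of matches) {
    known.add(m.name);
  }
}

const discover = toolCall("discover_tools", { query: "find federal contracts for cybersecurity" });

// Hypothetical result, for illustration only.
remember([{ name: "example_contract_search", description: "Hypothetical tool" }]);

// Call the discovered tool by name if we know it; otherwise discover first.
const followUp = known.has("example_contract_search")
  ? toolCall("example_contract_search", { keyword: "cybersecurity" })
  : discover;
console.log(JSON.stringify(followUp));
```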

3. Compound tools that collapse N calls into 1

The _intel tools bundle 3–5 calls into one:

fintech_company_deep_dive({ ticker: "AAPL" })
// → SEC filings + stock quote + income statement + CFPB complaints, in one response

Same data, in one round-trip’s worth of context rather than one per underlying call.
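One way to picture a compound tool is as a thin wrapper that runs several sub-calls and returns a single merged result. This is an illustrative sketch only, not Pipeworx's implementation: the sub-call names and payloads are hypothetical stand-ins, and a real version would likely run the calls asynchronously and in parallel.

```typescript
// Sketch of a compound tool: run several sub-calls, return one merged result.
// Sub-call names and payloads are hypothetical stand-ins.
type SubCall = (ticker: string) => unknown;

function compound(subCalls: Record<string, SubCall>, ticker: string): Record<string, unknown> {
  const merged: Record<string, unknown> = {};
  for (const section of Object.keys(subCalls)) {
    try {
      merged[section] = subCalls[section](ticker);
    } catch (err) {
      // One failing source should not sink the whole response.
      merged[section] = { error: String(err) };
    }
  }
  return merged;
}

const deepDive = compound(
  {
    filings: (t) => ({ ticker: t, count: 0 }),
    quote: (t) => ({ ticker: t, price: null }),
    income: () => {
      throw new Error("upstream timeout");
    },
  },
  "AAPL",
);
console.log(Object.keys(deepDive)); // one response with a section per sub-call
```

The agent's history then holds a single entry with one section per source, and a failed source appears as an error field instead of failing the whole call.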

What this means for your design

Default connection (gateway.pipeworx.io/mcp) loads 5 meta-tools; the agent calls discover_tools for the rest. Total cold cost: ~3K tokens.
Task-scoped (?task=housing+market) loads 5 meta-tools + ~20 housing tools. Total cold cost: ~8K tokens.
Compound calls keep transcripts short — one tool call carrying five tools’ worth of data is one entry in the agent’s history, not five.

Adjacent issues

Tool-list churn. If a gateway re-issues different tool sets per turn, the model gets confused about what’s available. Pipeworx keeps tool-lists stable within a session.
Lost in the middle. Past ~100K tokens, model performance drops sharply on retrieval. Pipeworx’s _meta.cache.fresh_until lets agents detect when they’re re-reading their own cached data.
Output schema reduces test calls. Every tool’s outputSchema tells the agent the response shape before calling, so it never needs a probe call just to learn what’s coming back.
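As a hedged sketch of how an agent might use `_meta.cache.fresh_until`, the helper below checks whether a previously fetched result is still within its freshness window. It assumes the field is an ISO 8601 timestamp string, which the text above does not specify; the interface and function name are hypothetical.

```typescript
// Assumed shape: _meta.cache.fresh_until is an ISO 8601 timestamp string.
interface CachedResult {
  _meta?: { cache?: { fresh_until?: string } };
}

function isStillFresh(result: CachedResult, now: Date = new Date()): boolean {
  const raw = result._meta?.cache?.fresh_until;
  if (raw === undefined) {
    return false; // no cache metadata: treat as stale
  }
  const until = Date.parse(raw);
  // An unparseable timestamp is also treated as stale.
  return !Number.isNaN(until) && now.getTime() < until;
}

const sample: CachedResult = { _meta: { cache: { fresh_until: "2030-01-01T00:00:00Z" } } };
console.log(isStillFresh(sample, new Date("2029-06-01T00:00:00Z")));
```

Treating missing or malformed metadata as stale errs toward re-fetching, which costs tokens but avoids acting on data of unknown age.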

Read more

Meta-tools: how ask_pipeworx, discover_tools, resolve_entity, and compare_entities work
Resources: citing entities by URI without re-fetching
Prompts: server-side workflow templates
