Pipeworx vs Exa

Structured primary-source records vs neural web search for AI

Pipeworx is for

structured, citable records from 1,352 authoritative sources — the filing itself, not pages about it.

Exa is for

neural web search built for AI — finding and extracting from web pages at scale, plus Websets for structured web research.

Exa is the strongest of the search APIs built for AI: neural retrieval over the web, content extraction, and Websets for turning web research into structured tables. Pipeworx starts one layer deeper. Instead of searching pages ABOUT a thing, it returns the authoritative record OF the thing (the SEC filing, the FRED observation, the FDA entry) as structured JSON with provenance metadata and a stable pipeworx:// citation. Web search is the right tool when the answer lives on arbitrary web pages; a data gateway is the right tool when the answer lives in a known authoritative system. Most capable agents need both, and the failure mode Pipeworx exists to prevent is reaching for web search when a primary source exists.

Side-by-side

	Pipeworx	Exa
Unit of retrieval	The authoritative record — structured JSON from the source system	Web pages/passages — neural search + extraction
Provenance	_meta.source + fetched_at + stable pipeworx:// citation URI on every response	Result URLs; no provenance/citation scheme
Grounded answering	ask_pipeworx_grounded — extractive answer + verbatim evidence + explicit refusal when the data doesn't say	Returns search results; answering is your model's job
Structured output	Native — every tool returns typed JSON with outputSchema	Websets builds structured tables from web research
Monitoring	Data-event subscriptions (8-Ks, FRED, markets, patents, trials) with push delivery	Not an event-subscription product
Interface	MCP gateway (one URL, NL router) + REST	API + MCP servers for search/Websets
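
A client can treat the Provenance row as a contract and check it before trusting a record. The following is a minimal sketch, assuming a response is a JSON object whose _meta object carries source, fetched_at, and a citation field holding the pipeworx:// URI. The citation key name, the placement of fetched_at under _meta, the missing_provenance helper, and the sample values are all assumptions made for illustration, not documented Pipeworx output.

```python
from typing import Any

# "source" and "fetched_at" come from the comparison table; the "citation"
# key name is an assumption made for this sketch.
REQUIRED_META_KEYS = ("source", "fetched_at", "citation")


def missing_provenance(response: dict[str, Any]) -> list[str]:
    """Return the provenance problems found in a response (empty list means OK)."""
    meta = response.get("_meta")
    if not isinstance(meta, dict):
        return ["_meta"]
    # Any required key that is absent or empty counts as a problem.
    problems = [key for key in REQUIRED_META_KEYS if not meta.get(key)]
    citation = meta.get("citation")
    if citation and not str(citation).startswith("pipeworx://"):
        problems.append("citation is not a pipeworx:// URI")
    return problems


# Hypothetical response shape, for illustration only.
sample = {
    "data": {"series": "EXAMPLE", "value": 1.0},
    "_meta": {
        "source": "example-source",
        "fetched_at": "2025-01-01T00:00:00Z",
        "citation": "pipeworx://example/record",
    },
}

print(missing_provenance(sample))        # []
print(missing_provenance({"data": {}}))  # ['_meta']
```

An agent could refuse to cite any record for which this check returns a non-empty list.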

When to use which

Use Exa if

The answer lives on arbitrary web pages — news commentary, company sites, long-tail content
You need web-scale retrieval and extraction as a building block
You're assembling research tables from open-web sources (Websets)

Use Pipeworx if

The answer lives in a known authoritative system — filings, statistics, registries, markets
You need citations that resolve to the primary record, not to a page that mentions it
You want grounded extraction with refusal semantics instead of raw results
You want to subscribe to data events rather than re-searching

Connect Pipeworx in one line

Add this to your MCP client (Claude Desktop, Cursor, VS Code, Claude Code, etc.) — no API keys required for public data sources.

{
  "mcpServers": {
    "pipeworx": {
      "url": "https://gateway.pipeworx.io/mcp"
    }
  }
}
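
If the client already has other MCP servers configured, the entry needs to be merged in rather than pasted over the file. Here is a minimal Python sketch, assuming the client reads a JSON file with a top-level mcpServers object as in the snippet above. The add_pipeworx helper and the "other" server entry are hypothetical, and because the config file location varies by client, reading and writing the file is left to the caller.

```python
import json

# The server entry from the snippet above.
PIPEWORX_ENTRY = {"url": "https://gateway.pipeworx.io/mcp"}


def add_pipeworx(config_text: str) -> str:
    """Return the config JSON with a "pipeworx" server added, keeping existing servers."""
    # An empty file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["pipeworx"] = dict(PIPEWORX_ENTRY)
    return json.dumps(config, indent=2)


# Hypothetical existing config with one unrelated server.
existing = '{"mcpServers": {"other": {"url": "https://example.com/mcp"}}}'
print(add_pipeworx(existing))
```

The printed result keeps the "other" entry and adds "pipeworx" next to it.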

Common questions

When should an agent use Exa vs Pipeworx?

Decision rule: if a primary source exists for the question (a filing, a statistical series, a registry), use Pipeworx. You get the record itself, structured, with a verifiable citation. If the question is about open-web content (what people are saying, niche pages, long-tail research), use a web-search API like Exa. Agents that route this way quote figures from the record rather than from a paraphrase, so they hallucinate less and cite better.
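
The decision rule can be sketched as a small router. This is an illustration only: the keyword list, the Route names, and the route function are hypothetical, and a real agent would more likely let the model or the gateway's natural-language router make the call than match keywords.

```python
from enum import Enum


class Route(Enum):
    PRIMARY_SOURCE = "pipeworx"  # a known authoritative system holds the answer
    WEB_SEARCH = "exa"           # the answer lives on arbitrary web pages


# Hypothetical hints that a primary source exists (filings, series, registries).
PRIMARY_SOURCE_HINTS = (
    "10-k", "8-k", "filing", "xbrl", "fred", "registry", "patent", "clinical trial",
)


def route(question: str) -> Route:
    """Prefer the primary source whenever the question points at one."""
    q = question.lower()
    if any(hint in q for hint in PRIMARY_SOURCE_HINTS):
        return Route.PRIMARY_SOURCE
    return Route.WEB_SEARCH


print(route("What revenue did Apple report in its FY2024 10-K?"))  # Route.PRIMARY_SOURCE
print(route("What are people saying about the new iPhone?"))       # Route.WEB_SEARCH
```

The primary-source check runs first so that web search is only the fallback, which mirrors the order of the rule.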

Isn't web search strictly more general?

More general, less authoritative. A web search for "Apple FY2024 revenue" returns pages that paraphrase the 10-K, and those paraphrases may be outdated or wrong. Pipeworx returns the XBRL figure from the filing with a pipeworx:// citation to it. Generality is a feature for discovery; provenance is the feature for answers an agent will be held to.