Pipeworx vs Exa
structured primary-source records vs neural web search for AI
Pipeworx: structured, citable records from 877 authoritative sources. You get the filing itself, not pages about it.
Exa: neural web search built for AI. It finds and extracts from web pages at scale, and adds Websets for structured web research.
Exa is the strongest of the search-APIs-for-AI: neural retrieval over the web, content extraction, and Websets for turning web research into structured tables. Pipeworx starts one layer deeper: instead of searching pages ABOUT a thing, it returns the authoritative record OF the thing — the SEC filing, the FRED observation, the FDA entry — as structured JSON with provenance metadata and a stable pipeworx:// citation. Web search is the right tool when the answer lives on arbitrary web pages; a data gateway is the right tool when the answer lives in a known authoritative system. Most capable agents need both — and the failure mode we exist to prevent is reaching for web search when a primary source exists.
Side-by-side
| | Pipeworx | Exa |
|---|---|---|
| Unit of retrieval | The authoritative record — structured JSON from the source system | Web pages/passages — neural search + extraction |
| Provenance | _meta.source + fetched_at + stable pipeworx:// citation URI on every response | Result URLs; no provenance/citation scheme |
| Grounded answering | ask_pipeworx_grounded — extractive answer + verbatim evidence + explicit refusal when the data doesn't say | Returns search results; answering is your model's job |
| Structured output | Native — every tool returns typed JSON with outputSchema | Websets builds structured tables from web research |
| Monitoring | Data-event subscriptions (8-Ks, FRED, markets, patents, trials) with push delivery | Not an event-subscription product |
| Interface | MCP gateway (one URL, NL router) + REST | API + MCP servers for search/Websets |
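To make the Provenance row concrete, here is a minimal sketch of how an agent might read provenance off a response before citing it. The table names `_meta.source`, `fetched_at`, and a `pipeworx://` citation URI, but not the exact response layout, so this sketch assumes `fetched_at` and the citation both sit under `_meta`, and the key name `citation` is hypothetical. The sample values in the usage note are placeholders, not real records.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Provenance:
    source: str      # originating system, read from _meta.source
    fetched_at: str  # retrieval timestamp, kept as the raw string
    citation: str    # stable pipeworx:// URI


def extract_provenance(response: Mapping[str, Any]) -> Provenance:
    """Return the provenance fields of a response, or raise if any is absent.

    ASSUMPTION: all three fields live under "_meta", and the citation key
    is named "citation". Adjust to the actual response layout.
    """
    meta = response.get("_meta") or {}
    missing = [key for key in ("source", "fetched_at", "citation") if not meta.get(key)]
    if missing:
        raise ValueError(f"response lacks provenance fields: {missing}")

    citation = str(meta["citation"])
    if not citation.startswith("pipeworx://"):
        raise ValueError(f"unexpected citation scheme: {citation!r}")

    return Provenance(
        source=str(meta["source"]),
        fetched_at=str(meta["fetched_at"]),
        citation=citation,
    )
```

Calling `extract_provenance({"_meta": {"source": "example-source", "fetched_at": "2025-01-01T00:00:00Z", "citation": "pipeworx://example/record"}})` returns a `Provenance` instance. A response with no `_meta` block raises `ValueError`, which lets an agent refuse to cite a record it cannot attribute.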
When to use which
Use Exa if
- The answer lives on arbitrary web pages — news commentary, company sites, long-tail content
- You need web-scale retrieval and extraction as a building block
- You're assembling research tables from open-web sources (Websets)
Use Pipeworx if
- The answer lives in a known authoritative system — filings, statistics, registries, markets
- You need citations that resolve to the primary record, not to a page that mentions it
- You want grounded extraction with refusal semantics instead of raw results
- You want to subscribe to data events rather than re-searching
Connect Pipeworx in one line
Add this to your MCP client (Claude Desktop, Cursor, VS Code, Claude Code, etc.) — no API keys required for public data sources.
{
"mcpServers": {
"pipeworx": {
"url": "https://gateway.pipeworx.io/mcp"
}
}
}
Common questions
When should an agent use Exa vs Pipeworx?
Decision rule: if a primary source exists for the question (a filing, a statistical series, a registry), use Pipeworx — you get the record itself, structured, with a verifiable citation. If the question is about open-web content — what people are saying, niche pages, long-tail research — use a web-search API like Exa. Agents that route this way hallucinate less and cite better.
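The decision rule above can be sketched as a small routing function. This is an illustration under stated assumptions, not part of either product: the `Backend` enum, the `PRIMARY_SOURCE_KINDS` labels, and `choose_backend` are hypothetical names, and classifying a question into a kind is left to the caller (for example, the agent's own planner).

```python
from enum import Enum


class Backend(Enum):
    PRIMARY_SOURCE = "pipeworx"  # authoritative record exists
    WEB_SEARCH = "exa"           # answer lives on open-web pages


# ASSUMPTION: illustrative labels for question kinds that have a primary
# source, taken from the examples in the decision rule.
PRIMARY_SOURCE_KINDS = frozenset({"filing", "statistical_series", "registry"})


def choose_backend(question_kind: str) -> Backend:
    """Prefer the primary source when one exists; otherwise use web search."""
    if question_kind.strip().lower() in PRIMARY_SOURCE_KINDS:
        return Backend.PRIMARY_SOURCE
    return Backend.WEB_SEARCH
```

With these labels, `choose_backend("filing")` selects the primary-source backend, while an unrecognized kind such as `"news_commentary"` falls through to web search. Defaulting to web search keeps the router usable for open-ended questions; the primary-source set is where you would add new kinds as more authoritative systems become relevant.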
Isn't web search strictly more general?
More general, less authoritative. A web search for "Apple FY2024 revenue" returns pages that paraphrase the 10-K, possibly outdated or wrong. Pipeworx returns the XBRL figure from the filing with a pipeworx:// citation to it. Generality is a feature for discovery; provenance is the feature for answers an agent will be held to.