@pipeworx/semantic-scholar

Connect: https://gateway.pipeworx.io/semantic-scholar/mcp

Tools: 3

Semantic Scholar (Allen Institute for AI) is a free academic search engine indexing 200M+ papers across all disciplines. It tracks citations, references, and author profiles, and computes derived metrics (h-index, influential citations). Its coverage is significantly broader than PubMed's and more current than that of many discipline-specific databases.

Why this matters for AI agents

If your agent is researching anything academic — survey papers, citation networks, who’s working on what — Semantic Scholar is the canonical first stop. It handles cross-discipline queries gracefully and resolves DOIs, arXiv IDs, and PubMed IDs to the same paper record.

Common flows:

1. Search. “Recent papers on transformer architectures.” → search_papers({query: "transformer architectures"}) → top matches with titles, abstracts, citation counts.

2. Specific paper. “Tell me about this paper.” → get_paper({paper_id}) where paper_id can be Semantic Scholar ID, DOI, arXiv ID, PubMed ID, or URL:<url>.

3. Author profile. “What’s Yann LeCun working on?” → search_papers to find the author's Semantic Scholar author ID, then get_author for the profile and recent publications.

Citable URI: pipeworx://semantic-scholar/paper/{paper_id} for stable references in agent output.
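A minimal sketch of building the citable URI from a paper ID, so agent output references papers consistently. The helper names citable_uri and cite are hypothetical, and the sketch assumes the ID is substituted into the template verbatim; the source does not say whether the gateway expects any escaping.

```python
def citable_uri(paper_id: str) -> str:
    """Fill the pipeworx://semantic-scholar/paper/{paper_id} template.

    Assumption: the ID is inserted as-is (no percent-encoding).
    """
    paper_id = paper_id.strip()
    if not paper_id:
        raise ValueError("paper_id must be non-empty")
    return f"pipeworx://semantic-scholar/paper/{paper_id}"


def cite(title: str, paper_id: str) -> str:
    """Format a one-line reference: the title followed by the citable URI."""
    return f"{title} <{citable_uri(paper_id)}>"


if __name__ == "__main__":
    print(cite("Attention Is All You Need", "arXiv:1706.03762"))
```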

Auth

Free public API; rate-limited to ~1 request/second on the unauthenticated tier. For higher rates, request a free API key from https://www.semanticscholar.org/product/api. Pipeworx supports _apiKey passthrough.
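One way to stay under the unauthenticated limit is to space calls client-side. The sketch below is an illustration, not part of Pipeworx: the Throttle class is hypothetical, and the 1.0-second interval is taken from the approximate figure above, so a real agent may want a larger margin.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the class can be tested
        self._sleep = sleep
        self._last = None     # time at which the previous call was allowed

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    throttle = Throttle(min_interval=1.0)
    for query in ["transformer architectures", "citation networks"]:
        throttle.wait()
        # Issue the tool call here, e.g. search_papers({query: ...}).
        print("ready to send:", query)
```

Calling throttle.wait() before every tool call keeps the request spacing in one place, which makes it easy to relax once an API key raises the limit.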

ID resolution

get_paper accepts multiple identifier formats:

Identifier                  Example
Semantic Scholar paper ID   649d8c8ce4f4b6c7e2a5e6d8...
DOI                         10.48550/arXiv.1706.03762
arXiv ID                    arXiv:1706.03762
PubMed ID                   PMID:12345678
URL                         URL:https://...

Most agent flows pass DOIs.
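The identifier formats above can be produced by a small helper that adds the right prefix for each kind. This is a sketch: format_paper_id and its kind labels ("s2", "doi", "arxiv", "pmid", "url") are hypothetical names, and the prefixes are copied from the examples listed above rather than from a verified specification.

```python
# Prefix per identifier kind, transcribed from the examples in this section.
_PREFIXES = {
    "s2": "",         # Semantic Scholar ID is passed bare
    "doi": "",        # DOI is passed bare
    "arxiv": "arXiv:",
    "pmid": "PMID:",
    "url": "URL:",
}


def format_paper_id(kind: str, value: str) -> str:
    """Return value formatted as a paper_id string for get_paper."""
    try:
        prefix = _PREFIXES[kind.lower()]
    except KeyError:
        raise ValueError(f"unknown identifier kind: {kind!r}") from None
    value = value.strip()
    if not value:
        raise ValueError("identifier must be non-empty")
    # Avoid double-prefixing input that already carries the prefix.
    if prefix and value.lower().startswith(prefix.lower()):
        value = value[len(prefix):]
    return prefix + value


if __name__ == "__main__":
    print(format_paper_id("arxiv", "1706.03762"))
    print(format_paper_id("pmid", "12345678"))
```

Requiring an explicit kind avoids guessing: a bare number could be a PubMed ID or something else, so the caller states which it is.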

Common pitfalls

  • Citation counts vary by source. Semantic Scholar’s citation count differs from Google Scholar's and Web of Science's because each service indexes papers differently. The “right” number depends on your audience.
  • Influential citations. Semantic Scholar tags some citations as “influential” (the cited paper was actually used, vs. just listed). For “what really mattered to this paper” questions, filter to influential.
  • Open-access preferences. Most Semantic Scholar records have an openAccessPdf field that links to a free version of the paper. Use it in citations to avoid sending users behind paywalls.
  • Pre-prints vs published. arXiv pre-prints get separate records from their eventual published versions. Some have linked-version metadata; many don’t. For pre-2020 papers this is rarely an issue; for newer work in CS/biology it shows up often.
  • Lag. Newly published papers appear within a few weeks. Citations to those papers take longer to accumulate, because the citing papers must themselves be indexed.
  • Some fields are sparse. Older papers, smaller-venue papers, and non-English papers have less metadata. tldr.text (auto-generated summaries) only exists for some.
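Because fields such as openAccessPdf and tldr.text can be missing, agent code should read them defensively. The sketch below assumes a record is a plain dict, that openAccessPdf is an object with a url key, and that abstract is the fallback when no TL;DR exists; best_link and short_summary are hypothetical helper names.

```python
from typing import Any, Optional


def best_link(record: dict[str, Any]) -> Optional[str]:
    """Return the open-access PDF URL if the record has one, else None."""
    pdf = record.get("openAccessPdf") or {}   # field may be absent or null
    return pdf.get("url") or None


def short_summary(record: dict[str, Any], max_chars: int = 200) -> str:
    """Prefer tldr.text, fall back to the abstract, else an empty string."""
    tldr = record.get("tldr") or {}
    text = tldr.get("text") or record.get("abstract") or ""
    text = " ".join(text.split())             # collapse whitespace and newlines
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."


if __name__ == "__main__":
    sparse = {"title": "An older paper", "tldr": None}
    print(best_link(sparse), repr(short_summary(sparse)))
```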

Tools

  • search_papers — Search academic papers across all fields of science. Returns title, abstract, TL;DR summary, authors, citation count, year, journal, and open access PDF links. Example: search_papers(“transformer atte
  • get_paper — Get full details for an academic paper by its Semantic Scholar paper ID, DOI, ArXiv ID, or other identifier. Returns title, abstract, TL;DR, authors, citations, references, open access link, and journ
  • get_author — Get an academic author profile by Semantic Scholar author ID. Returns name, affiliations, h-index, total citations, paper count, and recent publications. Use search_papers first to find author IDs.

Regenerated from source · build May 9, 2026