@pipeworx/semantic-scholar
Connect: https://gateway.pipeworx.io/semantic-scholar/mcp · Install: one-click buttons
Tools: 3
Semantic Scholar (Allen Institute for AI) is a free academic search engine indexing 200M+ papers across all disciplines. It tracks citations, references, and author profiles, and computes derived metrics (h-index, influential citations). Its coverage is significantly broader than PubMed's, and it is more current than many discipline-specific databases.
Why this matters for AI agents
If your agent is researching anything academic — survey papers, citation networks, who’s working on what — Semantic Scholar is the canonical first stop. It handles cross-discipline queries gracefully and resolves DOIs, arXiv IDs, and PubMed IDs to the same paper record.
Common flows:
1. Search. “Recent papers on transformer architectures.” → search_papers({query: "transformer architectures"}) → top matches with titles, abstracts, citation counts.
2. Specific paper. “Tell me about this paper.” → get_paper({paper_id}) where paper_id can be Semantic Scholar ID, DOI, arXiv ID, PubMed ID, or URL:<url>.
3. Author profile. “What’s Yann LeCun working on?” → use search_papers to find the author's ID, then call get_author with that ID for the profile and recent publications.
Citable URI: pipeworx://semantic-scholar/paper/{paper_id} for stable references in agent output.
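As a rough sketch of what the flows above look like on the wire, the snippet below builds MCP `tools/call` request bodies (JSON-RPC 2.0) and the citable URI. The helper names `tool_call` and `citable_uri` are hypothetical, the argument names `query` and `paper_id` are taken from the flows above, and the transport itself (the HTTP request to the gateway and the MCP initialize handshake) is deliberately left out because it depends on your client.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids (illustrative choice).
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP JSON-RPC 2.0 `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def citable_uri(paper_id: str) -> str:
    """Format the citable URI; assumes the id is inserted verbatim, without encoding."""
    return f"pipeworx://semantic-scholar/paper/{paper_id}"


# Flow 1: search. Flow 2: look up a specific paper by DOI.
search = tool_call("search_papers", {"query": "transformer architectures"})
lookup = tool_call("get_paper", {"paper_id": "10.48550/arXiv.1706.03762"})

print(json.dumps(search, indent=2))
print(citable_uri("10.48550/arXiv.1706.03762"))
```

Most MCP client libraries build these request bodies for you; the sketch is mainly useful for seeing which tool name and arguments each flow maps to.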
Auth
Free public API; rate-limited to ~1 request/second on the unauthenticated tier. For higher rates, request a free API key from https://www.semanticscholar.org/product/api. Pipeworx supports _apiKey passthrough.
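If your agent issues several calls in a row on the unauthenticated tier, a small client-side throttle can help it stay near the ~1 request/second limit. The following is a minimal sketch, not part of Pipeworx: the `Throttle` class is hypothetical, and the one-second interval is simply the figure quoted above.

```python
import time


class Throttle:
    """Space out calls so at least `min_interval` seconds separate them."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last = None        # time the previous call was allowed through

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


throttle = Throttle(min_interval=1.0)
throttle.wait()  # first call passes immediately; call wait() before each request
```

Call `throttle.wait()` immediately before each tool call. With an API key and a higher limit, a smaller `min_interval` may be appropriate.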
ID resolution
get_paper accepts multiple identifier formats:
| Identifier | Example |
|---|---|
| Semantic Scholar paper ID | 649d8c8ce4f4b6c7e2a5e6d8... |
| DOI | 10.48550/arXiv.1706.03762 |
| arXiv ID | arXiv:1706.03762 |
| PubMed ID | PMID:12345678 |
| URL | URL:https://... |
Most agent flows pass DOIs.
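Users rarely hand an agent an identifier in exactly the form shown in the table, so a small normalization step before calling get_paper can help. The sketch below is heuristic and hypothetical (`normalize_paper_id` is not a Pipeworx function). It assumes the prefix conventions from the table above, recognizes DOIs and bare arXiv IDs by simple regular expressions, and leaves anything it cannot classify untouched. Bare numeric strings are left as-is because they could be a PubMed ID or something else.

```python
import re

# Prefixes from the identifier table that mean "already formatted".
_KNOWN_PREFIXES = ("arxiv:", "pmid:", "url:")
_DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
_ARXIV_RE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")  # new-style arXiv ids only
_DOI_URL_RE = re.compile(r"^https?://(dx\.)?doi\.org/", re.IGNORECASE)


def normalize_paper_id(raw: str) -> str:
    """Best-effort mapping of user input to a get_paper identifier."""
    value = raw.strip()
    if value.lower().startswith(_KNOWN_PREFIXES):
        return value                      # already prefixed, pass through
    stripped = _DOI_URL_RE.sub("", value)
    if _DOI_RE.match(stripped):
        return stripped                   # bare DOI, as in the table
    if _ARXIV_RE.match(value):
        return f"arXiv:{value}"
    if value.lower().startswith(("http://", "https://")):
        return f"URL:{value}"
    return value                          # unknown shape: leave unchanged


print(normalize_paper_id("https://doi.org/10.48550/arXiv.1706.03762"))
print(normalize_paper_id("1706.03762"))
```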
Common pitfalls
- Citation counts vary by source. Semantic Scholar’s citation count differs from Google Scholar and Web of Science — different indexing methods. The “right” number depends on your audience.
- Influential citations. Semantic Scholar tags some citations as “influential” (the cited paper was actually used, vs. just listed). For “what really mattered to this paper” questions, filter to influential.
- Open-access preferences. Most Semantic Scholar records have an openAccessPdf field that links to a free version. Use it in citations to avoid sending users behind paywalls.
- Pre-prints vs published. arXiv pre-prints get separate records from their eventual published versions. Some have linked-version metadata; many don’t. For pre-2020 papers this is rarely an issue; for newer work in CS/biology it shows up often.
- Lag. Newly-published papers appear within a few weeks. Citations to those papers take longer to accumulate (citing papers must themselves be indexed).
- Some fields are sparse. Older papers, smaller-venue papers, and non-English papers have less metadata. tldr.text (auto-generated summaries) only exists for some.
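Several of the pitfalls above come down to treating every field as optional. The sketch below shows one defensive way to read a paper record. The `tldr.text` and `openAccessPdf` names come from this page; the `url` key inside `openAccessPdf` and the `abstract`, `citationCount`, and `influentialCitationCount` names are assumptions based on the Semantic Scholar Graph API, and the shape Pipeworx actually returns may differ. `summarize_paper` is a hypothetical helper.

```python
def summarize_paper(paper: dict) -> dict:
    """Pull display fields from a paper record, tolerating missing or null values."""
    # `tldr` and `openAccessPdf` may be absent or null on sparse records.
    tldr = (paper.get("tldr") or {}).get("text")
    pdf = (paper.get("openAccessPdf") or {}).get("url")
    return {
        "title": paper.get("title") or "(untitled)",
        # Prefer the auto-generated TL;DR, then the abstract, then a placeholder.
        "summary": tldr or paper.get("abstract") or "(no summary available)",
        "link": pdf,  # None when no open-access copy is listed
        "citations": paper.get("citationCount"),
        "influential_citations": paper.get("influentialCitationCount"),
    }


# A deliberately sparse, made-up record to show the fallbacks.
sparse = {"title": "Example paper", "tldr": None}
print(summarize_paper(sparse))
```

Returning `None` for a missing link or count, rather than a default such as 0, lets the agent distinguish "no data" from "zero citations".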
Tools
- search_papers — Search academic papers across all fields of science. Returns title, abstract, TL;DR summary, authors, citation count, year, journal, and open access PDF links. Example: search_papers(“transformer atte
- get_paper — Get full details for an academic paper by its Semantic Scholar paper ID, DOI, ArXiv ID, or other identifier. Returns title, abstract, TL;DR, authors, citations, references, open access link, and journ
- get_author — Get an academic author profile by Semantic Scholar author ID. Returns name, affiliations, h-index, total citations, paper count, and recent publications. Use search_papers first to find author IDs.