From AlphaFold to FDA, in one MCP gateway

We shipped 14 life-sciences data sources to the Pipeworx gateway this week — protein structures, bioactivity, pathways, trials, and regulatory data, all reachable through one MCP endpoint.

A pharma research workflow has always crossed a lot of databases. To go from a target protein to an approved drug — or just to brief an agent on what’s known about a target — you typically need answers from a dozen different APIs: UniProt for sequence and function, AlphaFold for predicted structure, RCSB PDB for experimental structures, ChEMBL for bioactivity, Open Targets for target-disease association, ClinicalTrials.gov for active trials, OpenFDA for adverse events and labels, PubMed for primary literature.

Each has its own auth model, response schema, rate limits, and idiosyncratic identifier system. Wiring an AI agent into all of them used to mean writing a dozen integration shims.

Over the last 48 hours we shipped 14 life-sciences packs to the Pipeworx gateway. Combined with what was already there (OpenFDA, ClinicalTrials.gov, RxNorm, PubMed), the gateway now spans the full target → drug → trial → regulatory → literature arc through a single MCP endpoint.

What landed

Pack | Source | What it answers
uniprot | UniProt | Protein sequence, function, GO terms, cross-references
alphafold | DeepMind AlphaFold DB | Predicted 3D structure for any UniProt entry
rcsb-pdb | RCSB Protein Data Bank | Experimental structures, ligand binding sites
chembl | EMBL-EBI ChEMBL | Bioactivity data: IC50s, Ki values, targets per compound
opentargets | Open Targets Platform | Target-disease association evidence (GWAS, literature, drugs)
reactome | Reactome | Curated biological pathways (signaling, metabolism, disease)
biorxiv | bioRxiv preprint server | Recent life-sciences preprints, structured metadata
europepmc | Europe PMC | Full-text biomedical literature (PMC + bioRxiv + others)
dailymed | NLM DailyMed | Current FDA-approved drug labels with structured indications
medlineplus | NLM MedlinePlus | Plain-language consumer health information
usda-fdc | USDA FoodData Central | Food composition, nutrient profiles
npi-registry | NPPES | National Provider Identifier registry
orcid | ORCID | Researcher identifiers and publication records
ror | ROR | Research Organization Registry

Plus the packs already in the gateway: openfda (drug events, labels, recalls, devices), clinicaltrials (trial registry), pubmed (NLM literature), rxnorm (drug naming standardization).

What the chain looks like

Here’s a query that wouldn’t have been one MCP call last week: “Brief me on SGLT2 as a drug target.”

What an agent does with the new packs available:

  1. resolve_entity: disambiguate “SGLT2” to its UniProt accession (P31639) and known synonyms.
  2. uniprot: protein function (sodium-glucose co-transporter, expressed mainly in the kidney), GO terms, domain organization.
  3. alphafold: predicted 3D structure for the full-length protein.
  4. rcsb-pdb: experimental structures, including inhibitor-bound structures.
  5. chembl: bioactivity data for compounds targeting SGLT2, such as IC50s and structure-activity series.
  6. opentargets: disease associations (type 2 diabetes, chronic kidney disease, heart failure) with evidence weighting.
  7. reactome: glucose transport and reabsorption pathways the protein participates in.
  8. clinicaltrials: active and completed trials targeting SGLT2.
  9. openfda: approved SGLT2 inhibitors (empagliflozin, dapagliflozin, canagliflozin) and post-market adverse event profiles.
  10. europepmc: recent literature surveying clinical outcomes and mechanism.

Ten steps: one entity resolver and nine upstream data sources. Wired by hand, that is nine different authentication models, schemas, and identifier systems. With the new packs in place, it is one MCP endpoint.
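For a sense of what the chain looks like on the wire, here is a minimal sketch that wraps each step in an MCP `tools/call` request (plain JSON-RPC 2.0). The pack names and the P31639 accession come from the steps above. Using the pack name as the tool name, and every argument key shown, are placeholders we made up for illustration; the tool names and input schemas a server actually exposes come back from a `tools/list` call.

```python
import json
from itertools import count

# (tool name, arguments) for each step of the SGLT2 chain.
# Pack names and the P31639 accession come from the post; using the pack name
# as the tool name and every argument key here are hypothetical placeholders.
CHAIN = [
    ("resolve_entity", {"query": "SGLT2"}),
    ("uniprot",        {"accession": "P31639"}),
    ("alphafold",      {"accession": "P31639"}),
    ("rcsb-pdb",       {"accession": "P31639"}),
    ("chembl",         {"target": "SGLT2"}),
    ("opentargets",    {"target": "SGLT2"}),
    ("reactome",       {"accession": "P31639"}),
    ("clinicaltrials", {"query": "SGLT2"}),
    ("openfda",        {"query": "SGLT2 inhibitor"}),
    ("europepmc",      {"query": "SGLT2"}),
]


def build_requests(chain):
    """Wrap each step in a JSON-RPC 2.0 `tools/call` request with a unique id."""
    ids = count(1)
    return [
        {
            "jsonrpc": "2.0",
            "id": next(ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        for name, arguments in chain
    ]


plan = build_requests(CHAIN)
print(json.dumps(plan[1], indent=2))  # the uniprot step
```

The point of the sketch is the shape: every step is the same envelope with a different `name` and `arguments`, which is what lets an agent treat ten sources as one interface.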

Or, equivalently: the agent calls ask_pipeworx("Brief me on SGLT2 as a drug target") once, the gateway routes through its meta-tool layer, and the user gets the synthesized brief without writing the chain manually.
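As a rough sketch of that single-call path: the request below targets `ask_pipeworx` by name, and the helper pulls the text out of a standard MCP tool result. The `question` argument key is an assumption rather than a documented parameter, and the sample response is a placeholder in the generic MCP result shape, not real gateway output.

```python
def build_ask_request(question, request_id=1):
    """JSON-RPC 2.0 `tools/call` request for the ask_pipeworx meta-tool.

    The "question" argument key is an assumption, not a documented parameter.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ask_pipeworx", "arguments": {"question": question}},
    }


def extract_text(response):
    """Join the text parts of an MCP tool result; raise if the call failed."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    text = "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(text or "tool reported an error")
    return text


request = build_ask_request("Brief me on SGLT2 as a drug target")

# Placeholder response in the standard MCP result shape (not real gateway output).
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "<synthesized brief goes here>"}],
        "isError": False,
    },
}
print(extract_text(sample_response))
```

In practice an MCP client library would handle the transport and session handshake; the sketch only shows what the agent sends and what it reads back.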

Why now

We keep hearing versions of the same story from builders working on biotech/pharma agents: they hit a wall not because the agents aren’t capable, but because the data plumbing isn’t there. Every new source adds a week of integration work (auth flows, schema parsing, error handling, rate-limit accommodation), and that marginal week per source is what kills these projects.

The Pipeworx thesis is that the integration tax is the bottleneck in agent development, and that consolidating live data behind one MCP gateway is the cheapest way to dissolve it. The life-sciences vertical is where that thesis pays off most cleanly right now: the source list is well-defined, the data is mostly public, and the workflows are stable enough that a compound tool over the right packs is meaningfully more useful than the sum of its parts.

What’s coming next in this vertical:

  • A pharma_target_profile compound tool that does the fan-out above in one call
  • More structural-biology sources (BMRB for NMR, EMDB for cryo-EM)
  • Integration with the existing validate_claim meta-tool, so drug claims can be checked against DailyMed labels
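To make "fan-out in one call" concrete, here is a hedged sketch of the pattern a compound tool like pharma_target_profile could follow: query several packs concurrently and collect the results under one key per pack. This is not the planned implementation. `call_pack` is a hypothetical stand-in for a real MCP tool call, and the pack selection is our guess at what a target profile would include.

```python
import asyncio

# Packs a target profile might fan out to; names are taken from the post.
PROFILE_PACKS = ["uniprot", "alphafold", "rcsb-pdb", "chembl", "opentargets", "reactome"]


async def call_pack(pack, accession):
    """Hypothetical stand-in for one MCP tool call; a real version would hit the gateway."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"pack": pack, "accession": accession, "data": None}


async def target_profile(accession, packs=PROFILE_PACKS):
    """Query every pack concurrently and collect results keyed by pack name.

    return_exceptions=True keeps one failing source from sinking the whole profile.
    """
    results = await asyncio.gather(
        *(call_pack(pack, accession) for pack in packs),
        return_exceptions=True,
    )
    return dict(zip(packs, results))


profile = asyncio.run(target_profile("P31639"))
print(sorted(profile))
```

Running the calls concurrently means the profile takes roughly as long as the slowest source rather than the sum of all of them, which is the main reason to push the fan-out into the gateway instead of the agent loop.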

If you’re building agents that need any of this, the gateway is at gateway.pipeworx.io/mcp. Free anonymous tier, no signup. The new packs are reachable directly (e.g. gateway.pipeworx.io/alphafold/mcp) or through ask_pipeworx for natural-language routing across the full catalog.
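A small sketch of the two addressing modes, assuming `https://` as the scheme and assuming the per-pack URL pattern shown for alphafold holds for the other packs. It also builds a `tools/list` request, the standard MCP method for discovering which tools an endpoint exposes.

```python
GATEWAY = "https://gateway.pipeworx.io"  # https scheme is an assumption


def endpoint(pack=None):
    """Return the catalog-wide MCP URL, or a single pack's URL when a pack is named.

    The per-pack pattern is generalized from the alphafold example in the post.
    """
    return f"{GATEWAY}/{pack}/mcp" if pack else f"{GATEWAY}/mcp"


def list_tools_request(request_id=1):
    """JSON-RPC 2.0 `tools/list` request, the MCP method for discovering tool names."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


print(endpoint())             # https://gateway.pipeworx.io/mcp
print(endpoint("alphafold"))  # https://gateway.pipeworx.io/alphafold/mcp
```

Pointing a client at a single pack keeps the tool list short; pointing it at the catalog-wide endpoint trades that for access to ask_pipeworx routing.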