From AlphaFold to FDA, in one MCP gateway
We shipped 14 life-sciences data sources to the Pipeworx gateway this week — protein structures, bioactivity, pathways, trials, and regulatory data, all reachable through one MCP endpoint.
A pharma research workflow has always crossed a lot of databases. To go from a target protein to an approved drug — or just to brief an agent on what’s known about a target — you typically need answers from a dozen different APIs: UniProt for sequence and function, AlphaFold for predicted structure, RCSB PDB for experimental structures, ChEMBL for bioactivity, Open Targets for target-disease association, ClinicalTrials.gov for active trials, OpenFDA for adverse events and labels, PubMed for primary literature.
Each has its own auth model, response schema, rate limits, and idiosyncratic identifier system. Wiring an AI agent into all of them used to mean writing a dozen integration shims.
Over the last 48 hours we shipped 14 life-sciences packs to the Pipeworx gateway. Combined with what was already there (OpenFDA, ClinicalTrials.gov, RxNorm, PubMed), the gateway now spans the full target → drug → trial → regulatory → literature arc through a single MCP endpoint.
What landed
| Pack | Source | What it answers |
|---|---|---|
| uniprot | UniProt | Protein sequence, function, GO terms, cross-references |
| alphafold | DeepMind AlphaFold DB | Predicted 3D structure for any UniProt entry |
| rcsb-pdb | RCSB Protein Data Bank | Experimental structures, ligand binding sites |
| chembl | EMBL-EBI ChEMBL | Bioactivity data: IC50s, Ki values, targets per compound |
| opentargets | Open Targets Platform | Target-disease association evidence (GWAS, literature, drugs) |
| reactome | Reactome | Curated biological pathways (signaling, metabolism, disease) |
| biorxiv | bioRxiv preprint server | Recent life-sciences preprints, structured metadata |
| europepmc | Europe PMC | Full-text biomedical literature (PMC + bioRxiv + others) |
| dailymed | NLM DailyMed | Current FDA-approved drug labels with structured indications |
| medlineplus | NLM MedlinePlus | Plain-language consumer health information |
| usda-fdc | USDA FoodData Central | Food composition, nutrient profiles |
| npi-registry | NPPES | National Provider Identifier registry |
| orcid | ORCID | Researcher identifiers and publication records |
| ror | ROR | Research Organization Registry |
Plus the packs already in the gateway: openfda (drug events, labels, recalls, devices), clinicaltrials (trial registry), pubmed (NLM literature), rxnorm (drug naming standardization).
What the chain looks like
Here’s a query that wouldn’t have been one MCP call last week: “Brief me on SGLT2 as a drug target.”
What an agent does with the new packs available:
1. resolve_entity: disambiguate “SGLT2” to its UniProt accession (P31639) and known synonyms.
2. uniprot: protein function (sodium-glucose co-transporter, kidney/intestine), GO terms, domain organization.
3. alphafold: predicted 3D structure for the full-length protein.
4. rcsb-pdb: experimental structures, including co-crystals with known inhibitors.
5. chembl: bioactivity data for compounds targeting SGLT2: IC50s, structure-activity series.
6. opentargets: disease associations (type 2 diabetes, chronic kidney disease, heart failure) with evidence weighting.
7. reactome: glucose transport and reabsorption pathways the protein participates in.
8. clinicaltrials: active and completed trials targeting SGLT2.
9. openfda: approved SGLT2 inhibitors (empagliflozin, dapagliflozin, canagliflozin) and post-market adverse event profiles.
10. europepmc: recent literature surveying clinical outcomes and mechanism.
Ten steps across nine data sources, each with its own authentication model, schema, and identifier system if you wired them yourself. One MCP endpoint with the new packs in place.
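The chain above can be sketched as a sequence of MCP `tools/call` requests, one per step. This is a minimal sketch, not the gateway's documented interface: `tools/call` with `name` and `arguments` is the standard MCP JSON-RPC shape, but using each step name directly as the tool name and passing a `query` argument are assumptions made for illustration.

```python
import json

# Steps from the SGLT2 walkthrough, in order. Using each name directly
# as an MCP tool name is an assumption; real tool names may differ.
CHAIN = [
    "resolve_entity", "uniprot", "alphafold", "rcsb-pdb", "chembl",
    "opentargets", "reactome", "clinicaltrials", "openfda", "europepmc",
]


def build_chain_requests(target: str) -> list[dict]:
    """Build one JSON-RPC 2.0 `tools/call` request per step in CHAIN."""
    requests = []
    for request_id, tool in enumerate(CHAIN, start=1):
        requests.append({
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {
                "name": tool,
                # Hypothetical argument key, not taken from Pipeworx docs.
                "arguments": {"query": target},
            },
        })
    return requests


if __name__ == "__main__":
    for request in build_chain_requests("SGLT2"):
        print(json.dumps(request))
```

In practice an agent would likely feed the accession returned by the first step (P31639 in this example) into the later calls rather than repeating the raw name, and the actual argument schema for each tool should be read from the gateway's tool listing.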
Or, equivalently: the agent calls ask_pipeworx("Brief me on SGLT2 as a drug target") once, the gateway routes through its meta-tool layer, and the user gets the synthesized brief without writing the chain manually.
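The single-call version reduces to one `tools/call` request naming `ask_pipeworx`. The sketch below only builds the JSON-RPC payload; the `question` argument key is a hypothetical name chosen for illustration, since the post does not state the tool's input schema.

```python
import json


def build_ask_request(question: str, request_id: int = 1) -> str:
    """Serialize a single MCP `tools/call` request for ask_pipeworx."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "ask_pipeworx",
            # "question" is a hypothetical argument key for illustration.
            "arguments": {"question": question},
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_ask_request("Brief me on SGLT2 as a drug target"))
```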
Why now
We’ve been hearing one version of the same request repeatedly: builders working on biotech/pharma agents hit the wall not because the agents aren’t capable, but because the data plumbing isn’t in place. Every new source adds a week of integration work (auth flows, schema parsing, error handling, rate-limit accommodation), and the marginal week per source is what kills these projects.
The Pipeworx thesis is that the integration tax is the bottleneck in agent development, and that consolidating live data behind one MCP gateway is the cheapest way to dissolve it. The life-sciences vertical is where that thesis pays off most cleanly right now: the source list is well-defined, the data is mostly public, and the workflows are stable enough that a compound tool over the right packs is meaningfully more useful than the sum of its parts.
What’s coming next in this vertical:
- A pharma_target_profile compound tool that does the fan-out above in one call
- More structural-biology sources (BMRB for NMR, EMDB for cryo-EM)
- Connection to the existing validate_claim meta-tool for drug claims against DailyMed labels
If you’re building agents that need any of this, the gateway is at gateway.pipeworx.io/mcp. Free anonymous tier, no signup. The new packs are reachable directly (e.g. gateway.pipeworx.io/alphafold/mcp) or through ask_pipeworx for natural-language routing across the full catalog.
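As a rough sketch of the two addressing styles, the helper below builds the full-catalog and pack-specific endpoint URLs and a standard MCP `tools/list` request for discovering what a given endpoint exposes. The `https` scheme and the assumption that every pack follows the `/alphafold/mcp` path pattern are not stated in the post; `mcp_endpoint` and `tools_list_request` are illustrative names.

```python
import json
from typing import Optional

GATEWAY_HOST = "gateway.pipeworx.io"


def mcp_endpoint(pack: Optional[str] = None) -> str:
    """Return the full-catalog endpoint, or a pack-specific one."""
    # The https scheme is an assumption; the post gives host and path only.
    if pack is None:
        return f"https://{GATEWAY_HOST}/mcp"
    # Assumes every pack follows the /<pack>/mcp pattern shown for alphafold.
    return f"https://{GATEWAY_HOST}/{pack}/mcp"


def tools_list_request(request_id: int = 1) -> str:
    """Serialize an MCP `tools/list` request to discover available tools."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    )


if __name__ == "__main__":
    print(mcp_endpoint())
    print(mcp_endpoint("alphafold"))
    print(tools_list_request())
```

A real client would perform the MCP initialize handshake before sending `tools/list`, which is usually handled by an MCP client library rather than hand-built requests.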