HTML to Text
Strip HTML to readable plain text, extract links, and pull page metadata (title, description, Open Graph). Regex-based, keyless, offline. Great for cleaning scraped HTML.
Tools
html_to_text
Convert HTML to readable plain text (keyless, offline): drops scripts, styles, and comments; converts block elements to newlines; decodes entities; and collapses whitespace. Ideal for cleaning scraped HTML.
No parameters required.
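
The four steps in the html_to_text description can be illustrated with a small regex pipeline. This is a minimal sketch, not the pack's actual implementation: the function names `htmlToText` and `decodeEntities`, the list of block elements, and the short entity table are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a regex-based HTML-to-text pass (not the pack's code).
// Only a handful of named entities are covered; a real table is much larger.
const NAMED_ENTITIES: Record<string, string> = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'", nbsp: " ",
};

function decodeEntities(s: string): string {
  // Single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return s.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body: string) => {
    if (body[0] === "#") {
      const hex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(hex ? 2 : 1), hex ? 16 : 10);
      // Leave out-of-range numeric references untouched.
      return !isNaN(code) && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return NAMED_ENTITIES[body.toLowerCase()] ?? match;
  });
}

function htmlToText(html: string): string {
  const stripped = html
    // 1. Drop comments, then script and style elements with their contents.
    .replace(/<!--[\s\S]*?-->/g, "")
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // 2. Turn line breaks and block-level tags into newlines (assumed tag list).
    .replace(/<br\s*\/?>/gi, "\n")
    .replace(/<\/?(p|div|li|ul|ol|h[1-6]|tr|table|section|article|blockquote)\b[^>]*>/gi, "\n")
    // Remove any remaining inline tags.
    .replace(/<[^>]+>/g, "");

  // 3. Decode entities after tag removal so "&lt;b&gt;" is kept as literal text.
  return decodeEntities(stripped)
    // 4. Collapse runs of spaces and cap blank lines at one.
    .replace(/[ \t\r\f\v]+/g, " ")
    .replace(/ *\n */g, "\n")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```

Regexes like these are tolerant of messy scraped markup but are not a full HTML parser; malformed or deeply unusual input can slip through.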
extract_links
Extract all hyperlinks from HTML as {href, text} pairs (keyless, offline).
No parameters required.
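
To make the {href, text} output shape concrete, here is one way a regex-based link extractor could look. It is a sketch under assumptions: the `Link` interface and `extractLinks` function are hypothetical names, and the pack's real matching rules may differ.

```typescript
// Hypothetical sketch of regex-based link extraction (not the pack's code).
interface Link {
  href: string;
  text: string;
}

function extractLinks(html: string): Link[] {
  // Matches <a ... href=VALUE ...>inner</a> where VALUE is double-quoted,
  // single-quoted, or unquoted.
  const anchor =
    /<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))[^>]*>([\s\S]*?)<\/a\s*>/gi;
  const links: Link[] = [];
  let m: RegExpExecArray | null;
  while ((m = anchor.exec(html)) !== null) {
    const href = m[1] ?? m[2] ?? m[3] ?? "";
    // Strip nested tags from the link body and normalize whitespace.
    const text = (m[4] ?? "").replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    links.push({ href, text });
  }
  return links;
}
```

This sketch returns hrefs exactly as written in the markup; resolving relative URLs against a base would be a separate step.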
extract_metadata
Pull page metadata from HTML (keyless, offline): <title>, meta description/keywords, and Open Graph (og:*) tags.
No parameters required.
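
The fields listed for extract_metadata could be gathered with a pass like the one below. This is an illustrative sketch only: the `PageMetadata` shape, the `attr` helper, and the `extractMetadata` function are assumptions, and the tool's actual output format is not documented here.

```typescript
// Hypothetical sketch of regex-based metadata extraction (not the pack's code).
interface PageMetadata {
  title?: string;
  description?: string;
  keywords?: string;
  openGraph: Record<string, string>; // keyed without the "og:" prefix
}

// Read one quoted attribute value from a single tag string.
function attr(tag: string, name: string): string | undefined {
  const re = new RegExp(`\\b${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i");
  const m = re.exec(tag);
  return m ? (m[1] ?? m[2]) : undefined;
}

function extractMetadata(html: string): PageMetadata {
  const meta: PageMetadata = { openGraph: {} };

  const title = /<title\b[^>]*>([\s\S]*?)<\/title\s*>/i.exec(html);
  if (title) meta.title = (title[1] ?? "").replace(/\s+/g, " ").trim();

  const tags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of tags) {
    const content = attr(tag, "content");
    if (content === undefined) continue;
    // Open Graph uses property="og:..."; description/keywords use name="...".
    const key = (attr(tag, "property") ?? attr(tag, "name") ?? "").toLowerCase();
    if (key === "description") meta.description = content;
    else if (key === "keywords") meta.keywords = content;
    else if (key.startsWith("og:")) meta.openGraph[key.slice(3)] = content;
  }
  return meta;
}
```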
Test with curl
The gateway speaks JSON-RPC 2.0 over HTTP POST. You can test any pack directly from the terminal.
curl -X POST https://gateway.pipeworx.io/htmltext/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

curl -X POST https://gateway.pipeworx.io/htmltext/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"html_to_text","arguments":{}}}'

Use with the SDK
Install @pipeworx/sdk to call tools from any TypeScript/Node project.
import { Pipeworx } from '@pipeworx/sdk';
const px = new Pipeworx();
const result = await px.call("html_to_text", {});

// Or ask in plain English:
const answer = await px.ask("strip html to readable plain text, extract links, and pull page metadata (title, description, open graph)");
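
For projects that cannot take the SDK dependency, the tools/call request from the curl section can also be sent with `fetch` (built into Node 18+). This is a sketch under assumptions: `buildToolCall` and `callTool` are hypothetical helpers, and the code assumes the gateway answers with a single JSON body carrying the standard JSON-RPC 2.0 `result` or `error` member.

```typescript
// Endpoint taken from the curl examples.
const GATEWAY_URL = "https://gateway.pipeworx.io/htmltext/mcp";

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build the same payload the curl example sends for tools/call.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: POST the request and unwrap the JSON-RPC reply.
async function callTool(name: string, args: Record<string, unknown> = {}): Promise<unknown> {
  const response = await fetch(GATEWAY_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildToolCall(1, name, args)),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);

  // Assumption: the reply is one JSON object, not a streamed response.
  const reply = (await response.json()) as {
    result?: unknown;
    error?: { code: number; message: string };
  };
  if (reply.error) {
    throw new Error(`JSON-RPC error ${reply.error.code}: ${reply.error.message}`);
  }
  return reply.result;
}
```

Calling `callTool("html_to_text")` would send the same body as the second curl command, apart from the request id.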