HTML to Text

live Utility

Strip HTML to readable plain text, extract links, and pull page metadata (title, description, Open Graph). Regex-based, keyless, offline. Great for cleaning scraped HTML.

3 tools

0ms auth

free tier 50 calls/day

Tools

html_to_text

Converts HTML to readable plain text (keyless, offline): drops scripts, styles, and comments; converts block elements to newlines; decodes entities; and collapses whitespace. Ideal for cleaning scraped HTML.

No parameters required.
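
To make the four steps in the description concrete, here is a minimal local sketch of a regex-based converter. It is an illustration only, not the service's actual implementation: the names `htmlToText` and `decodeEntities`, the list of block elements, and the small entity table are all assumptions.

```typescript
// Hypothetical sketch of a regex-based HTML-to-text pass (not the gateway's code).
// Only a handful of named entities are covered; a real decoder needs the full table.
const NAMED_ENTITIES: Record<string, string> = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'", nbsp: " ",
};

function decodeEntities(input: string): string {
  return input.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match: string, body: string) => {
    if (body.startsWith("#")) {
      // Numeric reference: "#x41" is hexadecimal, "#65" is decimal.
      const isHex = body.charAt(1).toLowerCase() === "x";
      const code = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      return Number.isFinite(code) && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    const named = NAMED_ENTITIES[body.toLowerCase()];
    return named !== undefined ? named : match; // leave unknown entities untouched
  });
}

function htmlToText(html: string): string {
  const withoutTags = html
    .replace(/<!--[\s\S]*?-->/g, "")                                   // drop comments
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")         // drop scripts and styles
    .replace(/<br\s*\/?>/gi, "\n")                                     // line breaks
    .replace(/<\/?(p|div|h[1-6]|li|ul|ol|tr|table|section|article|blockquote)\b[^>]*>/gi, "\n") // block elements
    .replace(/<[^>]+>/g, "");                                          // strip remaining tags
  return decodeEntities(withoutTags)
    .replace(/[ \t\r\f\v]+/g, " ")   // collapse runs of horizontal whitespace
    .replace(/ *\n */g, "\n")        // trim spaces around newlines
    .replace(/\n{3,}/g, "\n\n")      // keep at most one blank line
    .trim();
}
```

For example, `htmlToText("<p>Hello &amp; <b>world</b></p><script>x()</script><p>Bye</p>")` returns `"Hello & world\n\nBye"`. Tags are stripped before entities are decoded so that an escaped `&lt;b&gt;` survives as literal text instead of being removed as markup.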

extract_links

Extract all hyperlinks from HTML as {href, text} pairs (keyless, offline).

No parameters required.
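
As a rough picture of what regex-based link extraction involves, the sketch below collects `{href, text}` pairs from anchor tags. It is a hypothetical local illustration, not the tool's actual code: `extractLinks` and the `Link` type are assumed names, anchors without an `href` are skipped by choice, and entities inside `href` values are left undecoded.

```typescript
// Hypothetical sketch of regex-based link extraction (not the gateway's code).
interface Link {
  href: string;
  text: string;
}

function extractLinks(html: string): Link[] {
  const links: Link[] = [];
  // Capture the attribute string and inner HTML of every <a>...</a> element.
  const anchor = /<a\b([^>]*)>([\s\S]*?)<\/a\s*>/gi;
  // href may be double-quoted, single-quoted, or unquoted.
  const hrefAttr = /(?:^|\s)href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/i;
  let m: RegExpExecArray | null;
  while ((m = anchor.exec(html)) !== null) {
    const attrs = m[1] ?? "";
    const inner = m[2] ?? "";
    const h = hrefAttr.exec(attrs);
    if (h === null) continue; // named anchors and other <a> tags without href
    const href = h[1] ?? h[2] ?? h[3] ?? "";
    // Link text: strip nested tags, then collapse whitespace.
    const text = inner.replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    links.push({ href, text });
  }
  return links;
}
```

Calling `extractLinks('<a href="/a">One</a> <a name="n">skip</a>')` returns `[{ href: "/a", text: "One" }]`. Like any regex approach, this can misread malformed markup, such as an attribute value that contains `>`.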

extract_metadata

Pull page metadata from HTML (keyless, offline): <title>, meta description/keywords, and Open Graph (og:*) tags.

No parameters required.
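
The sketch below shows one way the metadata described above could be pulled out with regular expressions. It is a hypothetical local illustration and not the tool's implementation or response format: `extractMetadata`, `PageMetadata`, the `attr` helper, and the choice to store Open Graph keys without the `og:` prefix are all assumptions.

```typescript
// Hypothetical sketch of regex-based metadata extraction (not the gateway's code).
interface PageMetadata {
  title?: string;
  description?: string;
  keywords?: string;
  og: Record<string, string>; // e.g. og["title"] holds the og:title content
}

// Read one attribute value from a single tag; handles quoted and unquoted values.
function attr(tag: string, name: string): string | undefined {
  const re = new RegExp(`(?:^|\\s)${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)'|([^\\s"'>]+))`, "i");
  const m = re.exec(tag);
  return m ? (m[1] ?? m[2] ?? m[3]) : undefined;
}

function extractMetadata(html: string): PageMetadata {
  const meta: PageMetadata = { og: {} };
  const title = /<title\b[^>]*>([\s\S]*?)<\/title\s*>/i.exec(html);
  if (title) meta.title = (title[1] ?? "").replace(/\s+/g, " ").trim();

  for (const tag of html.match(/<meta\b[^>]*>/gi) ?? []) {
    const content = attr(tag, "content");
    if (content === undefined) continue;
    // Open Graph tags use property="og:..."; classic tags use name="...".
    const key = (attr(tag, "property") ?? attr(tag, "name") ?? "").toLowerCase();
    if (key.startsWith("og:")) meta.og[key.slice(3)] = content;
    else if (key === "description") meta.description = content;
    else if (key === "keywords") meta.keywords = content;
  }
  return meta;
}
```

Given `<title>My Page</title><meta property="og:title" content="OG Title">`, the function returns a `title` of `"My Page"` and an `og.title` of `"OG Title"`. Entity decoding of the extracted values is left out to keep the sketch short.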

Test with curl

The gateway speaks JSON-RPC 2.0 over HTTP POST. You can test any pack directly from the terminal.

List available tools

bash

curl -X POST https://gateway.pipeworx.io/htmltext/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Call a tool

bash

curl -X POST https://gateway.pipeworx.io/htmltext/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"html_to_text","arguments":{}}}'
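
The same two requests can be sent from TypeScript without the SDK. The sketch below mirrors the curl commands above using the global `fetch` available in Node 18+ and browsers. The helper names `buildToolCall`, `buildToolsList`, and `postRpc` are hypothetical, and the sketch assumes the gateway replies with a plain JSON body, which this page does not confirm.

```typescript
// Hypothetical helpers mirroring the curl examples; not part of @pipeworx/sdk.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Body for the "List available tools" request.
function buildToolsList(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Body for the "Call a tool" request; arguments default to {} as in the curl example.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// POST a JSON-RPC request and parse the reply (assumed to be JSON).
async function postRpc(url: string, request: JsonRpcRequest): Promise<unknown> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status}`);
  return res.json();
}
```

A call would then look like `await postRpc("https://gateway.pipeworx.io/htmltext/mcp", buildToolCall(2, "html_to_text"))`. Keeping the request builders separate from the network call makes the payloads easy to inspect or unit test without hitting the gateway.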

Use with the SDK

Install @pipeworx/sdk to call tools from any TypeScript/Node project.

TypeScript

import { Pipeworx } from '@pipeworx/sdk';
const px = new Pipeworx();
const result = await px.call("html_to_text", {});

ask_pipeworx

// Or ask in plain English:
const answer = await px.ask("strip html to readable plain text, extract links, and pull page metadata (title, description, open graph)");

MCP Config

claude_desktop_config.json

{
  "mcpServers": {
    "pipeworx-htmltext": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://gateway.pipeworx.io/htmltext/mcp"
      ]
    }
  }
}