Extract

Fetch one or more URLs and return clean, reader-friendly content as markdown.

POST /api/v1/extract

Extraction (a.k.a. reader) turns a live web page into clean, boilerplate-free content ready to feed an LLM. Back it with providers like Jina Reader or Exa Contents: the request and the normalized response are the same whichever provider you pick.

Request

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | required | Extract model slug, e.g. jina/reader or exa/contents. |
| urls | string[] | required | One or more URLs to fetch and extract. |

json
{
  "model": "jina/reader",
  "urls": ["https://example.com/article", "https://arxiv.org/abs/2310.11511"]
}
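
As a minimal sketch, the two required parameters can be checked client-side before a request is sent. The helper name build_extract_payload and its validation rules (a non-empty model string, absolute http(s) URLs) are assumptions made for illustration; the API itself may accept or reject inputs differently.

```python
from urllib.parse import urlparse


def build_extract_payload(model: str, urls: list) -> dict:
    """Return a request body for POST /api/v1/extract, or raise ValueError.

    Hypothetical helper: the checks below are client-side assumptions,
    not rules documented by the API.
    """
    if not isinstance(model, str) or not model.strip():
        raise ValueError("model is required, e.g. 'jina/reader'")
    # A bare string would otherwise be iterated character by character.
    if isinstance(urls, str) or not urls:
        raise ValueError("urls must be a non-empty list of strings")
    for url in urls:
        if not isinstance(url, str):
            raise ValueError(f"URL must be a string, got {type(url).__name__}")
        parsed = urlparse(url)
        # Assumption: only absolute http(s) URLs are worth sending.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return {"model": model, "urls": list(urls)}
```

The returned dict can be passed as the JSON body of the request, so malformed input fails locally instead of costing a round trip.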

Response

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| results | object[] | optional | One result per URL (see below). |
| usage | object | optional | { requests, cost }, with cost in USD. |

Result object

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | optional | The fetched URL. |
| title | string | optional | Extracted page title. |
| content | string | optional | Clean page content as markdown. |
| raw | object | optional | Original provider payload, for debugging. |

json
{
  "results": [
    {
      "url": "https://example.com/article",
      "title": "An Example Article",
      "content": "# An Example Article\n\nClean markdown body...",
      "raw": { }
    }
  ],
  "usage": { "requests": 1, "cost": 0.001 }
}
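
Because every response field is listed as optional, client code may want to normalize the payload before using it. The following is one possible sketch; the names ExtractResult, ExtractResponse, and parse_extract_response are hypothetical and not part of any official SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExtractResult:
    # All fields are optional in the documented response, hence the defaults.
    url: Optional[str] = None
    title: Optional[str] = None
    content: Optional[str] = None
    raw: dict = field(default_factory=dict)


@dataclass
class ExtractResponse:
    results: list = field(default_factory=list)
    requests: Optional[int] = None
    cost_usd: Optional[float] = None  # usage.cost is documented as USD


def parse_extract_response(payload: dict) -> ExtractResponse:
    """Convert the decoded JSON body into typed objects, tolerating missing keys."""
    usage = payload.get("usage") or {}
    results = [
        ExtractResult(
            url=item.get("url"),
            title=item.get("title"),
            content=item.get("content"),
            raw=item.get("raw") or {},
        )
        for item in payload.get("results") or []
    ]
    return ExtractResponse(
        results=results,
        requests=usage.get("requests"),
        cost_usd=usage.get("cost"),
    )
```

Calling parse_extract_response on the decoded JSON body yields objects whose attributes are None (or empty) when the provider omitted a field, which avoids KeyError in downstream code.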

Pairs with search. Run /search to find URLs, then /extract to pull their full content. Alternatively, set include_content: true on a search request to do both in one call, where the provider supports it.
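
The two-step flow described in the note above could be glued together roughly as follows. This section does not document the /search response, so the sketch assumes it carries a results list whose items have a url key; that shape, the function name extract_payload_from_search, and the limit parameter are all assumptions.

```python
def extract_payload_from_search(search_response: dict,
                                model: str = "jina/reader",
                                limit: int = 5) -> dict:
    """Build an /extract request body from a decoded /search response.

    Assumption: search_response looks like {"results": [{"url": ...}, ...]}.
    """
    seen = set()
    urls = []
    for hit in search_response.get("results") or []:
        if len(urls) >= limit:
            break
        url = hit.get("url")  # assumed field name
        # Skip hits without a URL and drop duplicates, keeping first-seen order.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return {"model": model, "urls": urls}
```

The resulting dict has the same shape as the request body shown under Request, so it can be posted to /extract unchanged.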

Examples

cURL

bash
curl https://searchrouter.ai/api/v1/extract \
  -H "Authorization: Bearer $SR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jina/reader",
    "urls": ["https://example.com/article"]
  }'

Python

python
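import os

import requests

resp = requests.post(
    "https://searchrouter.ai/api/v1/extract",
    headers={"Authorization": f"Bearer {os.environ['SR_API_KEY']}"},
    json={
        "model": "jina/reader",
        "urls": ["https://example.com/article"],
    },
    timeout=30,  # requests has no default timeout; 30 s is an arbitrary choice
)
resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

# Every response field is documented as optional, so read them defensively.
for r in resp.json().get("results", []):
    print(r.get("title"))
    print((r.get("content") or "")[:500])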
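
The response is described as one result per URL, but content is optional, so it can be useful to see which requested URLs came back empty. A small sketch follows, assuming the url in each result matches the requested URL exactly (redirects or normalization by the provider could break that assumption); split_by_outcome is a hypothetical helper name.

```python
def split_by_outcome(requested_urls: list, results: list) -> tuple:
    """Return (found, missing).

    found maps each requested URL to its result dict when content is present;
    missing lists requested URLs with no result or with empty content.
    """
    # Index results by their url field, ignoring results that lack one.
    by_url = {r.get("url"): r for r in results if r.get("url")}
    found, missing = {}, []
    for url in requested_urls:
        result = by_url.get(url)
        if result and result.get("content"):
            found[url] = result
        else:
            missing.append(url)
    return found, missing
```

URLs that land in missing can then be retried, for example with a different model slug.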

JavaScript

javascript
const resp = await fetch("https://searchrouter.ai/api/v1/extract", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SR_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "jina/reader",
    urls: ["https://example.com/article"],
  }),
});
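// fetch does not reject on HTTP errors, so check the status explicitly.
if (!resp.ok) throw new Error(`Extract request failed: HTTP ${resp.status}`);
// Response fields are documented as optional, so default them before use.
const { results = [] } = await resp.json();
results.forEach((r) => console.log(r.title, (r.content ?? "").slice(0, 500)));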