Connectors

SearchRouter is a normalization proxy. Connectors (also called adapters) translate the canonical schema to each provider's native API and back.

How it works

SearchRouter owns no inference. For every request it (1) accepts one canonical schema, (2) picks a provider via routing, (3) translates the canonical request to that provider's native API, (4) calls the provider with managed or BYOK credentials, (5) translates the response back to canonical, and (6) meters usage and returns the result.

```text
canonical request ──▶ ROUTER ──▶ ADAPTER ──▶ provider native API
                                                       │
canonical response ◀── ADAPTER ◀── provider response ◀─┘
```
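The six steps can be sketched end to end as follows. This is a minimal illustration, not SearchRouter's actual code: `CanonicalRequest`, `DemoAdapter`, `ROUTES`, `USAGE_LOG`, and `handle` are hypothetical names, and the provider call is stubbed instead of going over HTTP.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class CanonicalRequest:
    model: str   # canonical slug, e.g. "demo/search" (hypothetical)
    query: str


class DemoAdapter:
    """Hypothetical adapter for a made-up native API."""

    def to_native(self, req: CanonicalRequest) -> dict:
        # (3) canonical request -> native request
        return {"q": req.query}

    async def call(self, native_req: dict, key: str) -> dict:
        # (4) a real adapter would issue an HTTP call with `key` here
        return {"hits": [{"body": f"result for {native_req['q']}"}]}

    def to_canonical(self, native_resp: dict) -> dict:
        # (5) native response -> canonical response
        return {"results": [{"content": h["body"]} for h in native_resp["hits"]]}


ROUTES = {"demo/search": DemoAdapter()}   # stand-in for routing
USAGE_LOG: list[dict] = []                # stand-in for metering


async def handle(req: CanonicalRequest, key: str) -> dict:
    adapter = ROUTES[req.model]                          # (2) pick a provider
    native_req = adapter.to_native(req)                  # (3) translate request
    native_resp = await adapter.call(native_req, key)    # (4) call provider
    response = adapter.to_canonical(native_resp)         # (5) translate response
    USAGE_LOG.append({"model": req.model, "requests": 1})  # (6) meter usage
    return response


result = asyncio.run(handle(CanonicalRequest("demo/search", "rust async"), "sk-demo"))
```

The caller only ever sees the canonical request and response; everything provider-specific stays inside the adapter.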

The three adapter tiers

Connectors fall into three tiers by how much translation the provider requires. The easier the mapping, the cheaper the connector is to build and maintain.

1. Pass-through

The provider already speaks our canonical schema, so we forward the request almost verbatim, rewriting only the model id and injecting our key. For example, OpenAI embeddings already match our /embeddings shape.
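A pass-through translation might look like the sketch below. The helper name `passthrough_request` and the request body fields are assumptions for illustration; the only changes applied are the two named above (model id and key).

```python
def passthrough_request(canonical: dict, native_model: str, upstream_key: str):
    """Forward a canonical body unchanged except for the model id.

    Hypothetical helper. Returns (headers, body) for the upstream call.
    """
    body = {**canonical, "model": native_model}             # rewrite the model id only
    headers = {"Authorization": f"Bearer {upstream_key}"}   # inject our upstream key
    return headers, body


headers, body = passthrough_request(
    {"model": "openai/text-embedding-3-small", "input": ["hello"]},
    native_model="text-embedding-3-small",
    upstream_key="sk-upstream",
)
```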

2. Property-mapping

The provider has a custom schema, so we map field names and value vocabularies in both directions. Examples in our domain:

- Exa search: our query / num_results → Exa query / numResults; Exa response text → our content.
- Cohere embed: our input_type: "query" → Cohere input_type: "search_query".
- Brave: auth header Authorization: Bearer → X-Subscription-Token.
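The field and vocabulary mappings listed above can be expressed as small lookup tables. This is a sketch only: the function names are hypothetical, and dropping unknown request fields or passing unmapped values through are assumptions, not documented behaviour.

```python
# Maps taken from the examples above; anything else here is illustrative.
EXA_REQUEST_FIELDS = {"query": "query", "num_results": "numResults"}
COHERE_INPUT_TYPES = {"query": "search_query"}


def to_exa_request(canonical: dict) -> dict:
    """Rename canonical request fields to Exa's names (unknown fields are dropped)."""
    return {
        EXA_REQUEST_FIELDS[name]: value
        for name, value in canonical.items()
        if name in EXA_REQUEST_FIELDS
    }


def from_exa_result(native_result: dict) -> dict:
    """Rename Exa's `text` to canonical `content`, leaving other fields untouched."""
    out = dict(native_result)
    if "text" in out:
        out["content"] = out.pop("text")
    return out


def to_cohere_input_type(canonical_value: str) -> str:
    """Map the canonical vocabulary to Cohere's (unmapped values pass through)."""
    return COHERE_INPUT_TYPES.get(canonical_value, canonical_value)
```

Keeping the maps as data makes each direction of the translation easy to review against the provider's documentation.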

3. Special-case

The provider's response is structurally divergent, so the adapter rebuilds the canonical shape. mixedbread/rerank returns data[] with a score field instead of results[] with relevance_score; its adapter remaps both.
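The mixedbread/rerank remapping could be sketched like this. The function name is hypothetical, and the `index` field in the sample payload is an assumption about what each item carries besides its score.

```python
def rerank_to_canonical(native: dict) -> dict:
    """Rebuild canonical results[] / relevance_score from data[] / score."""
    results = []
    for item in native["data"]:
        # copy every field except `score`, then re-add it under the canonical name
        canonical_item = {k: v for k, v in item.items() if k != "score"}
        canonical_item["relevance_score"] = item["score"]
        results.append(canonical_item)
    return {"results": results}


# Sample payload; `index` is an assumed field, not confirmed by the source.
native = {"data": [{"index": 2, "score": 0.91}, {"index": 0, "score": 0.12}]}
canonical = rerank_to_canonical(native)
```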

| Tier | When | Example |
| --- | --- | --- |
| Pass-through | Provider matches the canonical schema | `openai/text-embedding-3-small` |
| Property-mapping | Custom field names / vocabularies | `exa/neural` |
| Special-case | Structurally divergent response | `mixedbread/rerank` |

Authentication styles

How the upstream key is attached is the only real per-provider variation in credential handling. Each adapter declares an auth_style, and the router attaches the key accordingly:

| Style | Providers | How |
| --- | --- | --- |
| `Authorization: Bearer` | OpenAI, Cohere, Voyage, Jina, Tavily, Perplexity, Linkup, You.com | Bearer token header |
| `x-api-key` | Exa, Valyu | Header |
| `X-API-KEY` | Serper | Header (case-sensitive) |
| `X-Subscription-Token` | Brave | Header |
| Query parameter | SerpAPI, Google CSE | `?api_key=` / `?key=` |

Where keys live. Upstream keys are stored server-side in managed config (our own keys) or, for BYOK, AES-encrypted per organization and decrypted at call time.
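One way the router could attach a key according to the auth styles in the table above is sketched here. The enum members mirror the styles named in this section; the `attach_key` function and its `header_name` / `query_name` parameters are hypothetical, added so one style can cover both header spellings and both query parameter names.

```python
from enum import Enum


class AuthStyle(Enum):
    BEARER = "bearer"
    X_API_KEY = "x-api-key"
    X_SUBSCRIPTION_TOKEN = "x-subscription-token"
    QUERY = "query"


def attach_key(style, key, headers=None, params=None, *,
               header_name="x-api-key", query_name="api_key"):
    """Return (headers, params) copies with the upstream key attached."""
    headers = dict(headers or {})
    params = dict(params or {})
    if style is AuthStyle.BEARER:
        headers["Authorization"] = f"Bearer {key}"
    elif style is AuthStyle.X_API_KEY:
        headers[header_name] = key            # e.g. "x-api-key" or "X-API-KEY"
    elif style is AuthStyle.X_SUBSCRIPTION_TOKEN:
        headers["X-Subscription-Token"] = key
    elif style is AuthStyle.QUERY:
        params[query_name] = key              # e.g. "api_key" or "key"
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers, params
```

For example, `attach_key(AuthStyle.QUERY, key, query_name="key")` would produce the `?key=` form, while the default produces `?api_key=`.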

The adapter interface

Each adapter is a small class implementing only the category methods the provider supports. A single shared async HTTP client issues the upstream call:

```python
from __future__ import annotations  # annotations are not evaluated at runtime

from abc import ABC


# AuthStyle and the request/response types are defined elsewhere in the codebase.
class Adapter(ABC):
    provider: str                 # "exa"
    categories: set[str]          # {"search", "answer", "extract"}
    base_url: str
    auth_style: AuthStyle         # BEARER | X_API_KEY | X_SUBSCRIPTION_TOKEN | QUERY

    async def search(self, req: SearchRequest, model: str, key: str) -> SearchResponse: ...
    async def embed(self, req: EmbedRequest, model: str, key: str) -> EmbedResponse: ...
    async def rerank(self, req: RerankRequest, model: str, key: str) -> RerankResponse: ...
    # only the methods for the categories a provider supports are implemented
```

A registry maps every model slug → (adapter, native model name, category, pricing). The router resolves the slug (or auto / fallback list), checks provider health, and calls the right method, retrying with the next provider on failure.
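The registry lookup and fallback loop can be illustrated with a small sketch. Everything here is hypothetical: the `alpha` and `beta` providers, the `RegistryEntry` shape, the `HEALTHY` map, and the choice to treat any exception as a reason to try the next provider.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    adapter: object
    native_model: str
    category: str      # also the name of the adapter method to call
    price_usd: str


class FlakyAdapter:
    async def search(self, req, model, key):
        raise ConnectionError("upstream unavailable")


class EchoAdapter:
    async def search(self, req, model, key):
        return {"model": model, "results": [{"content": req["query"]}]}


REGISTRY = {
    "alpha/search": RegistryEntry(FlakyAdapter(), "alpha-v1", "search", "0.005"),
    "beta/search": RegistryEntry(EchoAdapter(), "beta-v2", "search", "0.004"),
}
HEALTHY = {"alpha": True, "beta": True}   # provider health, keyed by provider name


async def route(slugs, req, key):
    """Try each slug in order; return (slug, response) from the first success."""
    errors = []
    for slug in slugs:
        entry = REGISTRY[slug]
        provider = slug.split("/", 1)[0]
        if not HEALTHY.get(provider, False):
            errors.append(f"{slug}: unhealthy")
            continue
        method = getattr(entry.adapter, entry.category)
        try:
            return slug, await method(req, entry.native_model, key)
        except Exception as exc:          # fall through to the next provider
            errors.append(f"{slug}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


served_by, response = asyncio.run(
    route(["alpha/search", "beta/search"], {"query": "hi"}, "sk-demo")
)
```

In this run the first provider raises, so the request is served by the second entry in the fallback list.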

Adding a provider

1. Write an adapter class implementing the supported category methods.
2. Add its models to the registry with slug, category, native model name, pricing (USD strings), feature flags, and dimensions / max results.
3. Add the upstream key to managed config, or let orgs supply it via BYOK.
4. The model immediately appears in GET /api/v1/models, the marketplace UI, and routing.
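A registry entry carrying the fields listed in the steps above might look like the following sketch. The `ModelEntry` dataclass, the `register` helper, the field names, and the `acme/search-v1` model are all hypothetical; the real registry format may differ.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    slug: str
    category: str
    native_model: str
    price_per_request_usd: str            # USD kept as a string to avoid float drift
    features: frozenset = frozenset()     # feature flags
    dimensions: Optional[int] = None      # embedding models
    max_results: Optional[int] = None     # search models

    def __post_init__(self):
        # Fail at registration time if the price is not a valid decimal string.
        Decimal(self.price_per_request_usd)


MODELS: dict[str, ModelEntry] = {}


def register(entry: ModelEntry) -> None:
    """Add a model to the registry, rejecting duplicate slugs."""
    if entry.slug in MODELS:
        raise ValueError(f"duplicate slug: {entry.slug}")
    MODELS[entry.slug] = entry


register(ModelEntry(
    slug="acme/search-v1",
    category="search",
    native_model="search-v1",
    price_per_request_usd="0.0030",
    features=frozenset({"highlights"}),
    max_results=50,
))
```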

Metering & billing

Every request runs through the same metered pipeline:

```text
auth(key) → resolve org → check balance & key spend limit
          → route → adapter call (timed) → normalize response
          → compute cost from registry pricing
          → write UsageEvent + decrement org credits
          → return canonical response (+ X-SR-* cost/latency headers)
```

Cost is computed from registry pricing, written as a UsageEvent, and decremented from the org credit balance. See Pricing for the billing model.
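The last three pipeline stages (compute cost, write the usage event, decrement credits) can be sketched with `decimal.Decimal`, which keeps USD strings exact. The names below are assumptions: the `UsageEvent` fields, per-request pricing, the in-memory `BALANCES` and `EVENTS` stores, and the concrete `X-SR-Cost-USD` / `X-SR-Latency-Ms` header names are illustrative only.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsageEvent:
    org_id: str
    slug: str
    units: int
    cost_usd: Decimal
    latency_ms: float


PRICING = {"acme/search-v1": "0.0030"}    # hypothetical price per request, USD string
BALANCES = {"org_1": Decimal("1.0000")}   # org credit balances
EVENTS: list[UsageEvent] = []


def meter(org_id: str, slug: str, units: int, latency_ms: float):
    """Compute cost, record a UsageEvent, decrement credits, build headers."""
    cost = Decimal(PRICING[slug]) * units             # exact decimal arithmetic
    event = UsageEvent(org_id, slug, units, cost, latency_ms)
    EVENTS.append(event)                              # write UsageEvent
    BALANCES[org_id] -= cost                          # decrement org credits
    headers = {
        "X-SR-Cost-USD": str(cost),                   # assumed header name
        "X-SR-Latency-Ms": f"{latency_ms:.0f}",       # assumed header name
    }
    return event, headers


event, headers = meter("org_1", "acme/search-v1", units=3, latency_ms=12.4)
```

A production version would perform the event write and the balance decrement atomically, which this in-memory sketch does not attempt.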
