Λ LLMux

Route LLM traffic smartly by cost, latency & reliability

LLMux is a fully managed cloud API. Use our SDK and we’ll pick the best provider in real time, enforce budgets, surface usage, and give you SLAs — no servers required.

Cloud-hosted · SDKs for JS/TS, Python, Go · Works with OpenAI & Anthropic

SDK example (TypeScript)
import { LLMux } from "@llmux/sdk";
const llm = new LLMux({ apiKey: "sk-…" });
const out = await llm.chat([{ role: "user", content: "Hello" }]);
Latency: p50/p95 tracked · Cost: per 1K tokens · Health: active ping
Managed SaaS · Zero ops · Global edge routing · API keys & rate plans · Privacy controls & region pinning

Features

Everything you need from one cloud API: no infra, no queues, no cron jobs.

Intelligent Policy Chain

Live scoring by latency, failure ratio & cost; budget caps; fairness routing.
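
To make the chain concrete, here is a minimal TypeScript sketch of the idea; the Policy type and the budgetCap, scoreByLatency, and runChain names are illustrative assumptions, not the published SDK surface.

// Hypothetical policy chain: each policy narrows the candidate set (illustrative only)
type Provider = { name: string; p95Ms: number; failureRatio: number; costPer1K: number };
type Policy = (candidates: Provider[]) => Provider[];

const budgetCap = (maxCostPer1K: number): Policy => (ps) =>
  ps.filter((p) => p.costPer1K <= maxCostPer1K);
const scoreByLatency: Policy = (ps) => [...ps].sort((a, b) => a.p95Ms - b.p95Ms);

// Run the chain in order; whatever survives is eligible for routing.
const runChain = (policies: Policy[], ps: Provider[]): Provider[] =>
  policies.reduce((acc, policy) => policy(acc), ps);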

Usage & Limits

Plan-based quotas, per-key rate limits, and simple dashboards.

Security

Key scoping, short-lived logs, region pinning, and BYOK (bring your own provider keys).

Providers

OpenAI & Anthropic out of the box; optional connectors for local endpoints.

Observability

Per-provider cost/latency metrics and request logs (PII-safe).

SLA

Premium plans include uptime SLAs and support.

How it Works

  1. Score

    Measure p50/p95 latency, failure ratio & estimated cost per 1K tokens for each provider.

  2. Cap

    Enforce per-request and per-plan budgets before routing.

  3. Route

    Round-robin across healthy, in-budget providers with transparent fallback (sketched below).
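
Putting the three steps together, a minimal sketch of the routing loop in TypeScript. The stat shapes and score weights below are placeholder assumptions for illustration, not LLMux's actual formula.

// Illustrative score → cap → route loop (assumed data shapes & arbitrary weights)
type Stats = { p50Ms: number; p95Ms: number; failureRatio: number; costPer1K: number };
type Candidate = { name: string; stats: Stats; healthy: boolean };

// 1. Score: lower is better; the weights are placeholders.
const score = (s: Stats) =>
  0.5 * s.p95Ms + 0.3 * s.failureRatio * 1000 + 0.2 * s.costPer1K * 100;

// 2. Cap + 3. Route: keep healthy, in-budget providers and rotate through them,
// best score first, so a failing provider falls back to the next on retry.
let cursor = 0;
function route(candidates: Candidate[], maxCostPer1K: number): Candidate {
  const eligible = candidates
    .filter((c) => c.healthy && c.stats.costPer1K <= maxCostPer1K)
    .sort((a, b) => score(a.stats) - score(b.stats));
  if (eligible.length === 0) throw new Error("no eligible provider");
  return eligible[cursor++ % eligible.length];
}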

Pricing

Start free. Pay as you grow.

Free

$0

  • 50K routed tokens/mo
  • 1 project · 2 keys
  • Community support

Pro

$29/mo

  • 2M routed tokens/mo
  • 10 projects · 25 keys
  • Basic dashboards
  • Email support

Team

$149/mo

  • SSO & role-based keys
  • Unlimited projects
  • Priority support & SLA

* Provider usage is billed pass-through to your own provider account when BYOK is enabled.

Get Started

1) Create an API key

  1. Sign up → create a project
  2. Generate a server key
  3. Optionally set a monthly budget & region

2) Install SDK
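
Assuming the registry names match the import paths shown below (an assumption; confirm against the docs):

npm install @llmux/sdk   # JavaScript/TypeScript
pip install llmux        # Python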

// JavaScript/TypeScript
import { LLMux } from "@llmux/sdk";
const llm = new LLMux({ apiKey: process.env.LLMUX_KEY });
const out = await llm.chat([{ role: "user", content: "Hello" }]);

# Python
import os
from llmux import LLMux

llm = LLMux(api_key=os.getenv("LLMUX_KEY"))
out = llm.chat([{"role": "user", "content": "Hello"}])

Endpoints

  • POST https://api.llmux.app/v1/chat – chat routed to best provider
  • GET https://api.llmux.app/v1/usage – plan & usage summary
  • GET https://api.llmux.app/v1/providers – health & pricing snapshot
  • GET https://api.llmux.app/v1/keys – list & scope API keys
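
If you call the REST API directly instead of using an SDK, a minimal sketch looks like this. Bearer-token auth and the OpenAI-style message body are assumptions here; consult the API reference for the exact schema.

// Direct REST calls (assumed: Bearer auth header, OpenAI-style messages)
const headers = {
  Authorization: `Bearer ${process.env.LLMUX_KEY}`,
  "Content-Type": "application/json",
};

const chat = await fetch("https://api.llmux.app/v1/chat", {
  method: "POST",
  headers,
  body: JSON.stringify({ messages: [{ role: "user", content: "Hello" }] }),
}).then((r) => r.json());

const usage = await fetch("https://api.llmux.app/v1/usage", { headers })
  .then((r) => r.json());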

FAQ

Do I need to run servers?

No. LLMux is fully managed. Install the SDK and call our cloud API.

What about data privacy?

We don’t retain prompts or outputs by default. Region pinning and short-lived logs are available on paid plans.

Can I bring my own provider keys (BYOK)?

Yes. You may route through your own provider accounts while still using LLMux policies and metrics.
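
As an illustration only, BYOK could look like passing provider keys at client construction; the providerKeys option is hypothetical, not a documented SDK parameter.

import { LLMux } from "@llmux/sdk";

const llm = new LLMux({
  apiKey: process.env.LLMUX_KEY,
  // Hypothetical option: route through your own provider accounts
  providerKeys: {
    openai: process.env.OPENAI_API_KEY,
    anthropic: process.env.ANTHROPIC_API_KEY,
  },
});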

Is there a CLI?

Planned for key management and usage summaries. Most users only need the SDK.