Λ LLMux

Route LLM traffic smartly by cost, latency & reliability

LLMux is a fully managed cloud API. Use our SDK and we’ll pick the best provider in real time, enforce budgets, surface usage, and give you SLAs — no servers required.

Cloud-hosted · SDKs for JS/TS, Python, Go · Works with OpenAI & Anthropic

SDK example (TypeScript)
import { LLMux } from "@llmux/sdk";
const llm = new LLMux({ apiKey: "sk-…" });
const out = await llm.chat([{ role: "user", content: "Hello" }]);
Latency: p50/p95 tracked · Cost: per 1K tokens · Health: active ping
Managed SaaS · Zero ops · Global edge routing · API keys & rate plans · Privacy controls & region pinning

Features

Everything you need from one cloud API: no infra, no queues, no cron jobs.

Intelligent Policy Chain

Live scoring by latency, failure ratio & cost; budget caps; fairness routing.
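
To make the chain concrete, here is a minimal TypeScript sketch of the idea; the Policy type and the budgetCap, scoreByLatency, and runChain names are illustrative assumptions, not the published SDK surface.

// Hypothetical policy chain: each policy narrows the candidate set (illustrative only)
type Provider = { name: string; p95Ms: number; failureRatio: number; costPer1K: number };
type Policy = (candidates: Provider[]) => Provider[];

const budgetCap = (maxCostPer1K: number): Policy => (ps) =>
  ps.filter((p) => p.costPer1K <= maxCostPer1K);
const scoreByLatency: Policy = (ps) => [...ps].sort((a, b) => a.p95Ms - b.p95Ms);

// Run the chain in order; whatever survives is eligible for routing.
const runChain = (policies: Policy[], ps: Provider[]): Provider[] =>
  policies.reduce((acc, policy) => policy(acc), ps);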

Usage & Limits

Plan-based quotas, per-key rate limits, and simple dashboards.

Security

Key scoping, short-lived logs, region pinning, and BYOK (bring your own provider keys).

Providers

OpenAI & Anthropic out of the box; optional connectors for local endpoints.

Observability

Per-provider cost/latency metrics and request logs (PII-safe).

SLA

Premium plans include uptime SLAs and support.

How it Works

  1. Score

    Measure p50/p95 latency, failure ratio & estimated cost per 1K tokens for each provider.

  2. Cap

    Enforce per-request and per-plan budgets before routing.

  3. Route

    Round-robin across healthy, in-budget providers with transparent fallback (sketched below).
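
Putting the three steps together, a minimal sketch of the routing loop in TypeScript. The stat shapes and score weights below are placeholder assumptions for illustration, not LLMux's actual formula.

// Illustrative score → cap → route loop (assumed data shapes & arbitrary weights)
type Stats = { p50Ms: number; p95Ms: number; failureRatio: number; costPer1K: number };
type Candidate = { name: string; stats: Stats; healthy: boolean };

// 1. Score: lower is better; the weights are placeholders.
const score = (s: Stats) =>
  0.5 * s.p95Ms + 0.3 * s.failureRatio * 1000 + 0.2 * s.costPer1K * 100;

// 2. Cap + 3. Route: keep healthy, in-budget providers and rotate through them,
// best score first, so a failing provider falls back to the next on retry.
let cursor = 0;
function route(candidates: Candidate[], maxCostPer1K: number): Candidate {
  const eligible = candidates
    .filter((c) => c.healthy && c.stats.costPer1K <= maxCostPer1K)
    .sort((a, b) => score(a.stats) - score(b.stats));
  if (eligible.length === 0) throw new Error("no eligible provider");
  return eligible[cursor++ % eligible.length];
}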

Pricing

Start free. Pay as you grow.

Free

$0

  • 50K routed tokens/mo
  • 1 project · 2 keys
  • Community support

Pro

$29/mo

  • 2M routed tokens/mo
  • 10 projects · 25 keys
  • Basic dashboards
  • Email support

Team

$149/mo

  • SSO & role-based keys
  • Unlimited projects
  • Priority support & SLA

* Provider usage is billed pass-through to your own provider account when BYOK is enabled.

Get Started

1) Create an API key

  1. Sign up → create a project
  2. Generate a server key
  3. Optionally set a monthly budget & region

2) Install SDK
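
Assuming the registry names match the import paths shown below (an assumption; confirm against the docs):

npm install @llmux/sdk   # JavaScript/TypeScript
pip install llmux        # Python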

// JavaScript/TypeScript
import { LLMux } from "@llmux/sdk";
const llm = new LLMux({ apiKey: process.env.LLMUX_KEY });
const out = await llm.chat([{ role: "user", content: "Hello" }]);

# Python
import os
from llmux import LLMux

llm = LLMux(api_key=os.getenv("LLMUX_KEY"))
out = llm.chat([{"role": "user", "content": "Hello"}])

Endpoints

  • POST https://api.llmux.app/v1/chat – chat routed to best provider
  • GET https://api.llmux.app/v1/usage – plan & usage summary
  • GET https://api.llmux.app/v1/providers – health & pricing snapshot
  • GET https://api.llmux.app/v1/keys – list & scope API keys
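
If you call the REST API directly instead of using an SDK, a minimal sketch looks like this. Bearer-token auth and the OpenAI-style message body are assumptions here; consult the API reference for the exact schema.

// Direct REST calls (assumed: Bearer auth header, OpenAI-style messages)
const headers = {
  Authorization: `Bearer ${process.env.LLMUX_KEY}`,
  "Content-Type": "application/json",
};

const chat = await fetch("https://api.llmux.app/v1/chat", {
  method: "POST",
  headers,
  body: JSON.stringify({ messages: [{ role: "user", content: "Hello" }] }),
}).then((r) => r.json());

const usage = await fetch("https://api.llmux.app/v1/usage", { headers })
  .then((r) => r.json());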

FAQ

Do I need to run servers?

No. LLMux is fully managed. Install the SDK and call our cloud API.

What about data privacy?

We don’t retain prompts or outputs by default. Region pinning and short-lived logs are available on paid plans.

Can I bring my own provider keys (BYOK)?

Yes. You may route through your own provider accounts while still using LLMux policies and metrics.
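
As an illustration only, BYOK could look like passing provider keys at client construction; the providerKeys option is hypothetical, not a documented SDK parameter.

import { LLMux } from "@llmux/sdk";

const llm = new LLMux({
  apiKey: process.env.LLMUX_KEY,
  // Hypothetical option: route through your own provider accounts
  providerKeys: {
    openai: process.env.OPENAI_API_KEY,
    anthropic: process.env.ANTHROPIC_API_KEY,
  },
});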

Is there a CLI?

Planned for key management and usage summaries. Most users only need the SDK.