Burn sits between your app and any LLM. It classifies every prompt, picks the cheapest model that can handle it, and logs exactly what you spent. Cloud or local, one routing layer.
Burn reads your prompt and classifies it: SIMPLE, MEDIUM, COMPLEX, or CRITICAL. No guessing.
Based on the tier, Burn picks the optimal model. Simple tasks go to cheap fast models. Complex reasoning goes to frontier models.
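The classify-then-route flow can be sketched as a two-step lookup. This is a hypothetical illustration, not Burn's actual classifier: the keyword/length heuristic and the tier-to-model table below are stand-ins, using model names from the pricing table.

```python
# Hypothetical sketch of tier classification and model routing.
# The heuristic and tier-to-model mapping are illustrative only.

TIER_MODELS = {
    "SIMPLE": "llama-3.3-70b",    # cheap, fast
    "MEDIUM": "deepseek-v3",      # balanced
    "COMPLEX": "claude-sonnet",   # capable
    "CRITICAL": "claude-opus",    # best available
}

def classify(prompt: str) -> str:
    """Toy stand-in for Burn's prompt classifier."""
    p = prompt.lower()
    if any(w in p for w in ("prove", "legal", "medical", "production incident")):
        return "CRITICAL"
    if any(w in p for w in ("debug", "refactor", "analyze")):
        return "COMPLEX"
    if len(prompt) > 200:
        return "MEDIUM"
    return "SIMPLE"

def route(prompt: str) -> str:
    """Pick the cheapest model whose tier can handle the prompt."""
    return TIER_MODELS[classify(prompt)]

print(route("Summarize this email"))       # -> llama-3.3-70b
print(route("Debug this race condition"))  # -> claude-sonnet
```

A real classifier would weigh more signals than keywords and length, but the shape is the same: every prompt maps to exactly one tier, and every tier maps to one preferred model.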
Every call is logged to SQLite. See exactly what you spent, what you saved, and where your budget is going.
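Per-call logging reduces to one insert per request plus aggregate queries. A minimal sketch, assuming a hypothetical `calls` schema (Burn's actual table layout may differ):

```python
# Sketch of per-call cost logging to SQLite. The `calls` schema and
# `log_call` helper are hypothetical, not Burn's real internals.
import sqlite3

conn = sqlite3.connect(":memory:")  # Burn would use a file on disk
conn.execute("""
    CREATE TABLE IF NOT EXISTS calls (
        ts         TEXT DEFAULT CURRENT_TIMESTAMP,
        model      TEXT,
        tier       TEXT,
        tokens_in  INTEGER,
        tokens_out INTEGER,
        cost_usd   REAL
    )
""")

def log_call(model, tier, tokens_in, tokens_out, price_per_1m):
    """Record one LLM call and its dollar cost."""
    cost = (tokens_in + tokens_out) / 1_000_000 * price_per_1m
    conn.execute(
        "INSERT INTO calls (model, tier, tokens_in, tokens_out, cost_usd) "
        "VALUES (?, ?, ?, ?, ?)",
        (model, tier, tokens_in, tokens_out, cost),
    )
    conn.commit()
    return cost

log_call("llama-3.3-70b", "SIMPLE", 1_200, 300, 0.10)
total = conn.execute("SELECT SUM(cost_usd) FROM calls").fetchone()[0]
print(f"total spend: ${total:.6f}")  # total spend: $0.000150
```

Because every row carries model, tier, and cost, any budget question afterwards is a single SQL query.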
Four tiers: SIMPLE (fast/cheap), MEDIUM (balanced), COMPLEX (capable), CRITICAL (best available). Burn classifies in milliseconds.
Route to any cloud model via OpenRouter, or run locally with Ollama and LM Studio. One interface, every model.
Live savings banner. Actual spend. Request count. Routing distribution by model. Cost by task type. All from your SQLite DB.
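Each of those dashboard numbers is an aggregate over the call log. A sketch in plain SQL against a hypothetical `calls` table (Burn's real schema and queries may differ):

```python
# Sketch: dashboard aggregates as plain SQL over a hypothetical
# `calls` table seeded with example rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (model TEXT, tier TEXT, cost_usd REAL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        ("llama-3.3-70b", "SIMPLE", 0.0002),
        ("llama-3.3-70b", "SIMPLE", 0.0003),
        ("claude-sonnet", "COMPLEX", 0.0090),
    ],
)

# Actual spend and request count
spend, count = conn.execute(
    "SELECT SUM(cost_usd), COUNT(*) FROM calls"
).fetchone()

# Routing distribution by model
dist = conn.execute(
    "SELECT model, COUNT(*) FROM calls GROUP BY model ORDER BY model"
).fetchall()

print(f"spend ${spend:.4f} across {count} requests")
print(dist)
```

Cost by task type is the same `GROUP BY`, just on `tier` instead of `model`.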
Route simple tasks to Ollama or LM Studio — completely free. Burn tracks local calls too, so you see the full picture.
| Tier | Model | Provider | Cost (per 1M tokens) | Best For |
|---|---|---|---|---|
| Budget | llama-3.3-70b | OpenRouter | $0.10 | Simple tasks, summaries |
| Standard | deepseek-v3 | OpenRouter | $0.27 | Code, analysis |
| Standard | claude-haiku | Anthropic | $0.80 | Balanced tasks |
| Premium | claude-sonnet | Anthropic | $3 | Complex reasoning |
| Premium | gpt-4o | OpenAI | $5 | Multimodal tasks |
| Elite | claude-opus | Anthropic | $15 | Critical, highest stakes |
| Local | Ollama / LM Studio | Local | Free | Any task, private |
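The prices above make the savings easy to reason about. A worked example under an assumed (hypothetical) monthly traffic mix — real workloads will differ:

```python
# Worked example of the savings implied by the pricing table,
# under an assumed traffic mix. Prices are $ per 1M tokens.
PRICE = {
    "llama-3.3-70b": 0.10,
    "deepseek-v3": 0.27,
    "claude-sonnet": 3.0,
    "claude-opus": 15.0,
}

# Hypothetical mix: millions of tokens per model after routing
routed = {
    "llama-3.3-70b": 6.0,
    "deepseek-v3": 2.5,
    "claude-sonnet": 1.0,
    "claude-opus": 0.5,
}

routed_cost = sum(PRICE[m] * mtok for m, mtok in routed.items())
naive_cost = PRICE["claude-opus"] * sum(routed.values())  # everything on opus

print(f"routed ${routed_cost:.2f} vs all-opus ${naive_cost:.2f} "
      f"({1 - routed_cost / naive_cost:.0%} saved)")
```

With most tokens landing on the budget tier, routing costs a small fraction of sending every request to the top model, which is the whole pitch.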