Now available · v3.0 with enterprise features

Stop paying GPT-4 prices
for GPT-4o-mini tasks.

ModelSpend routes every AI prompt to the cheapest capable model automatically — saving 40–70% on AI costs with a single environment variable change. No rewrites.

✓ No credit card required · ✓ OpenAI SDK compatible · ✓ SOC 2 in progress · ✓ Self-hostable · ✓ First call in 4 minutes
The only change you need to make
# Before: paying GPT-4 prices for everything
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After: routes to cheapest capable model
from modelspend import ModelSpend
client = ModelSpend(api_key="msp_live_...")

# Everything else stays identical ✓
response = client.chat.completions.create(
    model="gpt-4o",  # used as routing hint only
    messages=[{"role": "user", "content": prompt}]
)
Avg saving: 62%

$2.4M+ AI costs saved
140M+ Prompts routed
62% Average cost reduction
12 Providers supported

How it works

Route once. Save forever.

ModelSpend analyses each prompt's complexity and routes it to the cheapest model that meets your quality requirements.

01

Proxy intercepts

Point your OpenAI SDK at our endpoint. Zero other changes.

02

Complexity scores

Each prompt is scored for complexity: is it a simple lookup or a complex reasoning task?
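
This page does not describe how the score is computed. Purely as an illustration, a toy heuristic could combine prompt length with the presence of reasoning keywords. The function name, keyword list and weights below are hypothetical and are not ModelSpend's algorithm.

```python
import re

# Hypothetical signal words; a real scorer would likely use a trained classifier.
REASONING_HINTS = {"why", "prove", "compare", "design", "analyse", "analyze", "plan"}

def complexity_score(prompt: str) -> float:
    """Return a toy score in [0, 1]; higher means more reasoning-heavy."""
    words = re.findall(r"[a-z']+", prompt.lower())
    if not words:
        return 0.0
    length_part = min(len(words) / 200, 1.0)  # longer prompts score higher
    hint_part = min(sum(w in REASONING_HINTS for w in words) / 3, 1.0)
    return round(0.4 * length_part + 0.6 * hint_part, 3)
```

A short factual question scores near zero here, while a prompt that asks to compare, prove or plan scores much higher.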

03
🛡

Policy checks

Budget caps, DLP scanning, governance rules, approval workflows.
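
To make the DLP step concrete, here is a minimal sketch of a pre-flight scan that flags prompts containing sensitive-looking strings. The two patterns and the function name are assumptions chosen for illustration; they are not ModelSpend's rule set, and real DLP needs much broader coverage.

```python
import re

# Illustrative patterns only (assumed, not taken from ModelSpend).
DLP_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def dlp_findings(prompt: str) -> list[str]:
    """Return the names of the patterns that match the prompt."""
    return [name for name, pattern in DLP_PATTERNS.items() if pattern.search(prompt)]
```

A policy layer could block the call, redact the match, or send it to an approval workflow whenever the returned list is non-empty.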

04

Route to cheapest

Execute on the best-value model. Log cost, latency, quality.
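
The routing decision itself can be sketched as "filter by quality, then take the cheapest". The model names, prices and quality scores below are placeholders invented for the example, not real quotes or benchmark results.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    usd_per_million_tokens: float  # placeholder price, not a real quote
    quality: float                 # placeholder score in [0, 1]

def cheapest_capable(candidates, min_quality):
    """Pick the lowest-priced candidate whose quality meets the threshold."""
    capable = [c for c in candidates if c.quality >= min_quality]
    if not capable:
        raise ValueError("no candidate meets the quality requirement")
    return min(capable, key=lambda c: c.usd_per_million_tokens)

models = [
    Candidate("small-model", 0.5, 0.70),
    Candidate("mid-model", 5.0, 0.85),
    Candidate("large-model", 30.0, 0.95),
]
```

With these placeholder numbers, a quality requirement of 0.8 selects "mid-model", and relaxing it to 0.6 selects "small-model".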

Intelligent tier routing

Not all tasks need the same model. ModelSpend routes each to the right tier automatically.

Cheap Read
Lookup, extraction, classification, summarisation
Claude Haiku · Gemini Flash · Groq Llama
~85% cheaper

Balanced Build
Code generation, analysis, multi-step reasoning
GPT-4o · Claude Sonnet · Gemini Pro
~40% cheaper

Deep Critical
Complex research, legal review, board-level decisions
o3 · Claude Opus · GPT-4.5
Full quality

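
The tier list above can be read as a lookup from task type to tier. A minimal sketch, using the task categories exactly as listed; the function name and the fallback for unknown task types are assumptions.

```python
# Task categories per tier, copied from the tier list above.
TIER_TASKS = {
    "Cheap Read": {"lookup", "extraction", "classification", "summarisation"},
    "Balanced Build": {"code generation", "analysis", "multi-step reasoning"},
    "Deep Critical": {"complex research", "legal review", "board-level decisions"},
}

def tier_for(task_type: str) -> str:
    """Return the tier whose task list contains task_type."""
    for tier, tasks in TIER_TASKS.items():
        if task_type.lower() in tasks:
            return tier
    # Assumption: unknown task types fall back to the full-quality tier.
    return "Deep Critical"
```

Falling back to the most capable tier trades cost for safety when a task cannot be classified; a cost-first deployment might choose the opposite default.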
Everything included

Built for engineering teams.
Trusted by enterprise.

ZERO CODE CHANGES

Drop-in proxy

OpenAI-compatible endpoint. Change one env var. Your existing code works immediately.
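
As a sketch of the one-variable change: version 1 and later of the official OpenAI Python SDK reads its base URL from the `OPENAI_BASE_URL` environment variable, so pointing that variable at the proxy is enough to reroute an unmodified client. The endpoint URL below is a made-up placeholder; use the one from your own ModelSpend account.

```python
# Hypothetical endpoint, shown only to illustrate the shape of the change.
MODELSPEND_BASE_URL = "https://api.modelspend.example/v1"

def with_modelspend_proxy(env, base_url=MODELSPEND_BASE_URL):
    """Return a copy of env with the OpenAI SDK base URL pointed at the proxy."""
    updated = dict(env)
    updated["OPENAI_BASE_URL"] = base_url  # read by the OpenAI Python SDK v1+
    return updated
```

Passing `os.environ` through this helper before launching a worker process leaves the application code itself untouched.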

📊

Real-time cost analytics

USD cost per call, per model, per team member. Not estimates — actual invoice-matching figures.
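
Per-call cost is token counts multiplied by per-token prices. The sketch below assumes prices are quoted per million tokens and uses `Decimal` so that totals do not drift; the function names and the shape of the call records are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

def call_cost(prompt_tokens, completion_tokens, usd_in_per_m, usd_out_per_m):
    """USD cost of one call, given per-million-token input and output prices."""
    million = Decimal(1_000_000)
    return (Decimal(prompt_tokens) * Decimal(usd_in_per_m)
            + Decimal(completion_tokens) * Decimal(usd_out_per_m)) / million

def cost_by(calls, key):
    """Sum the 'usd' field of each call record, grouped by the given key."""
    totals = defaultdict(Decimal)
    for call in calls:
        totals[call[key]] += call["usd"]
    return dict(totals)
```

Grouping the same records by "model" or by "team" gives the per-model and per-team views without recomputing anything.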

NEW
🔬

Evaluation framework

Upload a test dataset, run it against multiple models, score quality with LLM-as-judge. Know before you switch.
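
The evaluation loop can be sketched as "run every prompt through every model, then average the judge's scores". Everything here is a stand-in: in practice `models` would wrap real API calls and `judge` would be an LLM-as-judge call, whereas this sketch accepts any callables.

```python
from statistics import mean

def evaluate(dataset, models, judge):
    """Return the mean judge score per model over the dataset.

    dataset: list of (prompt, reference) pairs
    models:  dict of name -> callable(prompt) -> answer
    judge:   callable(prompt, reference, answer) -> score in [0, 1]
    """
    return {
        name: mean([judge(p, ref, generate(p)) for p, ref in dataset])
        for name, generate in models.items()
    }
```

Comparing the resulting scores side by side with each model's cost is what lets you decide whether a cheaper model is good enough before switching.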

NEW
📦

Prompt registry

Semantic versioning for system prompts. Draft → staging → production promotion workflow with rollback.
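
A minimal in-memory sketch of the promotion workflow, assuming versions are semantic-version strings and that rollback means "return to the previous production version". The class and method names are hypothetical, not the ModelSpend API.

```python
class PromptRegistry:
    """Tracks prompt versions through draft -> staging -> production."""

    STAGES = ("draft", "staging", "production")

    def __init__(self):
        self._prompts = {}  # version string -> prompt text
        self._stage = {}    # version string -> current stage
        self._live = []     # production history, oldest first

    def add(self, version, text):
        self._prompts[version] = text
        self._stage[version] = "draft"

    def promote(self, version):
        position = self.STAGES.index(self._stage[version])
        if position == len(self.STAGES) - 1:
            raise ValueError(f"{version} is already in production")
        self._stage[version] = self.STAGES[position + 1]
        if self._stage[version] == "production":
            self._live.append(version)

    def rollback(self):
        """Retire the live version and return the one that replaces it."""
        if len(self._live) < 2:
            raise ValueError("no earlier production version to roll back to")
        retired = self._live.pop()
        self._stage[retired] = "staging"
        return self._live[-1]

    def live_prompt(self):
        return self._prompts[self._live[-1]] if self._live else None
```

Keeping the production history as a list makes rollback a single pop, at the cost of not recording who promoted what; an audit trail would need to be layered on top.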

NEW
🔍

OpenTelemetry traces

Every execute call emits spans: routing, DLP, budget, bridge execution. Export to Jaeger, Datadog, New Relic.

ENTERPRISE
🏢

Enterprise governance

SSO, SCIM provisioning, DLP scanning, model access rules, approval workflows, policy-as-code.

🔒

Budget cascade

Hard limits at company → dept → team → user → session → prompt. 6-level cascade enforcement.
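
One way to picture the cascade: before a call runs, its estimated cost is checked against every level from company down to prompt, and the first level that would go over its hard limit blocks the call. The data shapes and function name below are assumptions made for the sketch.

```python
# The six levels named above, outermost first.
LEVELS = ("company", "dept", "team", "user", "session", "prompt")

def first_exceeded(limits, spent, cost):
    """Return the first level whose hard limit the new cost would exceed, else None.

    limits: dict of level -> USD hard limit (levels without a limit are skipped)
    spent:  dict of level -> USD already spent in the current period
    cost:   estimated USD cost of the call being checked
    """
    for level in LEVELS:
        limit = limits.get(level)
        if limit is not None and spent.get(level, 0.0) + cost > limit:
            return level
    return None
```

Returning the offending level, not just a yes or no, lets the caller explain which budget blocked the request.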

🌍

Local models

Ollama, vLLM, or any OpenAI-compatible local server. Route non-sensitive tasks at zero marginal cost.

📋

Audit & compliance

Immutable audit log. CEF/JSONL export to Splunk, Elastic, Datadog. Configurable 90-day retention.
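
How the log is made immutable is not stated here. One common technique, shown purely as an assumption, is a hash chain: each entry stores the hash of the previous one, so editing any earlier entry breaks verification. The helper names are hypothetical, and the export covers JSONL only.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(event, prev_hash):
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_entry(log, event):
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)})

def verify(log):
    """Return True if no entry has been altered, removed or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True

def to_jsonl(log):
    """Serialise the log as one JSON object per line."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in log)
```

A hash chain makes tampering detectable, not impossible; write-once storage or external anchoring would be needed for stronger guarantees.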

Integrations

Works with everything
your team already uses.

Native SDKs for Python and Node.js. VS Code extension. GitHub Action. Slack slash commands. Zapier integration. MCP servers for ChatGPT, Claude, and Gemini.

Python SDK · Node.js SDK · REST API · VS Code · GitHub Action · Slack · Zapier · Claude MCP · ChatGPT MCP
Node.js · 1 line change
// Before
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: key });

// After: routes to cheapest model
import ModelSpend from '@modelspend/sdk';
const client = new ModelSpend({ apiKey: csk });

// Streaming, tools, vision: all work
const stream = await client.chat.completions
  .create({ model: 'gpt-4o', stream: true, ... });

Simple pricing.
Scales with your savings.

The cost of ModelSpend is typically 1–3% of your AI savings. Free tier included.
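
To check that ratio against your own numbers, here is a small worked calculation. It assumes a flat monthly plan price and uses the 62% average saving quoted on this page as a default; your actual saving rate may differ, and the function name is hypothetical.

```python
def monthly_outcome(ai_spend_usd, plan_price_usd, saving_rate=0.62):
    """Estimate gross saving, net saving and the plan fee as a share of the saving."""
    gross_saving = ai_spend_usd * saving_rate
    return {
        "gross_saving": gross_saving,
        "net_saving": gross_saving - plan_price_usd,
        "fee_share_of_saving": plan_price_usd / gross_saving if gross_saving else None,
    }
```

For example, a team spending $10,000 a month on the $79 plan would save about $6,200 before the fee at that rate, so the fee comes to roughly 1.3% of the saving.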

Starter
$19 /month

For individuals and small teams exploring AI cost optimisation.

  • Up to 1,000 executions/month
  • All providers + local Ollama
  • Basic analytics dashboard
  • API access
  • Email support
Start free trial

Most popular
Growth
$79 /month

For growing teams that need governance and deeper cost control.

  • 25,000 AI executions/month
  • Team budgets & governance
  • Evaluation framework
  • Prompt registry + OTel traces
  • Slack alerts + approvals
  • Priority support
Start free trial

Enterprise
$299

For regulated organisations that need security, compliance and scale.

  • Unlimited executions
  • SSO + SCIM provisioning
  • DLP + governance rules
  • OpenTelemetry traces
  • Policy as code
  • SOC 2 docs + BAA
  • Dedicated CSM
Talk to sales

Your AI bill is
too high. Fix it today.

Average customer saves 62% in the first month. Setup takes 4 minutes. No infrastructure changes. No rewrites. Just better routing.

Start saving in 4 minutes → · Calculate your savings