All features

Everything you need to master AI costs.

From a single-line proxy swap to enterprise governance and distributed tracing — ModelSpend grows with your team.

Core routing

OpenAI-compatible proxy

Point your existing OpenAI or Anthropic SDK at api.modelspend.best/proxy/v1. Every existing SDK call — streaming, tools, vision, function calling — works identically. The model parameter becomes a routing hint.

Streaming SSE with OpenAI delta format
Function calling and tool use pass-through
Cost metadata in x-modelspend-* headers
Model-to-tier hint table (50+ model patterns)

# Environment variable only
OPENAI_BASE_URL=https://api.modelspend.best/proxy/v1
OPENAI_API_KEY=msp_live_...

# Then use OpenAI SDK normally
from openai import OpenAI
client = OpenAI()  # reads env vars
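
Cost metadata is returned in response headers that share the x-modelspend- prefix. A minimal sketch of collecting those headers from any response follows. Only the prefix is documented on this page, so the specific header names and values in the example (cost-usd, tier) are hypothetical placeholders.

```python
from typing import Mapping


def modelspend_metadata(headers: Mapping[str, str]) -> dict[str, str]:
    """Return only the x-modelspend-* headers, keyed without the prefix."""
    prefix = "x-modelspend-"
    return {
        name.lower()[len(prefix):]: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)  # header names are case-insensitive
    }


# Hypothetical header names and values, for illustration only.
example_headers = {
    "Content-Type": "application/json",
    "X-ModelSpend-Cost-Usd": "0.0003",
    "x-modelspend-tier": "economy",
}
print(modelspend_metadata(example_headers))
```

With the OpenAI Python SDK, raw response headers can be reached through client.chat.completions.with_raw_response.create(...), whose result exposes .headers alongside .parse() for the usual completion object.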

New · v3.1

Evaluation framework

Before you switch routing configurations, prove that quality is maintained. Upload a dataset of representative prompts, run them against multiple models in parallel, and score the outputs with an LLM-as-judge.

Scores are tracked over time so you can see quality trends as providers update their models.

CSV or API dataset upload (up to 1,000 items)
Run against up to 6 models in parallel
LLM-as-judge scoring (0.0–1.0) with reasoning
Exact match mode for deterministic tasks
Per-model quality × cost × latency comparison
Link eval runs to prompt versions
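
To make exact match mode concrete, here is a rough sketch of how a deterministic scorer could work. It is not ModelSpend's implementation: the whitespace trimming and the sample rows are assumptions made for illustration.

```python
def exact_match_score(expected: str, actual: str) -> float:
    """Return 1.0 if the outputs match after trimming whitespace, else 0.0."""
    # Trimming is an assumption; a stricter scorer could compare raw strings.
    return 1.0 if expected.strip() == actual.strip() else 0.0


def mean_score(pairs: list[tuple[str, str]]) -> float:
    """Average exact-match score over (expected, actual) pairs."""
    if not pairs:
        return 0.0
    return sum(exact_match_score(e, a) for e, a in pairs) / len(pairs)


# Made-up dataset rows: two matches and one case mismatch.
rows = [("42", "42"), ("yes", " yes\n"), ("Paris", "paris")]
print(mean_score(rows))
```

Exact match suits tasks with a single correct answer, such as classification labels or extracted IDs. Free-form answers are where the judge-based score is the better fit.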

Sample eval result

Model	Score	Cost
gpt-4o-mini	91	$0.0003
claude-haiku	88	$0.0004
gemini-flash	83	$0.0002
llama-4-scout	79	$0.0001
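
One way to read a result like the sample above is to set a quality floor and pick the cheapest model that clears it. The sketch below uses the sample figures. The EvalResult type, the cheapest_meeting helper, and the threshold of 85 are hypothetical and not part of any ModelSpend API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvalResult:
    model: str
    score: int       # quality score as shown in the sample result
    cost_usd: float  # cost as shown in the sample result


# Figures copied from the sample eval result.
SAMPLE = [
    EvalResult("gpt-4o-mini", 91, 0.0003),
    EvalResult("claude-haiku", 88, 0.0004),
    EvalResult("gemini-flash", 83, 0.0002),
    EvalResult("llama-4-scout", 79, 0.0001),
]


def cheapest_meeting(results: list[EvalResult], min_score: int) -> Optional[EvalResult]:
    """Return the lowest-cost result whose score is at least min_score."""
    eligible = [r for r in results if r.score >= min_score]
    return min(eligible, key=lambda r: r.cost_usd) if eligible else None


choice = cheapest_meeting(SAMPLE, 85)
print(choice.model if choice else "no model meets the bar")
```

Lowering the floor to 80 would select gemini-flash instead, which is the kind of trade-off the per-model comparison is meant to expose.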

New · v3.1

customer-support · v1.4.0

1.4.0	production	Today
1.3.0	archived	3 days ago
1.2.1	archived	1 week ago
1.2.0-draft	draft	Now editing

Prompt registry

System prompts are code. Treat them like it. Semantic versioning, diff views, a staging workflow, and rollback — the same controls you have on your application code.

Semantic versioning (major.minor.patch)
draft → staging → production promotion
Line-level diff between any two versions
One-click rollback to any previous version
Link to eval runs for quality validation
Token count tracking per version
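
The versioning and diff ideas above can be sketched with the Python standard library. This is only an illustration of the concepts: the prompt text is invented, and parse_version is a hypothetical helper that handles plain major.minor.patch strings but not suffixes such as -draft.

```python
import difflib


def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'major.minor.patch' into integers so versions sort numerically."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


versions = ["1.2.1", "1.4.0", "1.3.0"]
latest = max(versions, key=parse_version)

# Invented prompt text for two versions of the same system prompt.
old_prompt = "You are a support agent.\nBe concise.\n"
new_prompt = "You are a support agent.\nBe concise and friendly.\n"

# Line-level diff between the two versions.
diff = list(difflib.unified_diff(
    old_prompt.splitlines(),
    new_prompt.splitlines(),
    fromfile="1.3.0",
    tofile="1.4.0",
    lineterm="",
))

print(latest)
print("\n".join(diff))
```

Comparing versions as integer tuples avoids the string-sort pitfall where "1.10.0" would sort before "1.9.0".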

New · v3.1

OpenTelemetry traces

Every execute call emits a distributed trace with child spans for each stage of the pipeline. Export to your existing observability stack via OTLP. Debug exactly why a specific request was expensive, slow, or blocked.

Root span per request + child spans per stage
Attributes: cost, tokens, tier, provider, model
OTLP HTTP export to any collector
Native integrations: Jaeger, Tempo, Datadog, New Relic, Honeycomb
30-day rolling retention in ModelSpend
In-dashboard trace viewer with waterfall

modelspend.execute 1247ms
  modelspend.governance 3ms
  modelspend.dlp.scan 8ms
  modelspend.routing.decision 12ms
  modelspend.budget.check 5ms
  modelspend.bridge.execute 1219ms
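
A waterfall like the sample above shows where the time in a request goes. The sketch below works through the sample durations. It assumes the child spans run sequentially and that modelspend.bridge.execute is the upstream provider call, neither of which is stated on this page. The helper names are hypothetical.

```python
# Durations in milliseconds, copied from the sample waterfall.
root_ms = 1247
child_spans_ms = {
    "modelspend.governance": 3,
    "modelspend.dlp.scan": 8,
    "modelspend.routing.decision": 12,
    "modelspend.budget.check": 5,
    "modelspend.bridge.execute": 1219,
}


def slowest_stage(spans: dict[str, int]) -> str:
    """Return the name of the child span with the longest duration."""
    return max(spans, key=spans.get)


def overhead_ms(spans: dict[str, int], provider_span: str = "modelspend.bridge.execute") -> int:
    """Sum every stage except the one assumed to be the provider call."""
    return sum(ms for name, ms in spans.items() if name != provider_span)


print(slowest_stage(child_spans_ms))
print(overhead_ms(child_spans_ms))
print(sum(child_spans_ms.values()) == root_ms)
```

Under those assumptions the pre-flight stages add 28 ms in this sample, and the remainder of the 1247 ms is spent in the bridge span.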

Ready to reduce avoidable AI spend?

Start for free — setup takes 4 minutes. No infrastructure changes.


How ModelSpend compares

Purpose-built for AI spend control

Basic dashboards show you what you spent. ModelSpend tells you why, routes you to the right model, and enforces guardrails before the bill lands.

Capability	Provider dashboard	Generic gateway	LLM observability	ModelSpend
OpenAI-compatible routing gateway	Not applicable	Available	Not primary focus	Available
Cost-aware model routing	Not applicable	Partial	Not primary focus	Available
Workflow-level unit economics	Partial	Not primary focus	Partial	Available
Budget caps and approval controls	Not applicable	Not primary focus	Not primary focus	Available (Business+)
Provider health and fallback routing	Not applicable	Partial	Not primary focus	Available
Direct billing / import reconciliation	Available	Not primary focus	Partial	Available
Team and project chargeback	Not applicable	Not primary focus	Partial	Available (Enterprise)
Prompt caching and reuse savings	Not applicable	Partial	Not primary focus	Available
Security and audit trail	Partial	Partial	Partial	Available
OpenTelemetry export pipeline	Not applicable	Partial	Available	Available (Enterprise)
SIEM and DLP integration	Not applicable	Not primary focus	Partial	Available (Enterprise)
SSO/SAML, SCIM, private deployment	Not applicable	Not primary focus	Not primary focus	Available (Enterprise)

Available (Business+) and Available (Enterprise) are shipped but gated to the named plan or above; see Pricing. Planned features are being prioritised with Enterprise Design Partners and do not yet exist in the product.
