Models

Every LLM, STT, and TTS model — context windows, latency, and pricing.

How to read these tables

Context is the model's context window in tokens. Latency is a rough median (p50) time to first token, used for the builder's latency hint. Input and Output are provider list prices in USD per 1,000,000 tokens, passed through to you at cost with no markup. The platform's margin comes from the token price and subscription spread, not from a usage fee.
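
As a worked example of the per-million pricing, the Python sketch below estimates what a single request costs. The `Price` class and `request_cost_usd` function are hypothetical names invented for this illustration, not part of any platform API. The prices are a few rows copied from the tables on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """List price in USD per 1,000,000 tokens."""
    input_per_1m: float
    output_per_1m: float


# A few rows copied by hand from the tables on this page (illustrative subset).
PRICES = {
    "GPT-5 Mini": Price(0.25, 2.00),
    "Claude Sonnet 4": Price(3.00, 15.00),
    "Llama 3.1 8B Instant": Price(0.05, 0.08),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens times the per-1M price, divided by 1,000,000."""
    price = PRICES[model]
    total = input_tokens * price.input_per_1m + output_tokens * price.output_per_1m
    return total / 1_000_000


# 2,000 input tokens and 500 output tokens on GPT-5 Mini:
# (2,000 * 0.25 + 500 * 2.00) / 1,000,000 = 0.0015 USD
cost = request_cost_usd("GPT-5 Mini", input_tokens=2_000, output_tokens=500)
print(f"${cost:.6f}")
```

Output tokens usually dominate the bill for chat-style workloads, because the output price is several times the input price for most models listed here.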

OpenAI

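| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| GPT-5 | 400K | 750 ms | $1.25 | $10.00 |
| GPT-5 Mini | 400K | 450 ms | $0.25 | $2.00 |
| GPT-5 Nano | 400K | 300 ms | $0.05 | $0.40 |
| GPT-4.1 | 1.05M | 650 ms | $2.00 | $8.00 |
| GPT-4o | 128K | 700 ms | $2.50 | $10.00 |
| GPT-4o Mini | 128K | 390 ms | $0.15 | $0.60 |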

Anthropic (Claude)

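| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Claude Opus 4 | 200K | 900 ms | $15.00 | $75.00 |
| Claude Sonnet 4 | 200K | 600 ms | $3.00 | $15.00 |
| Claude Haiku 4 | 200K | 400 ms | $0.80 | $4.00 |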

Google (Gemini)

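| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Gemini 2.5 Pro | 1.05M | 700 ms | $1.25 | $10.00 |
| Gemini 2.5 Flash | 1.05M | 380 ms | $0.30 | $2.50 |
| Gemini 2.0 Flash | 1.05M | 400 ms | $0.10 | $0.40 |
| Gemini 2.0 Flash-Lite | 1.05M | 320 ms | $0.075 | $0.30 |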

Groq (fastest)

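| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Llama 4 Maverick | 131K | 280 ms | $0.20 | $0.60 |
| Llama 4 Scout | 131K | 230 ms | $0.11 | $0.34 |
| Llama 3.3 70B | 128K | 250 ms | $0.59 | $0.79 |
| Llama 3.1 8B Instant | 128K | 180 ms | $0.05 | $0.08 |
| Qwen 2.5 32B | 128K | 230 ms | $0.79 | $0.79 |

Tip · Groq runs open models on custom hardware. Its fastest entry, Llama 3.1 8B Instant at 180 ms, is the lowest latency listed on this page, which makes Groq the natural choice for real-time voice.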
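
One way to use the latency column is to shortlist models that fit a time-to-first-token budget. The Python sketch below assumes a 300 ms budget, chosen only for illustration. The `within_budget` helper is a hypothetical name, and the rows are a hand-copied subset of the tables above.

```python
# (model, provider, rough p50 time-to-first-token in ms),
# copied by hand from the tables above (illustrative subset).
MODELS = [
    ("GPT-5 Nano", "OpenAI", 300),
    ("GPT-4o Mini", "OpenAI", 390),
    ("Claude Haiku 4", "Anthropic", 400),
    ("Gemini 2.0 Flash-Lite", "Google", 320),
    ("Llama 4 Scout", "Groq", 230),
    ("Llama 3.1 8B Instant", "Groq", 180),
]


def within_budget(models, budget_ms):
    """Return the models at or under the latency budget, fastest first."""
    fitting = [m for m in models if m[2] <= budget_ms]
    return sorted(fitting, key=lambda m: m[2])


# Hypothetical 300 ms budget for a real-time voice agent.
shortlist = within_budget(MODELS, budget_ms=300)
for name, provider, latency_ms in shortlist:
    print(f"{name} ({provider}): {latency_ms} ms")
```

The listed latencies are rough p50 values, so a model that sits exactly on the budget will exceed it on roughly half of its requests. Leaving some headroom is a reasonable precaution.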

xAI (Grok)

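| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Grok 4 | 256K | 800 ms | $3.00 | $15.00 |
| Grok 3 | 131K | 700 ms | $3.00 | $15.00 |
| Grok 3 Mini | 131K | 450 ms | $0.30 | $0.50 |
| Grok 2 Vision | 32K | 750 ms | $2.00 | $10.00 |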

Mistral

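| Model | Context (tokens) | Latency (p50) | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Mistral Large | 128K | 600 ms | $2.00 | $6.00 |
| Mistral Small | 32K | 350 ms | $0.20 | $0.60 |
| Codestral | 256K | 350 ms | $0.30 | $0.90 |
| Ministral 8B | 128K | 250 ms | $0.10 | $0.10 |
| Pixtral Large | 128K | 600 ms | $2.00 | $6.00 |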

OpenRouter

OpenRouter is a meta-provider: a single key routes to GPT-5, Claude Sonnet 4, Gemini 2.5 Flash, Llama 4 Maverick, and hundreds more. Pricing matches the underlying model. Use it to reach a model you don't have a direct key for.

Speech-to-Text (STT)

| Provider | Models | Cost |
|---|---|---|
| Deepgram | Nova-3, Nova-2, Nova-2 Phone Call, Enhanced, Base | ≈ $0.0048 / min |
| AssemblyAI | Best, Nano | Usage-based |

Text-to-Speech (TTS)

| Provider | Models | Cost |
|---|---|---|
| ElevenLabs | Flash v2.5, Turbo v2.5, Multilingual v2, Monolingual v1 | ≈ $0.05 / 1k chars (~$0.04/min) |
| OpenAI | GPT-4o mini TTS, TTS-1, TTS-1 HD | Usage-based |
| Deepgram Aura | Aura-2, Aura | Usage-based |
| Cartesia | Sonic-2, Sonic | Usage-based |
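
The ElevenLabs per-minute figure follows from the per-character price once a speaking rate is fixed. The Python sketch below assumes about 800 characters per minute. That rate is back-solved from the two numbers in the table ($0.04 ÷ $0.05 per 1k chars), not stated by the platform. It then adds the Deepgram STT rate for a rough upper bound on speech cost per minute of conversation, excluding LLM tokens. All names are hypothetical.

```python
STT_USD_PER_MIN = 0.0048        # Deepgram, from the STT table
TTS_USD_PER_1K_CHARS = 0.05     # ElevenLabs, from the TTS table
ASSUMED_CHARS_PER_MIN = 800     # assumption: inferred from the ~$0.04/min figure


def tts_cost_per_min(chars_per_min: float = ASSUMED_CHARS_PER_MIN) -> float:
    """TTS cost for one minute of continuous synthesized speech."""
    return chars_per_min / 1000 * TTS_USD_PER_1K_CHARS


def speech_cost_per_min(agent_talk_fraction: float = 1.0) -> float:
    """STT runs for the full minute; TTS is billed only while the agent speaks."""
    return STT_USD_PER_MIN + agent_talk_fraction * tts_cost_per_min()


print(f"TTS only:             ${tts_cost_per_min():.4f} / min")
print(f"STT + TTS (max):      ${speech_cost_per_min(1.0):.4f} / min")
print(f"STT + TTS (50% talk): ${speech_cost_per_min(0.5):.4f} / min")
```

In a real call the agent speaks for only part of each minute, so the 50% case is likely closer to typical usage than the upper bound.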