LLM API Cost Calculator
Estimate your monthly AI spend. Adjust any value and the total updates instantly.
Estimated cost: $51.50 per month
Annualized: $618.00 per year
API only: $1.05 / day
100 req/day x 30 days = 3,000 req/mo
Prices verified June 2026. LLM API pricing changes frequently. Confirm current rates before budgeting: Anthropic, OpenAI, Google.
Estimates are illustrative. Your actual cost depends on caching, batching, and usage patterns.
How the estimate is calculated
The LLM API cost formula is straightforward: multiply your average input token count by the model's input price, add your average output token count multiplied by the output price, and divide by one million to get the cost per request. Then multiply by your daily request volume and by 30 days to get the monthly figure.
monthly API cost = ((inputTokens * inputPricePerM + outputTokens * outputPricePerM) / 1,000,000) * requestsPerDay * 30
Add your monthly hosting and SaaS subscription costs to get a full picture of what your AI stack costs to run.
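The formula above can be sketched in Python. The function and variable names are illustrative and not taken from the calculator itself. The sample inputs (1,000 input tokens and 500 output tokens per request at the Claude Sonnet 4.6 rates, plus $20 of fixed monthly costs) are assumptions chosen because they happen to reproduce the $1.05/day and $51.50/month figures shown at the top of the page.

```python
def monthly_api_cost(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m,
                     requests_per_day, days_per_month=30):
    """Monthly API cost in dollars; prices are in $ per 1M tokens."""
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_day * days_per_month


# Assumed example inputs (not confirmed as the calculator's defaults).
api = monthly_api_cost(
    input_tokens=1000, output_tokens=500,
    input_price_per_m=3.0, output_price_per_m=15.0,  # Claude Sonnet 4.6 rates
    requests_per_day=100,
)
fixed = 20.0  # assumed monthly hosting + SaaS subscriptions
total = api + fixed

print(f"API: ${api:.2f}/mo, total: ${total:.2f}/mo, annualized: ${total * 12:.2f}")
```

With these inputs the API portion comes to about $31.50 per month, and adding the assumed $20 of fixed costs gives the $51.50 monthly total.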
Current model prices
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|
| Claude Fable 5 | Anthropic | $10 | $50 |
| Claude Opus 4.8 | Anthropic | $5 | $25 |
| Claude Opus 4.7 | Anthropic | $5 | $25 |
| Claude Sonnet 4.6 | Anthropic | $3 | $15 |
| Claude Haiku 4.5 | Anthropic | $1 | $5 |
| GPT-5.5 | OpenAI | $5 | $30 |
| GPT-5.4 | OpenAI | $2.5 | $15 |
| GPT-5 mini | OpenAI | $0.25 | $2 |
| GPT-5 nano | OpenAI | $0.05 | $0.4 |
| Gemini 3.1 Pro | Google | $2 | $12 |
| Gemini 3.5 Flash | Google | $1.5 | $9 |
| Gemini 3.1 Flash-Lite | Google | $0.1 | $0.4 |
| Grok 4.3 | xAI | $1.25 | $2.5 |
| Grok 4.1 Fast | xAI | $0.2 | $0.5 |
| DeepSeek V4 Pro | DeepSeek | $1.74 | $3.48 |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 |
| Mistral Large | Mistral | $2 | $6 |
Prices verified June 2026. LLM pricing changes frequently; always confirm at the provider pricing page before committing to a budget. Standard (non-cached, non-batch) rates shown.
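To compare models for a fixed workload, the table can be loaded into a lookup and ranked by monthly cost. This is a minimal sketch: the `PRICES` dictionary copies a subset of the standard rates listed above, and the workload (1,000 input tokens, 500 output tokens, 100 requests per day) is an assumed example rather than a recommendation.

```python
# Standard rates copied from the table above: (input, output) in $ per 1M tokens.
PRICES = {
    "Claude Opus 4.8": (5.0, 25.0),
    "Claude Sonnet 4.6": (3.0, 15.0),
    "Claude Haiku 4.5": (1.0, 5.0),
    "GPT-5.5": (5.0, 30.0),
    "GPT-5 nano": (0.05, 0.4),
    "Gemini 3.1 Flash-Lite": (0.1, 0.4),
}


def monthly_cost(model, input_tokens, output_tokens, requests_per_day, days=30):
    """Monthly API cost in dollars for one model at standard rates."""
    in_price, out_price = PRICES[model]
    per_request = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_request * requests_per_day * days


# Assumed workload: 1,000 input tokens, 500 output tokens, 100 requests/day.
ranked = sorted(PRICES, key=lambda m: monthly_cost(m, 1000, 500, 100))
for model in ranked:
    print(f"{model:<24} ${monthly_cost(model, 1000, 500, 100):8.2f}/mo")
```

For this workload the cheapest and most expensive entries differ by well over an order of magnitude, which is why matching the model tier to the task matters more than most other optimizations.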
Tips for keeping API costs low
- Use prompt caching. Anthropic and OpenAI both offer cached input at roughly 50-90% off. If your system prompts are large and repeated across requests, caching makes a significant difference.
- Choose the right model tier. For high-volume, lower-complexity tasks, models like GPT-5 nano, Haiku, or Gemini Flash-Lite cut costs by an order of magnitude versus flagship models.
- Use Batch APIs for async work. Most providers offer 50% discounts for batch processing. If latency is not critical, route work through the batch endpoint.
- Trim your prompts. Unnecessary context in system prompts adds up. Audit your inputs periodically and strip anything that does not affect output quality.
- Get the prompt right the first time. A vague prompt burns tokens on retries and back-and-forth. Tighten it with FixMyPrompt before you spend the call, so you pay for one good answer instead of three mediocre ones.
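The effect of the caching and batch tips can be estimated with a small extension of the cost formula. This is a rough sketch under stated assumptions: the 90% cache discount and 50% batch discount are illustrative values taken from the ranges mentioned above, the split of 800 cached and 200 fresh input tokens is hypothetical, and real billing details (such as cache write surcharges or whether the two discounts can be combined) vary by provider and are not modeled here.

```python
def discounted_monthly_cost(cached_input_tokens, fresh_input_tokens, output_tokens,
                            input_price_per_m, output_price_per_m, requests_per_day,
                            cache_discount=0.0, batch_discount=0.0, days=30):
    """Monthly cost in dollars with optional cache and batch discounts.

    cache_discount applies only to the cached share of input tokens;
    batch_discount applies to the whole request. Applying both at once
    is an assumption, since stacking rules are provider-specific.
    """
    input_cost = (cached_input_tokens * (1 - cache_discount)
                  + fresh_input_tokens) * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    per_request = (input_cost + output_cost) / 1_000_000
    return per_request * (1 - batch_discount) * requests_per_day * days


# Hypothetical workload at $3 / $15 per 1M tokens, 100 requests/day.
args = dict(cached_input_tokens=800, fresh_input_tokens=200, output_tokens=500,
            input_price_per_m=3.0, output_price_per_m=15.0, requests_per_day=100)

baseline = discounted_monthly_cost(**args)
with_cache = discounted_monthly_cost(**args, cache_discount=0.9)
with_batch = discounted_monthly_cost(**args, batch_discount=0.5)

print(f"baseline ${baseline:.2f}, cached ${with_cache:.2f}, batch ${with_batch:.2f}")
```

Because caching discounts only input tokens, the saving in this example is modest: output tokens account for most of the per-request cost. Caching pays off most when prompts are long relative to the responses they produce.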