Free tool
AI API Cost Calculator
Estimate your monthly spend on Claude, GPT-4o, Gemini, and other LLM APIs. Enter your token volumes and request rate; the total updates as you type.
Prices verified May 2026. LLM API pricing changes frequently. Confirm current rates before budgeting: Anthropic, OpenAI, Google.
Estimates are illustrative. Your actual cost depends on caching, batching, and usage patterns.
How the estimate is calculated
The LLM API cost formula is straightforward. For one request, multiply the average input token count by the model's input price and the average output token count by its output price, add the two, and divide by one million (prices are quoted per million tokens). Then multiply by your daily request volume and by 30 to get a monthly figure.
monthly API cost = ((inputTokens * inputPricePerM + outputTokens * outputPricePerM) / 1,000,000) * requestsPerDay * 30
Add your monthly hosting and SaaS subscription costs to get a full picture of what your AI stack costs to run.
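As a rough illustration, the formula can be sketched in Python. The function names and the example workload (1,000 input tokens, 500 output tokens, 100 requests per day, $20 of fixed monthly costs) are assumptions chosen for the example; the $3 / $15 rates are the Claude Sonnet 4.6 prices listed on this page.

```python
def monthly_api_cost(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m,
                     requests_per_day, days=30):
    """Monthly API spend in dollars; token counts are per-request averages."""
    # Prices are quoted per 1M tokens, so divide by one million.
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_day * days


def monthly_total(api_cost, fixed_costs=0.0):
    """Add hosting and SaaS subscriptions (hypothetical fixed costs) to the API spend."""
    return api_cost + fixed_costs


# Hypothetical workload priced at $3 input / $15 output per 1M tokens.
api = monthly_api_cost(1000, 500, 3.00, 15.00, requests_per_day=100)
print(f"API only: ${api:.2f} / month")                         # API only: $31.50 / month
print(f"With fixed costs: ${monthly_total(api, 20.00):.2f}")   # With fixed costs: $51.50
```

Keeping the fixed costs in a separate function makes it easy to see how much of the total actually scales with request volume.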
Current model prices
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|
| Claude Opus 4.7 | Anthropic | $15 | $75 |
| Claude Sonnet 4.6 | Anthropic | $3 | $15 |
| Claude Haiku 4.5 | Anthropic | $1 | $5 |
| GPT-5 | OpenAI | $1.25 | $10 |
| GPT-5 mini | OpenAI | $0.25 | $2 |
| GPT-5 nano | OpenAI | $0.05 | $0.4 |
| GPT-4o | OpenAI | $2.5 | $10 |
| GPT-4o mini | OpenAI | $0.15 | $0.6 |
| Gemini 2.5 Pro | Google | $1.25 | $10 |
| Gemini 2.5 Flash | Google | $0.3 | $2.5 |
Prices verified May 2026. LLM pricing changes frequently; always confirm at the provider pricing page before committing to a budget. Standard (non-cached, non-batch) rates shown.
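To compare models for a single workload, the table can be loaded into a dictionary and ranked by estimated monthly cost. This is a minimal sketch: `PRICES` and `compare_models` are illustrative names, the rates are copied from the table above, and the workload (1,000 input tokens, 500 output tokens, 100 requests per day) is an assumption.

```python
# ($ per 1M input tokens, $ per 1M output tokens), copied from the table above.
PRICES = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (1.25, 10.00),
    "GPT-5 mini": (0.25, 2.00),
    "GPT-5 nano": (0.05, 0.40),
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def compare_models(input_tokens, output_tokens, requests_per_day, days=30):
    """Return (model, monthly_cost) pairs sorted from cheapest to most expensive."""
    rows = []
    for model, (in_price, out_price) in PRICES.items():
        per_request = (input_tokens * in_price
                       + output_tokens * out_price) / 1_000_000
        rows.append((model, per_request * requests_per_day * days))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical workload: 1,000 input tokens, 500 output tokens, 100 requests/day.
for model, cost in compare_models(1000, 500, 100):
    print(f"{model:<18} ${cost:>7.2f}/mo")
```

The ranking shifts with the input-to-output ratio, because output tokens cost several times more than input tokens on every model listed.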
Tips for keeping API costs low
- Use prompt caching. Anthropic and OpenAI both offer cached input at discounts of roughly 50% to 90%, depending on the provider and model. If your system prompts are large and repeated, caching makes a significant difference.
- Choose the right model tier. For high-volume, lower-complexity tasks, models like GPT-4o mini, Haiku, or Gemini Flash cut costs by an order of magnitude versus flagship models.
- Use Batch APIs for async work. Most providers offer 50% discounts for batch processing. If latency is not critical, route work through the batch endpoint.
- Trim your prompts. Unnecessary context in system prompts adds up. Audit your inputs periodically and strip anything that does not affect output quality.
- Get the prompt right the first time. A vague prompt burns tokens on retries and back-and-forth. Tighten it with FixMyPrompt before you spend the call, so you pay for one good answer instead of three mediocre ones.
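The effect of the caching and batch tips above can be estimated with a small extension of the cost formula. This is a sketch under simplifying assumptions, not any provider's billing logic: `cached_fraction`, `cache_discount`, and `batch_discount` are hypothetical parameters, cache-write surcharges are ignored, and the batch discount is applied to all traffic. Whether the two discounts can be combined varies by provider, so check the pricing pages.

```python
def discounted_monthly_cost(input_tokens, output_tokens,
                            input_price_per_m, output_price_per_m,
                            requests_per_day,
                            cached_fraction=0.0,   # share of input tokens read from cache (0..1)
                            cache_discount=0.0,    # discount on cached input tokens (0..1)
                            batch_discount=0.0,    # discount on the whole request (0..1)
                            days=30):
    """Estimate monthly spend with optional caching and batch discounts."""
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Cached input tokens are billed at a reduced rate; output tokens are not cached.
    input_cost = (uncached + cached * (1 - cache_discount)) * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    per_request = (input_cost + output_cost) / 1_000_000
    return per_request * (1 - batch_discount) * requests_per_day * days


# Hypothetical workload at $3 input / $15 output per 1M tokens, 100 requests/day.
base = discounted_monthly_cost(1000, 500, 3.00, 15.00, 100)
cached = discounted_monthly_cost(1000, 500, 3.00, 15.00, 100,
                                 cached_fraction=0.8, cache_discount=0.9)
batched = discounted_monthly_cost(1000, 500, 3.00, 15.00, 100,
                                  batch_discount=0.5)
print(f"baseline ${base:.2f}  cached ${cached:.2f}  batched ${batched:.2f}")
# baseline $31.50  cached $25.02  batched $15.75
```

In this example caching saves less than batching because output tokens dominate the bill; caching pays off most when the repeated input is large relative to the output.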