Free tool

AI API Cost Calculator

Estimate your monthly spend on Claude, GPT-4o, Gemini, and other LLM APIs. Enter your token volumes and request rate; the total updates as you type.

1. Choose a model

Anthropic

OpenAI

Google

Claude Sonnet 4.6: $3/M input, $15/M output

2. Usage volume

Avg input tokens

~750 words = 1,000 tokens

Avg output tokens

~375 words = 500 tokens

Requests per day

Across all users or automated calls

3. Optional add-ons

Hosting / infra (monthly)

Server, database, CDN

Tool subscriptions (monthly)

Other SaaS tools in your stack

Estimated cost

$51.50

per month

Breakdown

LLM API: $31.50

Hosting / infra: $20.00

Tool subscriptions: $0.00

Annualized

$618.00

per year

API only: $1.05 / day

100 req/day x 30 days = 3,000 req/mo

Prices verified May 2026. LLM API pricing changes frequently. Confirm current rates before budgeting: Anthropic, OpenAI, Google.

Estimates are illustrative. Your actual cost depends on caching, batching, and usage patterns.

How the estimate is calculated

The LLM API cost formula is straightforward. Multiply your average input tokens per request by the model's input price, add your average output tokens multiplied by the output price, and divide by one million, because prices are quoted per million tokens. That gives the cost of a single request. Multiply it by your daily request volume and by 30 to get the monthly figure.

monthly API cost =
  ((inputTokens * inputPricePerM + outputTokens * outputPricePerM) / 1,000,000)
  * requestsPerDay * 30

Add your monthly hosting and SaaS subscription costs to get a full picture of what your AI stack costs to run.
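As a minimal sketch, the formula above can be written in Python as follows. The function and parameter names are illustrative and are not taken from the calculator's own source; the 30-day month is the same simplification the formula uses.

```python
def monthly_api_cost(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m,
                     requests_per_day, days_per_month=30):
    """Estimated monthly LLM API cost in dollars.

    Token counts are per-request averages; prices are dollars per 1M tokens.
    """
    cost_per_request = (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000
    return cost_per_request * requests_per_day * days_per_month


def monthly_total(api_cost, hosting=0.0, subscriptions=0.0):
    """API cost plus the optional monthly add-ons."""
    return api_cost + hosting + subscriptions


# Calculator defaults: Claude Sonnet 4.6 at $3/M input and $15/M output,
# 1,000 input tokens, 500 output tokens, 100 requests per day, $20 hosting.
api = monthly_api_cost(1000, 500, 3.0, 15.0, 100)
total = monthly_total(api, hosting=20.0)
print(f"API: ${api:.2f}/mo  total: ${total:.2f}/mo  annualized: ${total * 12:.2f}")
```

With these defaults the sketch should reproduce the figures shown in the calculator: $31.50 for the API, $51.50 per month in total, and $618.00 annualized.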

Current model prices

Model	Provider	Input ($/1M tokens)	Output ($/1M tokens)
Claude Opus 4.7	Anthropic	$15	$75
Claude Sonnet 4.6	Anthropic	$3	$15
Claude Haiku 4.5	Anthropic	$1	$5
GPT-5	OpenAI	$1.25	$10
GPT-5 mini	OpenAI	$0.25	$2
GPT-5 nano	OpenAI	$0.05	$0.40
GPT-4o	OpenAI	$2.50	$10
GPT-4o mini	OpenAI	$0.15	$0.60
Gemini 2.5 Pro	Google	$1.25	$10
Gemini 2.5 Flash	Google	$0.30	$2.50

Prices verified May 2026. LLM pricing changes frequently; always confirm at the provider pricing page before committing to a budget. Standard (non-cached, non-batch) rates shown.
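To compare models for a single workload, the table can be loaded into a small lookup and sorted by estimated cost. The sketch below is illustrative only: the prices are copied from the table above (standard rates, as listed), and the names `PRICES_PER_M`, `cost_for_model`, and `rank_models` are hypothetical helpers, not part of any provider SDK.

```python
# (input $/1M tokens, output $/1M tokens), copied from the table above.
PRICES_PER_M = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (1.25, 10.00),
    "GPT-5 mini": (0.25, 2.00),
    "GPT-5 nano": (0.05, 0.40),
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def cost_for_model(model, input_tokens, output_tokens, requests_per_day, days=30):
    """Monthly API cost in dollars for one model from the lookup."""
    input_price, output_price = PRICES_PER_M[model]
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_day * days


def rank_models(input_tokens, output_tokens, requests_per_day):
    """Return (model, monthly cost) pairs, cheapest first."""
    costs = [
        (model, cost_for_model(model, input_tokens, output_tokens, requests_per_day))
        for model in PRICES_PER_M
    ]
    return sorted(costs, key=lambda pair: pair[1])


# Same workload as the calculator defaults: 1,000 in, 500 out, 100 requests/day.
for model, cost in rank_models(1000, 500, 100):
    print(f"{model:<18} ${cost:>8.2f}/mo")
```

A ranking like this only compares price. It says nothing about whether a cheaper model is good enough for the task, which has to be checked separately.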

Tips for keeping API costs low

Use prompt caching. Anthropic and OpenAI both offer cached input at roughly 50-90% off the standard input rate. If your system prompts are large and repeated across requests, caching can cut your input costs substantially.
Choose the right model tier. For high-volume, lower-complexity tasks, models like GPT-4o mini, Haiku, or Gemini Flash cost roughly 4x to 17x less than the flagship models listed above.
Use Batch APIs for async work. Most providers offer 50% discounts for batch processing. If latency is not critical, route work through the batch endpoint.
Trim your prompts. Unnecessary context in system prompts adds up. Audit your inputs periodically and strip anything that does not affect output quality.
Get the prompt right the first time. A vague prompt burns tokens on retries and back-and-forth. Tighten it with FixMyPrompt before you spend the call, so you pay for one good answer instead of three mediocre ones.
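The effect of the caching and batch tips can be roughed out by extending the cost formula with discount factors. This is a sketch under stated assumptions, not any provider's billing logic: `cached_fraction`, `cache_discount`, and `batch_discount` are hypothetical parameters, the cache discount is applied only to the cached share of input tokens, the batch discount is applied to the whole request, and cache-write surcharges and any limits on combining the two discounts are ignored.

```python
def discounted_monthly_cost(input_tokens, output_tokens,
                            input_price_per_m, output_price_per_m,
                            requests_per_day,
                            cached_fraction=0.0,   # share of input tokens served from cache (assumed)
                            cache_discount=0.0,    # e.g. 0.9 for 90% off cached input (assumed)
                            batch_discount=0.0,    # e.g. 0.5 for 50% off via a batch endpoint (assumed)
                            days_per_month=30):
    """Rough monthly cost in dollars after caching and batch discounts."""
    for name, value in (("cached_fraction", cached_fraction),
                        ("cache_discount", cache_discount),
                        ("batch_discount", batch_discount)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    cached_tokens = input_tokens * cached_fraction
    uncached_tokens = input_tokens - cached_tokens
    # Cached tokens are billed at the reduced rate, the rest at the full rate.
    billable_input = uncached_tokens + cached_tokens * (1.0 - cache_discount)

    per_request = (
        billable_input * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000
    per_request *= 1.0 - batch_discount
    return per_request * requests_per_day * days_per_month


# Calculator defaults at Claude Sonnet 4.6 rates ($3/M input, $15/M output).
baseline = discounted_monthly_cost(1000, 500, 3.0, 15.0, 100)
cached = discounted_monthly_cost(1000, 500, 3.0, 15.0, 100,
                                 cached_fraction=0.8, cache_discount=0.9)
batched = discounted_monthly_cost(1000, 500, 3.0, 15.0, 100, batch_discount=0.5)
print(f"baseline ${baseline:.2f}  cached ${cached:.2f}  batched ${batched:.2f}")
```

In this example output tokens dominate the bill, so caching 80% of the input saves less than the headline discount suggests. How much caching helps depends on the ratio of input to output tokens in your workload.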