Cost to Process 100,000 Tokens

Exact pricing for 100K tokens across 53 LLM APIs. Sorted cheapest first. Prices updated May 2026.

Cheapest model

Qwen3.5 Flash

$0.0060 total

Most expensive

Claude 3 Opus

$9.0000 total

Price range

1500x

difference between cheapest & most expensive

Model	Provider	Input cost	Output cost	Total
Qwen3.5 Flash	qwen	$0.0010	$0.0050	$0.0060
Llama 3.1 8B	meta	$0.0020	$0.0050	$0.0070
Qwen3 235B	qwen	$0.0060	$0.0060	$0.0120
Qwen3 8B	qwen	$0.0050	$0.0100	$0.0150
Qwen3 30B	qwen	$0.0100	$0.0150	$0.0250
Llama 4 Scout	meta	$0.0170	$0.0170	$0.0340
Gemini 2.0 Flash-Lite	google	$0.0075	$0.0300	$0.0375
Mistral Small 3.1	mistral	$0.0100	$0.0300	$0.0400
Mistral Nemo	mistral	$0.0100	$0.0300	$0.0400
GPT-4.1 Nano	openai	$0.0100	$0.0400	$0.0500
Gemini 2.5 Flash-Lite	google	$0.0100	$0.0400	$0.0500
Gemini 2.0 Flash	google	$0.0100	$0.0400	$0.0500
Gemini 3 Flash-Lite	google	$0.0120	$0.0480	$0.0600
Llama 3.3 70B	meta	$0.0230	$0.0400	$0.0630
GPT-4o Mini	openai	$0.0150	$0.0600	$0.0750
Grok 3 Mini	xai	$0.0300	$0.0500	$0.0800
Codestral	mistral	$0.0300	$0.0900	$0.1200
DeepSeek Chat	deepseek	$0.0270	$0.1100	$0.1370
Llama 4 Maverick	meta	$0.0500	$0.1100	$0.1600
GPT-4.1 Mini	openai	$0.0400	$0.1600	$0.2000
o3	openai	$0.0400	$0.1600	$0.2000
GPT-3.5 Turbo	openai	$0.0500	$0.1500	$0.2000
Mistral Large 3	mistral	$0.0500	$0.1500	$0.2000
Mistral Medium 3	mistral	$0.0400	$0.2000	$0.2400
Gemini 3 Flash	google	$0.0500	$0.2000	$0.2500
DeepSeek R1	deepseek	$0.0550	$0.2190	$0.2740
Gemini 2.5 Flash	google	$0.0300	$0.2500	$0.2800
GPT-5 Mini	openai	$0.0600	$0.2400	$0.3000
DeepSeek R2	deepseek	$0.0800	$0.3200	$0.4000
Claude 3.5 Haiku	anthropic	$0.0800	$0.4000	$0.4800
o4-mini	openai	$0.1100	$0.4400	$0.5500
Claude Haiku 4.5	anthropic	$0.1000	$0.5000	$0.6000
Gemini 1.5 Pro	google	$0.1250	$0.5000	$0.6250
Magistral Medium	mistral	$0.2000	$0.5000	$0.7000
Llama 3.1 405B	meta	$0.3500	$0.3500	$0.7000
GPT-4.1	openai	$0.2000	$0.8000	$1.0000
Gemini 2.5 Pro	google	$0.1250	$1.0000	$1.1250
GPT-4o	openai	$0.2500	$1.0000	$1.2500
Gemini 3 Pro	google	$0.3500	$1.4000	$1.7500
Claude Sonnet 4.6	anthropic	$0.3000	$1.5000	$1.8000
Claude 3.5 Sonnet	anthropic	$0.3000	$1.5000	$1.8000
Grok 3	xai	$0.3000	$1.5000	$1.8000
Claude Opus 4.8	anthropic	$0.5000	$2.5000	$3.0000
Claude Opus 4.7	anthropic	$0.5000	$2.5000	$3.0000
Claude Opus 4.6	anthropic	$0.5000	$2.5000	$3.0000
Grok 3 Fast	xai	$0.5000	$2.5000	$3.0000
GPT-5	openai	$0.8000	$3.2000	$4.0000
GPT-4 Turbo	openai	$1.0000	$3.0000	$4.0000
Gemini 3 Ultra	google	$1.0000	$3.0000	$4.0000
Claude Fable 5	anthropic	$1.0000	$5.0000	$6.0000
o1	openai	$1.5000	$6.0000	$7.5000
Claude Opus 4.5	anthropic	$1.5000	$7.5000	$9.0000
Claude 3 Opus	anthropic	$1.5000	$7.5000	$9.0000

* Total = cost of 100,000 input tokens plus cost of 100,000 output tokens.
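As a rough sketch of how each row can be derived from a provider's per-million-token prices, the snippet below scales an input and an output rate down to 100,000 tokens each and sums them. The function names are illustrative only, and the example uses the GPT-4o Mini rates quoted further down this page ($0.15/M input, $0.60/M output).

```python
from decimal import Decimal

TOKENS = 100_000  # benchmark size used in the table


def cost_for_tokens(price_per_million: str, tokens: int = TOKENS) -> Decimal:
    """USD cost of `tokens` tokens at a given per-million-token price."""
    return Decimal(price_per_million) * tokens / Decimal(1_000_000)


def table_row(input_per_million: str, output_per_million: str):
    """Return (input cost, output cost, total) for 100K input + 100K output tokens."""
    input_cost = cost_for_tokens(input_per_million)
    output_cost = cost_for_tokens(output_per_million)
    return input_cost, output_cost, input_cost + output_cost


# GPT-4o Mini rates as quoted on this page: $0.15/M input, $0.60/M output.
inp, out, total = table_row("0.15", "0.60")
print(f"${inp:.4f}  ${out:.4f}  ${total:.4f}")  # prints $0.0150  $0.0600  $0.0750
```

Prices are passed as strings and handled with `Decimal` so that small per-token amounts do not pick up binary floating-point rounding noise.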

What 100,000 Tokens Means for Your Application

100,000 tokens is a practical benchmark for understanding single-request costs. In English text, 100K tokens is approximately 75,000 words: a full-length novel, a 250-page technical manual, or a moderately sized codebase. For shorter requests (typical chatbot messages average 500–2,000 tokens), 100K tokens covers roughly 50–200 individual messages.

Per-request cost for common workloads. A typical customer service message with a 1,000-token system prompt, 500-token user message, and 300-token response totals 1,800 tokens: 1,500 input and 300 output. At GPT-4o Mini ($0.15/M input, $0.60/M output), that single interaction costs $0.000405, less than a tenth of a cent. At GPT-4o ($2.50/M input, $10/M output), the same interaction costs $0.006750. The per-request difference is small, but at 100,000 requests per day it becomes $40.50/day versus $675/day, a $634.50 daily difference from a single model choice.
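The per-request arithmetic can be sketched as follows. This is a minimal illustration, assuming the system prompt and the user message are both billed as input tokens and the response as output tokens; `request_cost` is a hypothetical helper, not part of any provider SDK.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_million: str, output_per_million: str) -> Decimal:
    """USD cost of one request, billing input and output at separate rates."""
    input_cost = Decimal(input_per_million) * input_tokens / MILLION
    output_cost = Decimal(output_per_million) * output_tokens / MILLION
    return input_cost + output_cost


# Workload from the text: 1,000-token system prompt + 500-token user message
# (both counted as input, an assumption of this sketch) and a 300-token response.
INPUT_TOKENS = 1_000 + 500
OUTPUT_TOKENS = 300
REQUESTS_PER_DAY = 100_000

# Per-million-token rates as quoted in the text.
RATES = {
    "GPT-4o Mini": ("0.15", "0.60"),
    "GPT-4o": ("2.50", "10"),
}

for model, (inp, out) in RATES.items():
    per_request = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, inp, out)
    per_day = per_request * REQUESTS_PER_DAY
    print(f"{model}: ${per_request:.6f} per request, ${per_day:.2f} per day")
```

Swapping in your own token counts and request volume shows quickly whether the gap between two models is negligible or a meaningful line item.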

Long-context use cases. Models with large context windows, such as Gemini 2.5 Pro (1M tokens), Claude Opus (200K), and GPT-4o (128K), enable processing entire documents in a single API call. Going by the input-cost column of the table above, analyzing a 100,000-token legal contract costs $0.03 in input tokens on Gemini 2.5 Flash, $0.125 on Gemini 2.5 Pro (below its long-context pricing tier), and $0.30 on Claude Sonnet. For single-document analysis, the absolute cost is modest even on premium models. The economics only become critical at scale, when processing thousands of documents per day.
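To see how document analysis costs grow with volume, the sketch below multiplies the table's input cost per 100K tokens by document size and daily document count. The volume of 5,000 contracts per day is a hypothetical figure chosen for illustration, and the calculation ignores output tokens and any tiered long-context pricing.

```python
from decimal import Decimal

# Input cost (USD) per 100K tokens, copied from the table above.
INPUT_COST_PER_100K = {
    "Gemini 2.5 Flash": Decimal("0.0300"),
    "Gemini 2.5 Pro": Decimal("0.1250"),
    "Claude Sonnet 4.6": Decimal("0.3000"),
}


def daily_input_cost(model: str, doc_tokens: int, docs_per_day: int) -> Decimal:
    """Input-token spend per day for analyzing `docs_per_day` documents."""
    per_token = INPUT_COST_PER_100K[model] / 100_000
    return per_token * doc_tokens * docs_per_day


# Hypothetical volume: 5,000 contracts per day at 100,000 tokens each.
for model in INPUT_COST_PER_100K:
    cost = daily_input_cost(model, 100_000, 5_000)
    print(f"{model}: ${cost:,.2f} per day")
```

Under these assumptions the daily input spend ranges from $150 to $1,500 across the three models, which is where the model choice starts to matter.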

Choosing between models at similar price points. Several models cluster around similar price points for 100K tokens. When models are similarly priced, the deciding factors shift to: latency (streaming speed), context window size, structured output support, function calling reliability, and your team's familiarity with the API. Price per token is the starting filter, not the final decision criterion. Use the comparison tool to evaluate models side-by-side across the dimensions that matter for your specific workload.
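Using price as the starting filter can be sketched as a simple shortlist step: pick a reference model and keep every model whose 100K-token total falls within a tolerance band around it. The 30% band and the `similarly_priced` helper are illustrative assumptions. The totals are a subset of the table above.

```python
from decimal import Decimal

# Totals (USD for 100K input + 100K output tokens) for a few rows of the table.
TOTALS = {
    "GPT-4.1 Mini": Decimal("0.2000"),
    "o3": Decimal("0.2000"),
    "GPT-3.5 Turbo": Decimal("0.2000"),
    "Mistral Large 3": Decimal("0.2000"),
    "Mistral Medium 3": Decimal("0.2400"),
    "Gemini 3 Flash": Decimal("0.2500"),
    "GPT-4.1": Decimal("1.0000"),
}


def similarly_priced(reference: str, tolerance: str = "0.30") -> list:
    """Models whose total is within `tolerance` (as a fraction) of the reference."""
    ref = TOTALS[reference]
    band = ref * Decimal(tolerance)
    return [name for name, total in TOTALS.items()
            if name != reference and abs(total - ref) <= band]


shortlist = similarly_priced("GPT-4.1 Mini")
print(shortlist)
```

The shortlist is only the first pass. Latency, context window, structured output support, and function calling reliability still have to be compared for the models that survive it.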