Cost to Process 100,000 Tokens

Exact pricing for 100K tokens across 53 LLM APIs. Sorted cheapest first. Prices updated May 2026.

Cheapest model: Qwen3.5 Flash ($0.0060 total)
Most expensive: Claude 3 Opus ($9.0000 total)
Price range: 1500x difference between cheapest and most expensive
| Model | Provider | Input cost | Output cost | Total |
|---|---|---|---|---|
| Qwen3.5 Flash | qwen | $0.0010 | $0.0050 | $0.0060 |
| Llama 3.1 8B | meta | $0.0020 | $0.0050 | $0.0070 |
| Qwen3 235B | qwen | $0.0060 | $0.0060 | $0.0120 |
| Qwen3 8B | qwen | $0.0050 | $0.0100 | $0.0150 |
| Qwen3 30B | qwen | $0.0100 | $0.0150 | $0.0250 |
| Llama 4 Scout | meta | $0.0170 | $0.0170 | $0.0340 |
| Gemini 2.0 Flash-Lite | google | $0.0075 | $0.0300 | $0.0375 |
| Mistral Small 3.1 | mistral | $0.0100 | $0.0300 | $0.0400 |
| Mistral Nemo | mistral | $0.0100 | $0.0300 | $0.0400 |
| GPT-4.1 Nano | openai | $0.0100 | $0.0400 | $0.0500 |
| Gemini 2.5 Flash-Lite | google | $0.0100 | $0.0400 | $0.0500 |
| Gemini 2.0 Flash | google | $0.0100 | $0.0400 | $0.0500 |
| Gemini 3 Flash-Lite | google | $0.0120 | $0.0480 | $0.0600 |
| Llama 3.3 70B | meta | $0.0230 | $0.0400 | $0.0630 |
| GPT-4o Mini | openai | $0.0150 | $0.0600 | $0.0750 |
| Grok 3 Mini | xai | $0.0300 | $0.0500 | $0.0800 |
| Codestral | mistral | $0.0300 | $0.0900 | $0.1200 |
| DeepSeek Chat | deepseek | $0.0270 | $0.1100 | $0.1370 |
| Llama 4 Maverick | meta | $0.0500 | $0.1100 | $0.1600 |
| GPT-4.1 Mini | openai | $0.0400 | $0.1600 | $0.2000 |
| o3 | openai | $0.0400 | $0.1600 | $0.2000 |
| GPT-3.5 Turbo | openai | $0.0500 | $0.1500 | $0.2000 |
| Mistral Large 3 | mistral | $0.0500 | $0.1500 | $0.2000 |
| Mistral Medium 3 | mistral | $0.0400 | $0.2000 | $0.2400 |
| Gemini 3 Flash | google | $0.0500 | $0.2000 | $0.2500 |
| DeepSeek R1 | deepseek | $0.0550 | $0.2190 | $0.2740 |
| Gemini 2.5 Flash | google | $0.0300 | $0.2500 | $0.2800 |
| GPT-5 Mini | openai | $0.0600 | $0.2400 | $0.3000 |
| DeepSeek R2 | deepseek | $0.0800 | $0.3200 | $0.4000 |
| Claude 3.5 Haiku | anthropic | $0.0800 | $0.4000 | $0.4800 |
| o4-mini | openai | $0.1100 | $0.4400 | $0.5500 |
| Claude Haiku 4.5 | anthropic | $0.1000 | $0.5000 | $0.6000 |
| Gemini 1.5 Pro | google | $0.1250 | $0.5000 | $0.6250 |
| Magistral Medium | mistral | $0.2000 | $0.5000 | $0.7000 |
| Llama 3.1 405B | meta | $0.3500 | $0.3500 | $0.7000 |
| GPT-4.1 | openai | $0.2000 | $0.8000 | $1.0000 |
| Gemini 2.5 Pro | google | $0.1250 | $1.0000 | $1.1250 |
| GPT-4o | openai | $0.2500 | $1.0000 | $1.2500 |
| Gemini 3 Pro | google | $0.3500 | $1.4000 | $1.7500 |
| Claude Sonnet 4.6 | anthropic | $0.3000 | $1.5000 | $1.8000 |
| Claude 3.5 Sonnet | anthropic | $0.3000 | $1.5000 | $1.8000 |
| Grok 3 | xai | $0.3000 | $1.5000 | $1.8000 |
| Claude Opus 4.8 | anthropic | $0.5000 | $2.5000 | $3.0000 |
| Claude Opus 4.7 | anthropic | $0.5000 | $2.5000 | $3.0000 |
| Claude Opus 4.6 | anthropic | $0.5000 | $2.5000 | $3.0000 |
| Grok 3 Fast | xai | $0.5000 | $2.5000 | $3.0000 |
| GPT-5 | openai | $0.8000 | $3.2000 | $4.0000 |
| GPT-4 Turbo | openai | $1.0000 | $3.0000 | $4.0000 |
| Gemini 3 Ultra | google | $1.0000 | $3.0000 | $4.0000 |
| Claude Fable 5 | anthropic | $1.0000 | $5.0000 | $6.0000 |
| o1 | openai | $1.5000 | $6.0000 | $7.5000 |
| Claude Opus 4.5 | anthropic | $1.5000 | $7.5000 | $9.0000 |
| Claude 3 Opus | anthropic | $1.5000 | $7.5000 | $9.0000 |

* Total = input cost + output cost, for 100,000 tokens each.
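The footnote's definition of Total can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, not the site's actual calculation: the function name `cost_per_100k` is hypothetical, and the example uses the GPT-4o Mini per-million rates quoted later in this article ($0.15/M input, $0.60/M output).

```python
TOKENS = 100_000  # benchmark size used throughout the table


def cost_per_100k(input_price_per_m: float, output_price_per_m: float) -> dict:
    """Return input, output, and total cost in USD for 100,000 tokens each.

    Prices are given in USD per one million tokens.
    """
    input_cost = TOKENS / 1_000_000 * input_price_per_m
    output_cost = TOKENS / 1_000_000 * output_price_per_m
    return {
        "input": round(input_cost, 4),
        "output": round(output_cost, 4),
        "total": round(input_cost + output_cost, 4),
    }


# GPT-4o Mini at $0.15/M input and $0.60/M output
row = cost_per_100k(0.15, 0.60)
print(row)
```

The result should line up with the GPT-4o Mini row in the table. Any other row can be checked the same way if you know the provider's per-million prices.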

What 100,000 Tokens Means for Your Application

100,000 tokens is a practical benchmark for understanding single-request costs. In English text, 100K tokens is roughly 75,000 words: a full-length novel, a 250-page technical manual, or a moderately sized codebase. For shorter requests, where typical chatbot messages average 500–2,000 tokens, 100K tokens covers 50–200 individual messages.
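These conversions are simple ratios. A rough sketch, assuming the 0.75 words-per-token ratio implied by the figures above (real ratios vary by tokenizer and language); both helper names are hypothetical:

```python
WORDS_PER_TOKEN = 0.75  # assumption: 100K tokens is about 75,000 English words


def tokens_to_words(tokens: int) -> int:
    """Estimate the English word count for a given token count."""
    return round(tokens * WORDS_PER_TOKEN)


def messages_in_budget(budget_tokens: int, tokens_per_message: int) -> int:
    """Count how many messages of a given size fit in a token budget."""
    return budget_tokens // tokens_per_message


print(tokens_to_words(100_000))
print(messages_in_budget(100_000, 2_000), messages_in_budget(100_000, 500))
```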

Per-request cost for common workloads. A typical customer service interaction with a 1,000-token system prompt, 500-token user message, and 300-token response totals 1,800 tokens: 1,500 billed as input and 300 as output. At GPT-4o Mini ($0.15/M input, $0.60/M output), that single interaction costs $0.000405, less than a tenth of a cent. At GPT-4o ($2.50/M input, $10/M output), the same interaction costs $0.006750. The per-request difference is small, but at 100,000 requests per day it becomes $40.50/day versus $675/day, a $634.50 daily difference from a single model choice.
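The per-request arithmetic can be sketched as follows. This is an illustrative calculation under the stated workload, not a billing implementation: the `Pricing` class and `request_cost` function are hypothetical names, and it assumes the system prompt and user message are both billed at the input rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


def request_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, billing input and output separately."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


# System prompt and user message are both billed as input.
INPUT_TOKENS = 1_000 + 500
OUTPUT_TOKENS = 300
REQUESTS_PER_DAY = 100_000

models = {
    "GPT-4o Mini": Pricing(0.15, 0.60),
    "GPT-4o": Pricing(2.50, 10.00),
}

for name, pricing in models.items():
    per_request = request_cost(pricing, INPUT_TOKENS, OUTPUT_TOKENS)
    per_day = per_request * REQUESTS_PER_DAY
    print(f"{name}: ${per_request:.6f} per request, ${per_day:,.2f} per day")
```

Keeping input and output counts separate matters because output tokens are priced several times higher than input tokens on both models.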

Long-context use cases. Models with large context windows, such as Gemini 2.5 Pro (1M tokens), Claude Opus (200K), and GPT-4o (128K), can process an entire document in a single API call. At the input prices in the table above, analyzing a 100,000-token legal contract costs $0.03 in input tokens on Gemini 2.5 Flash, $0.125 on Gemini 2.5 Pro (a 100K-token prompt stays below its long-context pricing tier), $0.25 on GPT-4o, and $0.30 on Claude Sonnet 4.6. For single-document analysis, the absolute cost is modest even on premium models. The economics only become critical at scale, when processing thousands of documents per day.
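To see how a modest per-document cost grows with volume, here is a small sketch. The daily volume of 5,000 documents is a hypothetical figure chosen only for illustration, `document_input_cost` is an assumed helper name, and the per-million input prices are derived from the table's per-100K figures for GPT-4o and Claude Sonnet 4.6. Output tokens are ignored here.

```python
def document_input_cost(doc_tokens: int, input_price_per_m: float) -> float:
    """Input-side cost in USD for sending one document to a model."""
    return doc_tokens * input_price_per_m / 1_000_000


DOC_TOKENS = 100_000
DOCS_PER_DAY = 5_000  # hypothetical volume, chosen only for illustration

# Input prices in USD per 1M tokens, matching the table's per-100K figures
INPUT_PRICES = {"GPT-4o": 2.50, "Claude Sonnet 4.6": 3.00}

for model, price in INPUT_PRICES.items():
    per_doc = document_input_cost(DOC_TOKENS, price)
    per_day = per_doc * DOCS_PER_DAY
    print(f"{model}: ${per_doc:.3f} per document, ${per_day:,.2f} per day")
```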

Choosing between models at similar price points. Several models cluster around the same price for 100K tokens: in the table above, GPT-4.1 Mini, o3, GPT-3.5 Turbo, and Mistral Large 3 all total $0.20, while Claude Sonnet 4.6, Claude 3.5 Sonnet, and Grok 3 all total $1.80. When models are similarly priced, the deciding factors shift to latency (streaming speed), context window size, structured output support, function calling reliability, and your team's familiarity with the API. Price per token is the starting filter, not the final decision criterion. Use the comparison tool to evaluate models side by side across the dimensions that matter for your specific workload.
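Using price as a first-pass filter can be sketched as below. The totals are copied from a handful of table rows; the `price_peers` function and the 30% tolerance are assumptions made for illustration, not part of any comparison tool. The non-price factors (latency, context window, and so on) would then be evaluated on the shortlist by hand.

```python
# Total cost in USD for 100K input + 100K output tokens, from the table
TOTALS = {
    "GPT-4.1 Mini": 0.20,
    "o3": 0.20,
    "GPT-3.5 Turbo": 0.20,
    "Mistral Large 3": 0.20,
    "Mistral Medium 3": 0.24,
    "Gemini 3 Flash": 0.25,
    "DeepSeek R1": 0.274,
    "GPT-4.1": 1.00,
}


def price_peers(totals: dict, reference: str, tolerance: float = 0.30) -> list:
    """Return models whose total is within `tolerance` (as a fraction)
    of the reference model's total, sorted by price and then by name."""
    ref = totals[reference]
    peers = [
        (total, name)
        for name, total in totals.items()
        if name != reference and abs(total - ref) <= tolerance * ref
    ]
    return [name for _, name in sorted(peers)]


shortlist = price_peers(TOTALS, "GPT-4.1 Mini")
print(shortlist)
```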