LLM API Price History (2023–2026)
LLM API costs have fallen dramatically since 2023 — frontier models that cost $30/M input tokens now cost $2–3/M for equivalent capability. This page tracks every major price change across OpenAI, Anthropic, Google, and DeepSeek.
Timeline of Major Price Events
Every significant model launch or price change that shifted the market.
OpenAI's GPT-4 sets the benchmark for frontier AI — at a price that makes large-scale deployment prohibitive for most teams.
OpenAI launches GPT-4 Turbo at $10/M input — the first major price cut signals that frontier AI costs will fall fast.
Claude 3 Opus ($15/M), Sonnet ($3/M), and Haiku ($0.25/M) establish the three-tier pricing model still used today.
GPT-4o replaces GPT-4 Turbo with audio/vision capabilities and a 50% price cut to $5/M input. Multimodal AI becomes mainstream.
Google's Flash tier debuts with a 1-million-token context window at an aggressive price point, forcing competitors to respond.
GPT-4o Mini delivers near-GPT-4o quality at GPT-3.5 prices. Small model pricing enters the sub-$0.20 territory.
GPT-4o drops from $5/M to $2.50/M input. On input pricing, the flagship model is now 12× cheaper than GPT-4 was just 17 months earlier ($30/M vs. $2.50/M).
Gemini 1.5 Flash drops to $0.075/M input, a 79% cut in one announcement and the most aggressive single price cut on this timeline. The Flash tier now costs well under $0.10/M.
DeepSeek releases V3 (GPT-4o quality at $0.27/M) and R1 (o1-class reasoning at $0.55/M) as open-weight models. The industry is forced to re-examine pricing assumptions.
Google's Gemini 2.5 Pro matches o1-class reasoning with a 1M context window, competitively priced against OpenAI's reasoning models.
Anthropic cuts Opus pricing from $15/M to $5/M while adding a 1M context window — the biggest single Anthropic price cut.
Anthropic's most capable model debuts with adaptive thinking always enabled, 1M context, 128k output. Priced between Sonnet and the retired Opus 4.5.
Price Tables by Model Family
What LLM Price History Tells Us
The cost of frontier AI has fallen faster than almost any technology in history. GPT-4 launched in March 2023 at $30 per million input tokens. By August 2024, just 17 months later, GPT-4o offered comparable or better capability at $2.50/M, a reduction of roughly 92% (12× cheaper). Gemini Flash saw a 79% price cut in a single announcement in October 2024. These are not incremental improvements; they are structural collapses in the cost of AI.
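The arithmetic behind these comparisons is simple enough to check by hand, but a small helper makes it repeatable for other model pairs. The sketch below is illustrative only: the function name `price_drop` is made up for this example, and the prices are the GPT-4 and GPT-4o input prices quoted above.

```python
def price_drop(old: float, new: float) -> tuple[float, float]:
    """Return (percent reduction, fold decrease) between two prices.

    Both prices are in dollars per million input tokens.
    """
    if old <= 0 or new <= 0:
        raise ValueError("prices must be positive")
    percent = (old - new) / old * 100
    fold = old / new
    return percent, fold


# GPT-4 at launch ($30/M) vs. GPT-4o after its price cut ($2.50/M)
percent, fold = price_drop(30.00, 2.50)
print(f"{percent:.1f}% cheaper, {fold:.0f}x")  # 91.7% cheaper, 12x
```

The same call works for any pair of prices on this page, as long as both are quoted in the same unit (input vs. output, per million tokens).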
The DeepSeek inflection point. The turn of 2025 marked a turning point: DeepSeek released V3 in late December 2024 and R1 in January 2025, both as open-weight models. V3 matched GPT-4o on key benchmarks at $0.27/M input, roughly 9× cheaper than GPT-4o's $2.50/M. R1 matched OpenAI o1 on reasoning benchmarks at $0.55/M, compared to o1's $15/M, roughly 27× cheaper. DeepSeek's pricing suggested that the cost premium for frontier AI was partly structural (massive infrastructure investment) and partly artificial (lack of competition). Western providers have been forced to respond with price cuts and open-weight releases of their own.
The three-tier pricing model is stabilizing. Across all major providers, a consistent pricing hierarchy has emerged: a cheap fast tier ($0.10–1.00/M input), a balanced mid tier ($2–5/M), and a premium frontier tier ($8–15/M). Anthropic's Haiku/Sonnet/Opus, OpenAI's Mini/standard/o-series, and Google's Flash/Pro/Ultra all follow this pattern. Competition within each tier is intense, but providers are differentiating on features (context window, caching, reasoning) rather than purely on price.
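To make the tier boundaries concrete, here is a minimal sketch that maps an input price to one of the three bands described above. The band edges are taken directly from this paragraph; the names `TIERS` and `classify_tier` are hypothetical. Note that the bands leave gaps ($1–2/M and $5–8/M) and do not cover prices below $0.10/M, so some real prices fall outside all three.

```python
# (tier name, low, high) in dollars per million input tokens,
# using the bands quoted in the text above.
TIERS = [
    ("fast", 0.10, 1.00),
    ("mid", 2.00, 5.00),
    ("frontier", 8.00, 15.00),
]


def classify_tier(price_per_m: float) -> str:
    """Return the tier whose band contains the price, if any."""
    for name, low, high in TIERS:
        if low <= price_per_m <= high:
            return name
    return "outside the listed bands"


print(classify_tier(0.25))   # Claude 3 Haiku launch price
print(classify_tier(3.00))   # Claude 3 Sonnet launch price
print(classify_tier(15.00))  # Claude 3 Opus launch price
print(classify_tier(0.075))  # Gemini 1.5 Flash after its cut
```

The last call lands outside every band, which shows how quickly the cheap tier has moved below the ranges quoted here.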
What to expect in 2026 and beyond. Historical trends suggest prices will continue falling 50–80% every 12–18 months, driven by hardware improvements (faster GPUs, custom TPUs), model efficiency gains (smaller models matching larger ones), and competitive pressure. Applications that seem economically borderline today — processing millions of documents, running AI assistants for entire organizations, real-time AI in consumer apps — will become viable within the next 1–2 years purely from price reductions, even without any other optimization.
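As a rough illustration of what that trend implies, the sketch below compounds a fixed percentage cut over a fixed period. It assumes a smooth exponential decline, which real prices do not follow (they drop in steps at launches), and the function name `projected_price` and the $2.50/M starting point are choices made for this example, not a forecast.

```python
def projected_price(current: float, months: float,
                    cut: float, period_months: float) -> float:
    """Price after `months` if it falls by `cut` (0 to 1) every `period_months`."""
    if not 0 <= cut < 1:
        raise ValueError("cut must be in [0, 1)")
    if period_months <= 0:
        raise ValueError("period_months must be positive")
    return current * (1 - cut) ** (months / period_months)


# Start from $2.50/M input and look 24 months ahead.
slow = projected_price(2.50, 24, cut=0.50, period_months=18)  # 50% every 18 months
fast = projected_price(2.50, 24, cut=0.80, period_months=12)  # 80% every 12 months
print(f"${fast:.2f} to ${slow:.2f} per million input tokens")  # $0.10 to $0.99
```

Under these assumptions, a model priced at $2.50/M today would cost somewhere between about $0.10/M and $1.00/M in two years, which is the spread behind the claim that borderline applications become viable on price alone.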