LLM API Price History (2023–2026)

LLM API costs have fallen dramatically since 2023 — frontier models that cost $30/M input tokens now cost $2–3/M for equivalent capability. This page tracks every major price change across OpenAI, Anthropic, Google, and DeepSeek.

GPT-4 → GPT-4o

−92%

$30/M → $2.50/M

Mar 2023 → Aug 2024

Claude Opus

−67%

$15/M → $5/M

Mar 2024 → Apr 2026

Gemini Flash

−71%

$0.35/M → $0.10/M

May 2024 → 2026

DeepSeek vs GPT-4o

−89%

$2.50/M → $0.27/M

DeepSeek V3 Jan 2025

Timeline of Major Price Events

Every significant model launch or price change that shifted the market.

Mar 2023

GPT-4 launches at $30/M input

OpenAI's GPT-4 sets the benchmark for frontier AI — at a price that makes large-scale deployment prohibitive for most teams.

Nov 2023

GPT-4 Turbo cuts price by 67%−67%

OpenAI launches GPT-4 Turbo at $10/M input — the first major price cut signals that frontier AI costs will fall fast.

Mar 2024

Anthropic launches Claude 3 family

Claude 3 Opus ($15/M), Sonnet ($3/M), and Haiku ($0.25/M) establish the three-tier pricing model still used today.

May 2024

GPT-4o: multimodal + 50% cheaper−50%

GPT-4o replaces GPT-4 Turbo with audio/vision capabilities and a 50% price cut to $5/M input. Multimodal AI becomes mainstream.

May 2024

Gemini 1.5 Flash: 1M context at $0.35/M

Google's Flash tier debuts with a 1-million-token context window at an aggressive price point, forcing competitors to respond.

Jul 2024

GPT-4o Mini: frontier quality, $0.15/M

GPT-4o Mini delivers near-GPT-4o quality at GPT-3.5 prices. Small model pricing enters the sub-$0.20 territory.

Aug 2024

OpenAI cuts GPT-4o price by 50%−50%

GPT-4o drops from $5/M to $2.50/M input. The flagship model is now 8× cheaper than GPT-4 was just 17 months earlier.

Oct 2024

Google cuts Gemini Flash 79%−79%

Gemini 1.5 Flash drops to $0.075/M — the most aggressive single price cut in LLM history. Flash reaches near-zero territory.

Jan 2025

DeepSeek V3 + R1 shock the market

DeepSeek releases V3 (GPT-4o quality at $0.27/M) and R1 (o1-class reasoning at $0.55/M) as open-weight models. The industry is forced to re-examine pricing assumptions.

Mar 2025

Gemini 2.5 Pro adds thinking at $1.25/M

Google's Gemini 2.5 Pro matches o1-class reasoning with a 1M context window, competitively priced against OpenAI's reasoning models.

Apr 2026

Claude Opus 4.7 drops 67% to $5/M−67%

Anthropic cuts Opus pricing from $15/M to $5/M while adding a 1M context window — the biggest single Anthropic price cut.

Jun 2026

Claude Fable 5 launches at $10/M

Anthropic's most capable model debuts with adaptive thinking always enabled, 1M context, 128k output. Priced between Sonnet and the retired Opus 4.5.

Price Tables by Model Family

OpenAI GPT-4 tier

−73% input since launch

$30

$10

$2.5

Date	Model	Input /1M	Output /1M	Notes
Mar 2023	GPT-4	$30	$60	GPT-4 launch — $30/M input, $60/M output
Nov 2023	GPT-4 Turbo	$10−67%	$30	GPT-4 Turbo launch — 67% cheaper than GPT-4
May 2024	GPT-4o	$5−50%	$15	GPT-4o launch — multimodal, faster, 50% cheaper than Turbo
Aug 2024	GPT-4o (cut)	$2.5−50%	$10	OpenAI cuts GPT-4o price by 50% — now $2.50/M input
Apr 2026	GPT-4.1	$2−20%	$8	GPT-4.1 with 1M context — 20% cheaper than GPT-4o
May 2026	GPT-5CURRENT	$8+300%	$32	GPT-5 launch — frontier capability step-change, premium pricing

OpenAI GPT mini tier

$0.5

$0.15

$0.4

$0.6

Date	Model	Input /1M	Output /1M	Notes
Mar 2023	GPT-3.5 Turbo	$0.5	$1.5	GPT-3.5 Turbo — the original affordable model
Jul 2024	GPT-4o Mini	$0.15−70%	$0.6	GPT-4o Mini launch — GPT-4 quality at GPT-3.5 price
Apr 2026	GPT-4.1 Mini	$0.4+167%	$1.6	GPT-4.1 Mini — 1M context at affordable price
May 2026	GPT-5 MiniCURRENT	$0.6+50%	$2.4	GPT-5 Mini — frontier intelligence for everyday tasks

Anthropic Claude Opus tier

−33% input since launch

$15

$10

Date	Model	Input /1M	Output /1M	Notes
Mar 2024	Claude 3 Opus	$15	$75	Claude 3 Opus launch — Anthropic's first frontier-class model
Feb 2025	Claude Opus 4.5	$15	$75	Claude Opus 4.5 — same price tier, extended thinking added
Apr 2026	Claude Opus 4.7	$5−67%	$25	Claude Opus 4.7 — 67% price cut, 1M context, agentic coding
Jun 2026	Claude Opus 4.8	$5	$25	Claude Opus 4.8 — improved reasoning, same price as 4.7
Jun 2026	Claude Fable 5CURRENT	$10+100%	$50	Claude Fable 5 — most capable widely released model, adaptive thinking

Anthropic Claude Sonnet tier

Date	Model	Input /1M	Output /1M	Notes
Mar 2024	Claude 3 Sonnet	$3	$15	Claude 3 Sonnet launch
Jun 2024	Claude 3.5 Sonnet	$3	$15	Claude 3.5 Sonnet — major capability jump, same price
Nov 2025	Claude Sonnet 4.5	$3	$15	Claude Sonnet 4.5 — improved tool use and coding
Apr 2026	Claude Sonnet 4.6CURRENT	$3	$15	Claude Sonnet 4.6 — 1M context, stable $3/$15 price point

Google Gemini Pro tier

−64% input since launch

$3.5

$1.25

$2.5

Date	Model	Input /1M	Output /1M	Notes
Feb 2024	Gemini 1.0 Ultra	$7	$21	Gemini 1.0 Ultra — Google's first frontier model
May 2024	Gemini 1.5 Pro	$3.5−50%	$10.5	Gemini 1.5 Pro — 1M context window debut
Oct 2024	Gemini 1.5 Pro (cut)	$1.25−64%	$5	Google cuts Gemini 1.5 Pro price by 64%
Mar 2025	Gemini 2.5 Pro	$1.25	$10	Gemini 2.5 Pro launch — thinking model, 1M context
May 2026	Gemini 3 ProCURRENT	$2.5+100%	$10	Gemini 3 Pro — next-gen multimodal reasoning

Google Gemini Flash tier

−57% input since launch

$0.35

$0.075

$0.1

$0.15

Date	Model	Input /1M	Output /1M	Notes
May 2024	Gemini 1.5 Flash	$0.35	$1.05	Gemini Flash debut — fast, cheap, 1M context
Oct 2024	Gemini Flash (cut)	$0.075−79%	$0.3	Google cuts Flash price 79% — most aggressive price cut in LLM history
Mar 2025	Gemini 2.5 Flash	$0.1+33%	$0.4	Gemini 2.5 Flash — thinking capability at ultra-low price
May 2026	Gemini 3 FlashCURRENT	$0.15+50%	$0.6	Gemini 3 Flash — next-gen at competitive price

DeepSeek

$0.14

$0.27

$0.55

$0.4

Date	Model	Input /1M	Output /1M	Notes
May 2024	DeepSeek V2	$0.14	$0.28	DeepSeek V2 MoE — shook the market with ultra-low pricing
Jan 2025	DeepSeek V3	$0.27+93%	$1.1	DeepSeek V3 — GPT-4o-class quality at <$0.30/M input, shook the market
Jan 2025	DeepSeek R1	$0.55+104%	$2.19	DeepSeek R1 — o1-class reasoning at 20× lower price
May 2026	DeepSeek R2CURRENT	$0.4−27%	$1.6	DeepSeek R2 — improved reasoning, competitive with o3

What LLM Price History Tells Us

The cost of frontier AI has fallen faster than almost any technology in history. GPT-4 launched in March 2023 at $30 per million input tokens. By August 2024 — just 17 months later — GPT-4o offered comparable or better capability at $2.50/M, a 91% price reduction. Gemini Flash saw a 79% price cut in a single announcement in October 2024. These are not incremental improvements — they are structural collapses in the cost of AI.

The DeepSeek inflection point. January 2025 marked a turning point when DeepSeek released V3 and R1 as open-weight models. V3 matched GPT-4o on key benchmarks at $0.27/M input — roughly 9× cheaper. R1 matched OpenAI o1 on reasoning benchmarks at $0.55/M — compared to o1's $15/M. DeepSeek demonstrated that the cost premium for frontier AI was partly structural (massive infrastructure investment) and partly artificial (lack of competition). Western providers have been forced to respond with price cuts and open-weight releases of their own.

The three-tier pricing model is stabilizing. Across all major providers, a consistent pricing hierarchy has emerged: a cheap fast tier ($0.10–1.00/M input), a balanced mid tier ($2–5/M), and a premium frontier tier ($8–15/M). Anthropic's Haiku/Sonnet/Opus, OpenAI's Mini/standard/o-series, and Google's Flash/Pro/Ultra all follow this pattern. Competition within each tier is intense, but providers are differentiating on features (context window, caching, reasoning) rather than purely on price.

What to expect in 2026 and beyond. Historical trends suggest prices will continue falling 50–80% every 12–18 months, driven by hardware improvements (faster GPUs, custom TPUs), model efficiency gains (smaller models matching larger ones), and competitive pressure. Applications that seem economically borderline today — processing millions of documents, running AI assistants for entire organizations, real-time AI in consumer apps — will become viable within the next 1–2 years purely from price reductions, even without any other optimization.

Cheapest LLM API today →Best value LLM →Compare any two models →Read our guides →