DeepSeek API Pricing

Updated May 2026 · 3 models

DeepSeek offers frontier-quality models at a fraction of the cost of GPT-4o and Claude — making it one of the best choices for cost-sensitive production workloads.

DeepSeek R2 is ~3x cheaper than GPT-4o on input tokens

$0.8/1M input vs $2.5/1M for GPT-4o

DeepSeek R1 is ~4.5x cheaper than GPT-4o on input tokens

$0.55/1M input vs $2.5/1M for GPT-4o

DeepSeek Chat is ~9x cheaper than GPT-4o on input tokens

$0.27/1M input vs $2.5/1M for GPT-4o

DeepSeek R2 · CACHED

DeepSeek's second-generation reasoning model — stronger than R1 across all benchmarks at similar cost

Input $0.80 /1M · Cached input $0.20 /1M · Output $3.20 /1M

DeepSeek R1 · CACHED

Open-source reasoning model matching o1-level performance at a fraction of the cost. Ideal for math, coding, logic.

Input $0.55 /1M · Cached input $0.14 /1M · Output $2.19 /1M

DeepSeek Chat · CACHED

Cost-efficient chat model with strong multilingual performance. Best price-to-quality for Asian languages.

Input $0.27 /1M · Cached input $0.07 /1M · Output $1.10 /1M
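
As a rough illustration of how the per-million rates above translate into a per-request cost, the sketch below hard-codes the prices listed on this page. The `PRICES` table, its model keys, and the `request_cost` helper are hypothetical names chosen for this example, not part of any DeepSeek SDK, and the rates themselves may change.

```python
# USD per 1M tokens, copied from the listing on this page (subject to change).
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "deepseek-r2":   {"input": 0.80, "cached": 0.20, "output": 3.20},
    "deepseek-r1":   {"input": 0.55, "cached": 0.14, "output": 2.19},
    "deepseek-chat": {"input": 0.27, "cached": 0.07, "output": 1.10},
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    """
    if cached_tokens < 0 or cached_tokens > input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    rates = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rates["input"]
        + cached_tokens * rates["cached"]
        + output_tokens * rates["output"]
    )
    return total / 1_000_000
```

Under these assumptions, a DeepSeek Chat request with 10,000 input tokens and 1,000 output tokens comes to roughly $0.0038 (`request_cost("deepseek-chat", 10_000, 1_000)`).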

Understanding DeepSeek API Pricing

DeepSeek disrupted the LLM market in early 2025 by releasing open-weight models that matched frontier performance at a fraction of the cost. DeepSeek R1, their reasoning model, achieved benchmark results competitive with OpenAI o1 while costing approximately 20–30× less at standard API rates. DeepSeek V3 (marketed as “DeepSeek Chat” via the API) offers GPT-4o-tier instruction following and coding at $0.27/M input tokens — roughly 9× cheaper than GPT-4o.

DeepSeek R2 vs R1 vs Chat. DeepSeek R2 is their latest reasoning model, offering improved performance over R1 at a somewhat higher price ($0.80 vs $0.55 per 1M input tokens). Both R1 and R2 use chain-of-thought reasoning and are best suited for mathematical problem-solving, complex coding tasks, and multi-step logical reasoning, analogous to OpenAI's o-series models. DeepSeek Chat (V3) is optimized for instruction following, content generation, and general-purpose tasks, and is more comparable to GPT-4o than to o3.

Context caching on DeepSeek. DeepSeek supports disk-cached inputs at a significant discount: at the rates listed on this page, cached input tokens cost roughly a quarter of the standard input rate (for example, $0.07 vs $0.27 per 1M tokens for DeepSeek Chat). This makes DeepSeek especially cost-effective for RAG pipelines and agent systems with large, repeated context. At $0.07/M for cached DeepSeek Chat input (estimate), the effective cost for cache-heavy workloads is among the lowest available from any production API.
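
The effect of caching on input cost can be sketched as a weighted average of the cached and standard rates. The `blended_input_price` helper and the 80% cache hit rate below are assumptions for illustration only; real hit rates depend on how much of each prompt repeats.

```python
def blended_input_price(standard_rate, cached_rate, hit_rate):
    """Effective USD price per 1M input tokens for a given cache hit rate.

    hit_rate is the fraction of input tokens billed at the cached rate (0.0 to 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return hit_rate * cached_rate + (1.0 - hit_rate) * standard_rate


# DeepSeek Chat rates from this page, with an assumed 80% cache hit rate.
effective = blended_input_price(standard_rate=0.27, cached_rate=0.07, hit_rate=0.8)
print(f"Effective input price: ${effective:.2f} per 1M tokens")
```

With these inputs the blended rate works out to about $0.11 per 1M input tokens, compared with $0.27 when nothing is cached.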

Risks and considerations. DeepSeek is a Chinese company, and its API routes data through servers in China. For applications that handle sensitive data or are subject to compliance frameworks (GDPR, HIPAA, SOC 2) or data residency requirements, using DeepSeek's API directly may create compliance challenges. The models are open-weight, however, so they can be self-hosted on your own infrastructure or accessed through third-party inference providers such as Together AI, Fireworks, or Groq. Either route avoids the data routing concern while largely preserving the cost advantage, although each provider sets its own rates.

When DeepSeek is the right choice. DeepSeek is well suited to non-sensitive, high-volume workloads where cost is a primary constraint and its output quality is sufficient: content generation, coding assistance, data extraction, classification, and summarization. For teams that need frontier-quality reasoning at the lowest possible price point, DeepSeek R1/R2 via a US-based inference provider (Together AI, Fireworks) offers one of the strongest cost-performance ratios available in 2026.