
Gemini 2.0 Flash

Google

Previous-generation workhorse: a fast multimodal model with excellent price-to-performance

Input price: $0.10 / 1M tokens
Output price: $0.40 / 1M tokens
Context window: 1M tokens
Last updated: 2026-04-22

Quick calculator

Per request: $0.000300
Daily: $3.00
Monthly: $90.00 (30-day estimate)
Yearly: $1,095.00
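
The calculator figures above follow from simple per-token arithmetic. The sketch below is one way to reproduce them, not the page's actual implementation. The function name `estimate_costs` is hypothetical, and the inputs (1,000 input tokens, 500 output tokens, 10,000 requests per day) are assumed values that happen to yield the same totals; the page does not state which inputs it used.

```python
# Prices taken from this page (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_costs(input_tokens, output_tokens, requests_per_day):
    """Return per-request, daily, monthly (30-day) and yearly (365-day) cost in USD."""
    per_request = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    daily = per_request * requests_per_day
    return {
        "per_request": per_request,
        "daily": daily,
        "monthly": daily * 30,
        "yearly": daily * 365,
    }


# Assumed inputs, chosen for illustration only.
costs = estimate_costs(input_tokens=1_000, output_tokens=500, requests_per_day=10_000)
for period, usd in costs.items():
    print(f"{period}: ${usd:,.6f}")
```

Note that the monthly figure uses a 30-day month while the yearly figure uses 365 days, which is why yearly is not exactly 12 times monthly.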

Tips to reduce cost

  • Use prompt caching to reuse repeated system prompts
  • Trim whitespace and reduce verbose instructions
  • Use a smaller model for classification or routing tasks
  • Batch asynchronous requests for a 50% discount where the provider offers a batch API (e.g. OpenAI, Anthropic)
  • Cache identical requests at the application layer

About Gemini 2.0 Flash

Gemini 2.0 Flash is a budget large language model from Google, priced at $0.10/1M input tokens and $0.40/1M output tokens. It is 96% cheaper than the market average and best suited to fast multimodal tasks. Its 1M-token context window makes it suitable for very long documents, large codebases, and book-length inputs.

For most production workloads, the cost breakdown is dominated by input tokens (system prompts, context, retrieved documents) rather than output. At this price point, Gemini 2.0 Flash is one of the most cost-effective options for high-volume tasks.
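
Whether input tokens dominate depends on the input-to-output ratio. Because output tokens cost four times as much as input tokens at these prices, input only becomes the larger share once a request carries more than four input tokens per output token. The sketch below illustrates this; `input_share` is a hypothetical helper, and the 8,000/300 token split is an assumed RAG-style request, not a measured workload.

```python
def input_share(input_tokens, output_tokens, input_price=0.10, output_price=0.40):
    """Fraction of a request's cost attributable to input tokens.

    Prices are USD per 1M tokens (defaults taken from this page).
    """
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return input_cost / (input_cost + output_cost)


# Assumed RAG-style request: large retrieved context, short answer.
rag_share = input_share(input_tokens=8_000, output_tokens=300)
print(f"Input share of cost: {rag_share:.0%}")

# Break-even point: 4 input tokens per output token gives an even split.
even_share = input_share(input_tokens=4_000, output_tokens=1_000)
print(f"Share at a 4:1 ratio: {even_share:.0%}")
```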

Gemini 2.0 Flash supports prompt caching at $0.025/1M tokens, a 75% discount on repeated input tokens. For applications with a fixed system prompt or repeated document context (RAG, chatbots, agents), enabling caching is usually the highest-leverage cost optimization available.
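
To get a feel for what the cached rate is worth, the sketch below blends the two input prices by the fraction of tokens served from cache. The helper `blended_input_cost` and the workload (100M input tokens per month, 80% of them cache hits) are assumptions for illustration. The sketch also ignores any cache storage fee or minimum cacheable size that a provider may apply.

```python
INPUT_PRICE_PER_M = 0.10    # USD per 1M uncached input tokens (from this page)
CACHED_PRICE_PER_M = 0.025  # USD per 1M cached input tokens (from this page)


def blended_input_cost(total_input_tokens, cached_fraction):
    """Input cost in USD when `cached_fraction` of the tokens hit the cache."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = total_input_tokens * cached_fraction
    fresh = total_input_tokens - cached
    return (fresh * INPUT_PRICE_PER_M + cached * CACHED_PRICE_PER_M) / 1_000_000


# Discount implied by the two listed prices.
discount = 1 - CACHED_PRICE_PER_M / INPUT_PRICE_PER_M

# Assumed workload: 100M input tokens per month, 80% served from cache.
without_cache = blended_input_cost(100_000_000, cached_fraction=0.0)
with_cache = blended_input_cost(100_000_000, cached_fraction=0.8)

print(f"Cached-token discount: {discount:.0%}")
print(f"Without caching: ${without_cache:.2f}")
print(f"With caching:    ${with_cache:.2f}")
```

The overall saving is always smaller than the 75% per-token discount unless every input token is a cache hit, so the cache hit rate matters as much as the cached price.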

Frequently Asked Questions

How much does Gemini 2.0 Flash cost per 1,000 tokens?
Gemini 2.0 Flash costs $0.0001 per 1,000 input tokens and $0.0004 per 1,000 output tokens.
What is Gemini 2.0 Flash's context window?
Gemini 2.0 Flash supports a context window of 1M tokens, which is suitable for very long documents, large codebases, and extended multi-turn conversations.
How does Gemini 2.0 Flash compare to GPT-4o on price?
On input tokens, Gemini 2.0 Flash costs $0.10/1M vs $2.50/1M for GPT-4o, which makes it 96% cheaper. The difference becomes significant at scale: 10,000 requests/day with 1,000 input tokens each costs $30/month with Gemini 2.0 Flash vs $750/month with GPT-4o (input tokens only, 30-day month).
Does Gemini 2.0 Flash support prompt caching?
Yes. Gemini 2.0 Flash supports prompt caching at $0.025/1M tokens, a 75% discount on repeated input. This is especially effective for RAG pipelines and chatbots with large system prompts that repeat across requests.
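
The GPT-4o comparison in the FAQ above can be checked with a few lines of arithmetic. This is a minimal sketch that counts input tokens only and assumes a 30-day month; `monthly_input_cost` is a hypothetical helper, and the prices are the ones quoted on this page.

```python
def monthly_input_cost(price_per_m, requests_per_day, input_tokens_per_request, days=30):
    """Monthly input-token cost in USD for a given price per 1M tokens."""
    tokens_per_month = requests_per_day * input_tokens_per_request * days
    return price_per_m * tokens_per_month / 1_000_000


# Workload from the FAQ: 10,000 requests/day, 1,000 input tokens each.
flash = monthly_input_cost(0.10, requests_per_day=10_000, input_tokens_per_request=1_000)
gpt4o = monthly_input_cost(2.50, requests_per_day=10_000, input_tokens_per_request=1_000)
saving = 1 - flash / gpt4o

print(f"Gemini 2.0 Flash: ${flash:.2f}/month")
print(f"GPT-4o:           ${gpt4o:.2f}/month")
print(f"Saving:           {saving:.0%}")
```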

Compare Gemini 2.0 Flash with other models

  • Gemini 2.0 Flash vs GPT-4.1 Nano
  • Gemini 2.0 Flash vs Mistral Small 3.1
  • Gemini 2.0 Flash vs Mistral Nemo
  • Gemini 2.0 Flash vs Qwen3 30B
  • Gemini 2.0 Flash vs Qwen3 235B
  • Gemini 2.0 Flash vs GPT-4o Mini