Gemini 2.5 Flash-Lite
Google
Most cost-efficient Gemini model for high-volume, latency-sensitive workloads
Input price: $0.10 / 1M tokens
Output price: $0.40 / 1M tokens
Context window: 1M tokens
Last updated: 2026-04-20
Quick calculator
Per request: $0.000300
Daily: $3.00
Monthly: $90.00 (30-day estimate)
Yearly: $1,095.00
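The calculator figures can be reproduced with a few lines of Python. This is a minimal sketch, not the page's own implementation: the function names are illustrative, and the token settings (1,000 input tokens, 500 output tokens, 10,000 requests per day) are an assumption chosen because they match the totals shown. The page does not state which settings it used.

```python
# Prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


def projected_costs(input_tokens: int, output_tokens: int, requests_per_day: int) -> dict:
    """Project per-request, daily, 30-day, and 365-day costs."""
    per_request = request_cost(input_tokens, output_tokens)
    daily = per_request * requests_per_day
    return {
        "per_request": per_request,
        "daily": daily,
        "monthly": daily * 30,   # 30-day estimate, as on the page
        "yearly": daily * 365,
    }


# Assumed settings (not stated on the page).
costs = projected_costs(input_tokens=1_000, output_tokens=500, requests_per_day=10_000)
for label, value in costs.items():
    print(f"{label}: ${value:,.6f}")
```

Changing the three arguments to your own traffic profile gives the equivalent estimate for your workload.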
Tips to reduce cost
- Use prompt caching to reuse repeated system prompts
- Trim whitespace and cut verbose instructions
- Use a smaller model for classification or routing tasks
- Batch asynchronous requests where the provider offers a batch discount (50% at OpenAI and Anthropic)
- Cache identical requests at the application layer
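As an illustration of the last tip, here is a minimal in-memory sketch of application-layer caching. The `call_model` callable is a hypothetical stand-in for whatever client function your application uses to reach the model; it is not part of any real SDK. A production cache would also need expiry and size limits.

```python
import hashlib
import json
from typing import Callable, Dict


def make_cached_client(call_model: Callable[[str, str], str]) -> Callable[[str, str], str]:
    """Wrap a model-calling function so identical requests are billed only once."""
    cache: Dict[str, str] = {}

    def cached_call(system_prompt: str, user_prompt: str) -> str:
        # Hash the full request so the key is small and deterministic.
        payload = json.dumps([system_prompt, user_prompt]).encode("utf-8")
        key = hashlib.sha256(payload).hexdigest()
        if key not in cache:
            cache[key] = call_model(system_prompt, user_prompt)
        return cache[key]

    return cached_call


# Hypothetical stand-in for a real API call; it records each invocation.
calls = []


def fake_model(system_prompt: str, user_prompt: str) -> str:
    calls.append(user_prompt)
    return f"echo: {user_prompt}"


client = make_cached_client(fake_model)
first = client("You are terse.", "ping")
second = client("You are terse.", "ping")  # served from the cache
print(len(calls))
```

Only the first of the two identical requests reaches the underlying function, so only one would be billed.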
About Gemini 2.5 Flash-Lite
Gemini 2.5 Flash-Lite is a budget large language model from Google, priced at $0.10/1M input tokens and $0.40/1M output tokens. It is 96% cheaper than the market average and best suited for ultra-high-volume workloads. The 1M-token context window makes it suitable for very long documents, large codebases, and book-length inputs.
For most production workloads, the cost breakdown is dominated by input tokens (system prompts, context, retrieved documents) rather than output. At this price point, Gemini 2.5 Flash-Lite is one of the most cost-effective options for high-volume tasks.
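Because output tokens cost four times as much as input tokens at these prices, input dominates the bill only when a request carries more than four times as many input tokens as output tokens. The sketch below computes the input share of cost; the function name and the token counts are illustrative assumptions, not measurements.

```python
# Prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def input_cost_share(input_tokens: int, output_tokens: int) -> float:
    """Fraction of a request's cost that comes from input tokens."""
    input_cost = input_tokens * INPUT_PRICE_PER_M
    output_cost = output_tokens * OUTPUT_PRICE_PER_M
    total = input_cost + output_cost
    return input_cost / total if total else 0.0


# Assumed retrieval-style request: long context, short answer.
rag_share = input_cost_share(input_tokens=4_000, output_tokens=300)

# Assumed chat-style request: short prompt, longer answer.
chat_share = input_cost_share(input_tokens=200, output_tokens=400)

print(f"retrieval-style input share: {rag_share:.0%}")
print(f"chat-style input share: {chat_share:.0%}")
```

Under these assumed profiles, the retrieval-style request is input-dominated and the chat-style request is output-dominated, so it is worth checking your own token mix before optimizing.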
Frequently Asked Questions
How much does Gemini 2.5 Flash-Lite cost per 1,000 tokens?
Gemini 2.5 Flash-Lite costs $0.0001 per 1,000 input tokens and $0.0004 per 1,000 output tokens.
What is Gemini 2.5 Flash-Lite's context window?
Gemini 2.5 Flash-Lite supports a context window of 1M tokens, which is suitable for very long documents, large codebases, and extended multi-turn conversations.
How does Gemini 2.5 Flash-Lite compare to GPT-4o on price?
Gemini 2.5 Flash-Lite is 96% cheaper than the market average on input tokens. At $0.10/1M input tokens versus $2.50/1M for GPT-4o, the difference becomes significant at scale: 10,000 requests/day with 1,000 input tokens each costs $30/month with Gemini 2.5 Flash-Lite versus $750/month with GPT-4o (input tokens only, over a 30-day month).
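The monthly comparison can be checked with a short Python sketch. It uses the input prices quoted in this answer and counts input tokens only; the function name is illustrative.

```python
def monthly_input_cost(
    price_per_m: float,
    requests_per_day: int,
    input_tokens_per_request: int,
    days: int = 30,
) -> float:
    """Monthly input-token cost in USD at a given price per 1M tokens."""
    monthly_tokens = requests_per_day * input_tokens_per_request * days
    return monthly_tokens / 1_000_000 * price_per_m


# Input prices as quoted on this page (USD per 1M tokens).
flash_lite = monthly_input_cost(0.10, requests_per_day=10_000, input_tokens_per_request=1_000)
gpt_4o = monthly_input_cost(2.50, requests_per_day=10_000, input_tokens_per_request=1_000)
savings = 1 - flash_lite / gpt_4o

print(f"Gemini 2.5 Flash-Lite: ${flash_lite:,.2f}/month")
print(f"GPT-4o: ${gpt_4o:,.2f}/month")
print(f"Savings on input tokens: {savings:.0%}")
```

Adding output tokens to the comparison would require each model's output price and your expected response lengths.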