← All models
A

Claude 3.5 Haiku

Anthropic

Previous generation fast model — great balance of speed and intelligence at low cost

Input price$0.80 / 1M tokens
Output price$4.00 / 1M tokens
Context window200k tokens
Last updated2026-04-22

Quick calculator

tokens
tokens
req/day
Per request
$0.002800
Daily
$28.00
Monthly
$840.00
per month · 30-day estimate
Yearly
$10,220.00

Tips to reduce cost

  • Use prompt caching to reuse repeated system prompts
  • Trim whitespace and reduce verbose instructions
  • Use a smaller model for classification or routing tasks
  • Batch async requests to get 50% discount (OpenAI/Anthropic)
  • Cache identical requests at the application layer

Similar models from Anthropic

Compared at your current token settings

About Claude 3.5 Haiku

Claude 3.5 Haiku is a mid-range large language model from anthropic, priced at $0.8/1M input tokens and $4/1M output tokens. It is 69% cheaper than the market average and best suited for fast low-cost tasks. The 200k context window handles long documents, extended conversations, and large code files comfortably.

For most production workloads, the cost breakdown is dominated by input tokens (system prompts, context, retrieved documents) rather than output. At this price point, Claude 3.5 Haiku is a solid choice when balancing quality and cost at scale.

Claude 3.5 Haiku supports prompt caching at $0.08/1M — a 90% discount on repeated input tokens. For applications with a fixed system prompt or repeated document context (RAG, chatbots, agents), enabling caching is the single highest-leverage cost optimization available.

Frequently Asked Questions

How much does Claude 3.5 Haiku cost per 1,000 tokens?
Claude 3.5 Haiku costs $0.0008 per 1,000 input tokens and $0.0040 per 1,000 output tokens.
What is Claude 3.5 Haiku's context window?
Claude 3.5 Haiku supports a context window of 200k tokens, which is suitable for long documents and multi-turn conversations.
How does Claude 3.5 Haiku compare to GPT-4o on price?
Claude 3.5 Haiku is 69% cheaper than the market average on input tokens. At $0.8/1M input vs $2.50/1M for GPT-4o, the cost difference becomes significant at scale — 10,000 requests/day with 1,000 input tokens each costs $240/month with Claude 3.5 Haiku vs $750/month with GPT-4o.
Does Claude 3.5 Haiku support prompt caching?
Yes. Claude 3.5 Haiku supports prompt caching at $0.08/1M tokens — a 90% discount on repeated input. This is especially effective for RAG pipelines and chatbots with large system prompts that repeat across requests.

Compare Claude 3.5 Haiku with other models

Claude 3.5 Haiku vs DeepSeek R2Claude 3.5 Haiku vs GPT-5 MiniClaude 3.5 Haiku vs DeepSeek R1Claude 3.5 Haiku vs o4-miniClaude 3.5 Haiku vs GPT-3.5 TurboClaude 3.5 Haiku vs Gemini 3 Flash