Claude 3.7 Sonnet costs $3 per 1M input tokens and $15 per 1M output tokens with no subscription required. Token costs compound quickly on long-context tasks since you pay for every token in the context window. Prompt caching reduces repeat costs significantly for stable system prompts.
Use prompt caching aggressively in dev to cut costs on repeated system prompts and long-context test runs.
Profile your average context length before committing: a 100K-token context at $3/1M input adds $0.30 per call to your baseline.
For bulk processing, benchmark Mistral Small or Claude Haiku first - they handle many tasks at a fraction of Sonnet's cost.
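The arithmetic behind these tips can be sketched as a small cost estimator. The $3/$15 rates come from the figures above; the cache-read rate of $0.30/1M (a ~90% discount on cached input) is an assumption based on Anthropic's published prompt-caching discount and should be checked against current pricing.

```python
def call_cost(input_tokens, output_tokens, cached_tokens=0,
              input_rate=3.00, output_rate=15.00, cache_read_rate=0.30):
    """Estimate per-call cost in USD. Rates are per 1M tokens.

    cache_read_rate assumes a ~90% discount on cached input reads;
    verify against current Anthropic pricing before relying on it.
    """
    uncached = input_tokens - cached_tokens
    return (uncached * input_rate
            + cached_tokens * cache_read_rate
            + output_tokens * output_rate) / 1_000_000

# 100K-token context, 1K-token response, no caching:
print(f"${call_cost(100_000, 1_000):.3f}")   # $0.315

# Same call with the 100K context served from cache:
print(f"${call_cost(100_000, 1_000, cached_tokens=100_000):.3f}")   # $0.045
```

Running numbers like these for your real context sizes makes it easy to see when caching pays off and when a cheaper model is the better fit for bulk work.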
Input pricing sits roughly 80% below the LLM average.
At $3/$15, it's priced similarly to GPT-4o but tends to outperform it on code and document tasks, so it's worth the cost when quality matters more than price.