Cohere Pricing: Plans & Features 2026
A free trial then usage-based rates across Command, Embed, and Rerank. The model and endpoint you call set the bill, not any plan name.
Cohere plans and pricing
Command R7B
$0.04/per 1M input tokens- ✓Cheapest production API
- ✓128K token context window
- ✓Output tokens $0.15/1M
Command R
$0.15/per 1M input tokens- ✓RAG-optimized mid-tier
- ✓128K token context window
- ✓Strong long-document recall
- ✓Output tokens $0.60/1M
Command A
$2.5/per 1M input tokens- ✓Flagship model
- ✓128K token context window
- ✓Tuned for long-document RAG
- ✓Output tokens $10.00/1M
Cohere pricing: the quick answer
Cohere is pay-as-you-go across its catalog rather than a flat subscription, with a free trial key to start. Command A is the flagship, Command R7B the cheap low-latency model at fractions of a cent per 1M input tokens, and Embed and Rerank each bill on their own scale, Rerank by search unit rather than by token. A production key bills at month end or once you cross an outstanding threshold. The plan name barely matters here; the model and endpoint you call decide the invoice.
- Command R7B$0.04/per 1M input tokens
- Embed v3$0.10/per 1M input tokens
- Command R$0.15/per 1M input tokens
- Rerank v3$2/per 1M tokens
- Command A$2.50/per 1M input tokens
At $0.04/mo to start, Cohere sits 100% below the $9/mo median across 10 large language models tools we track.
Cohere cost calculator
What Cohere really costs
Cohere's real cost lives in the rate table, not a subscription fee. Which model and endpoint you call decides the bill far more than any plan name.
Pricing Expert Take
Independent analysis · Cohere
Value Analysis
Cohere skips the flat monthly seat and charges you per model and endpoint, which is the right structure for search-heavy work and the wrong one to eyeball. Command R7B is aggressively cheap for high-volume, low-latency calls, while the flagship Command A carries the demanding reasoning at a higher per-token rate. The lever that decides most bills is not the generate step at all, it is Rerank, priced by search unit, so a pipeline that reranks everything spends very differently from one that reranks selectively. Paired thoughtfully, R7B for volume and Command A for the hard calls, the pricing is efficient for RAG.
Hidden Costs
- Rerank scales by search unit and multilingual rerank models can run higher per 1,000 units, so heavy reranking is where the invoice quietly grows.
- Fine-tuning a custom model carries a training charge per 1M tokens before you pay a cent for inference.
- Classify fine-tuning tasks bill separately, so a classification-heavy pipeline has its own line item.
Red Flags
Users have raised data-handling concerns on default tiers, and there is recurring confusion about how search-unit allocations translate into the bill.
"Collects ALL data you send them by default"
"The inference API is billed on a pay-as-you-go basis... US$1.53 for 1,000 search units"
Based on analysis of recent Reddit and G2 discussions.
Green Wins
The free trial exercises every endpoint, so you can prove the Embed and Rerank quality before paying anything.
- Command A carries enterprise RAG and tool use over your own data
- State-of-the-art multilingual embeddings across 100+ languages
- A dedicated Rerank endpoint that measurably lifts search relevance
"In most cases, users compare alternatives based on reliability and ease of use"
G2
"The best Chat | Cohere alternatives are ClickUp, Notion, and Apollo"
G2
"Cohere is fast, returns consistently with results (even sometimes not good results)"
User review
User Voices
"Cohere is easy, fast, cheap and pretty good"
"Is the Cohere Reranker basically a requirement for RAG at this point?"
"Lower price of this model compared to Mistral Large"
Verdict
Teams building RAG pipelines should run Command R7B for low-latency work and pair it with Rerank to sharpen relevance affordably. Enterprises that need deep reasoning over their own data should deploy Command A, and anyone under compliance rules should price the on-prem Enterprise tier. If you would rather have a standardized flat-rate conversational model instead of a metered search stack, consider ChatGPT at $20/mo.
Cohere price history
Frequently asked questions
How does Cohere pricing compare?
See how Cohere's 5 pricing plans stack up against similar Large Language Models tools.
Research Reports
Sources & Data Trail · Cohere
- 1.Official Pricing Page·Source of verified tiers(Checked: 2026-07-03)
- 2.Official Website·Official vendor website
- 3.G2·G2 verified user reviews · 4.5/5 · 6 reviews
- 4.Capterra·Capterra verified user reviews · 4.4/5
- 5.TrustRadius·TrustRadius verified reviews
- 6.PeerSpot·PeerSpot enterprise peer reviews

