★★★★★ 4.6 CE

Cohere Pricing: Plans & Features 2026

A free trial then usage-based rates across Command, Embed, and Rerank. The model and endpoint you call set the bill, not any plan name.

Large Language ModelsUsage-BasedPrice verified yesterdayFrom $0.04/mo Free plan ✓

Full review Alternatives Compare Official site ↗

Plans & Tiers Quick Answer Calculator API Cost Real Cost Expert Take History FAQ Sources

Cohere plans and pricing

High· Verified July 3, 2026

free trial

Command R7B

$0.04/per 1M input tokens

✓Cheapest production API
✓128K token context window
✓Output tokens $0.15/1M

View on vendor site

Command R

$0.15/per 1M input tokens

✓RAG-optimized mid-tier
✓128K token context window
✓Strong long-document recall
✓Output tokens $0.60/1M

View on vendor site

Command A

$2.5/per 1M input tokens

✓Flagship model
✓128K token context window
✓Tuned for long-document RAG
✓Output tokens $10.00/1M

View on vendor site

Embed v3

$0.1/per 1M input tokens

✓English embedding
✓Multilingual embedding (100+ languages)

View on vendor site

Rerank v3

$2/per 1M tokens

✓Unique neural reranker
✓Boosts vector-search precision

View on vendor site

Cohere pricing: the quick answer

Quick answerLast verified: July 3, 2026High

Cohere is pay-as-you-go across its catalog rather than a flat subscription, with a free trial key to start. Command A is the flagship, Command R7B the cheap low-latency model at fractions of a cent per 1M input tokens, and Embed and Rerank each bill on their own scale, Rerank by search unit rather than by token. A production key bills at month end or once you cross an outstanding threshold. The plan name barely matters here; the model and endpoint you call decide the invoice.

Command R7B$0.04/per 1M input tokens
Embed v3$0.10/per 1M input tokens
Command R$0.15/per 1M input tokens
Rerank v3$2/per 1M tokens
Command A$2.50/per 1M input tokens

Use the interactive Cohere pricing calculator to estimate your exact monthly cost at your team size, with annual-billing savings and the hidden costs counted in.

Free tier

Yes

Billing model

Usage-Based

Cheapest paid

$0.04/per 1M input tokens

Annual discount

Not offered

At $0.04/mo to start, Cohere sits 100% below the $9/mo median across 10 large language models tools we track.

Cohere cost calculator

What Cohere really costs

What sits on top of the plan fee

Cohere's real cost lives in the rate table, not a subscription fee. Which model and endpoint you call decides the bill far more than any plan name.

Rerank search units

Docs over 500 tokens split into chunks, each chunk bills as 1 doc

$2.00-$2.50 per 1,000 searches

Model Vault dedicated deployment

Hourly or monthly managed inference, by model size

$4-$10/hr ($2,500-$6,500/mo)

Embed 4 images

Multimodal embedding, image input

$0.47 per 1M tokens

Billing trigger

Production API key bills at month end or at $250 outstanding

rate not published beyond threshold

Pricing Expert Take

Independent analysis · Cohere

Verified Pricing Data

Value Analysis

$0.04/mo100% below median

Category median: $9/moBased on 10 products in this category

Cohere skips the flat monthly seat and charges you per model and endpoint, which is the right structure for search-heavy work and the wrong one to eyeball. Command R7B is aggressively cheap for high-volume, low-latency calls, while the flagship Command A carries the demanding reasoning at a higher per-token rate. The lever that decides most bills is not the generate step at all, it is Rerank, priced by search unit, so a pipeline that reranks everything spends very differently from one that reranks selectively. Paired thoughtfully, R7B for volume and Command A for the hard calls, the pricing is efficient for RAG.

Hidden Costs

Rerank scales by search unit and multilingual rerank models can run higher per 1,000 units, so heavy reranking is where the invoice quietly grows.
Fine-tuning a custom model carries a training charge per 1M tokens before you pay a cent for inference.
Classify fine-tuning tasks bill separately, so a classification-heavy pipeline has its own line item.

Red Flags

Users have raised data-handling concerns on default tiers, and there is recurring confusion about how search-unit allocations translate into the bill.

"Collects ALL data you send them by default"
Reddit

"The inference API is billed on a pay-as-you-go basis... US$1.53 for 1,000 search units"
Reddit

Based on analysis of recent Reddit and G2 discussions.

Green Wins

The free trial exercises every endpoint, so you can prove the Embed and Rerank quality before paying anything.

Command A carries enterprise RAG and tool use over your own data
State-of-the-art multilingual embeddings across 100+ languages
A dedicated Rerank endpoint that measurably lifts search relevance

"In most cases, users compare alternatives based on reliability and ease of use"
G2

"The best Chat | Cohere alternatives are ClickUp, Notion, and Apollo"
G2

"Cohere is fast, returns consistently with results (even sometimes not good results)"
User review

User Voices

"Cohere is easy, fast, cheap and pretty good"
Reddit

"Is the Cohere Reranker basically a requirement for RAG at this point?"
Reddit

"Lower price of this model compared to Mistral Large"
Reddit

Verdict

Teams building RAG pipelines should run Command R7B for low-latency work and pair it with Rerank to sharpen relevance affordably. Enterprises that need deep reasoning over their own data should deploy Command A, and anyone under compliance rules should price the on-prem Enterprise tier. If you would rather have a standardized flat-rate conversational model instead of a metered search stack, consider ChatGPT at $20/mo.

ComparEdge EditorialUpdated: July 3, 2026