TokenTally

xAI frontier models

Grok 4.1 Fast pricing

This tier brings Grok’s 2M context to a $0.20/M input price point. Use these scenarios to see how far it stretches.

Last pricing check: Mar 13, 2026

$0.20 per 1M prompt tokens · $0.50 per 1M completion tokens · 2,000,000-token context

Why teams choose this model

Support copilots with realtime search
Internal analytics assistants
Batch automation with prompt caching

Scenario planning

Realistic cost examples

Numbers use Grok 4.1 Fast pricing

Realtime customer desk

Agents lean on Grok 4.1 Fast for up-to-date answers blended with knowledge base snippets.

Per request

$0.0003

Per month

$16.32

Tokens sent

51,360,000

650 prompt tokens · 420 completion tokens · 48,000 requests/mo
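The per-request and monthly figures for this scenario follow directly from the token rates above. A minimal sketch of the arithmetic, using the rates and volumes from this page:

```python
# Grok 4.1 Fast rates from this page (USD per token).
INPUT_RATE = 0.20 / 1_000_000   # $0.20 per 1M prompt tokens
OUTPUT_RATE = 0.50 / 1_000_000  # $0.50 per 1M completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of a single request at the standard (uncached) rates."""
    return prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE

# Realtime customer desk: 650 prompt + 420 completion, 48,000 requests/mo.
per_request = request_cost(650, 420)
monthly = per_request * 48_000
tokens_sent = (650 + 420) * 48_000

print(f"per request: ${per_request:.4f}")  # ≈ $0.0003
print(f"per month:   ${monthly:.2f}")      # $16.32
print(f"tokens sent: {tokens_sent:,}")     # 51,360,000
```

The same formula reproduces the other two scenarios by swapping in their token counts and request volumes.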

Executive dashboard buddy

Leaders ask Grok for summaries across BI tools, and responses stay under $0.01 each with cache hits.

Per request

$0.0004

Per month

$3.66

Tokens sent

11,900,000

900 prompt tokens · 500 completion tokens · 8,500 requests/mo

Batch content cleanup

Marketing ops cleans subject lines and snippets at scale. Cached prompts can drop input pricing to the $0.05/M tier; the figures below use the standard $0.20/M input rate, so cache hits only lower them further.

Per request

$0.0002

Per month

$14.40

Tokens sent

46,800,000

500 prompt tokens · 280 completion tokens · 60,000 requests/mo
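To see what the cached tier is worth here, compare this scenario at the standard $0.20/M input rate against a best case where every prompt token is billed at the $0.05/M cached rate (actual savings depend on your cache hit rate):

```python
# Batch content cleanup: 500 prompt + 280 completion tokens, 60,000 requests/mo.
PROMPT, COMPLETION, REQUESTS = 500, 280, 60_000

STANDARD_INPUT = 0.20 / 1_000_000  # $0.20 per 1M prompt tokens
CACHED_INPUT = 0.05 / 1_000_000    # $0.05 per 1M cached prompt tokens
OUTPUT = 0.50 / 1_000_000          # $0.50 per 1M completion tokens

def monthly_cost(input_rate: float) -> float:
    """Monthly spend at a given input rate; output rate is unchanged."""
    return REQUESTS * (PROMPT * input_rate + COMPLETION * OUTPUT)

standard = monthly_cost(STANDARD_INPUT)      # $14.40/mo, matching the figure above
fully_cached = monthly_cost(CACHED_INPUT)    # $9.90/mo if every prompt token hits cache
print(f"standard: ${standard:.2f}, fully cached: ${fully_cached:.2f}")
```

Completion tokens dominate this workload, so even a perfect cache hit rate cuts the bill by about a third rather than by 75%.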


FAQs

What makes Grok 4.1 Fast cheaper than Grok 4.1?

It’s tuned for throughput rather than bleeding-edge reasoning. You still get 2M context, but the per-token rate drops to $0.20/$0.50.

How do I ensure cache hits?

Keep system prompts and shared prefixes identical between calls. TokenTally’s tokenizer shows which portions stay fixed so you can maximize reuse.
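The key to cache hits is a byte-identical shared prefix. A sketch of prefix-stable prompt construction, using the common chat-message shape (the exact caching behavior is provider-specific, and these names are illustrative):

```python
# Keep the stable instructions in one constant so every call sends the
# exact same bytes as its prefix.
SYSTEM_PROMPT = "You are a support copilot. Answer from the knowledge base."

def build_messages(kb_snippets: str, question: str) -> list[dict]:
    # Shared, unchanging content goes first; per-request content
    # (snippets, the user's question) goes after the stable prefix.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{kb_snippets}\n\nQuestion: {question}"},
    ]

a = build_messages("Refund policy: 30 days.", "Can I return after 3 weeks?")
b = build_messages("Shipping: 2-5 business days.", "When will my order arrive?")
assert a[0] == b[0]  # identical prefix across calls, so it can be cached
```

Anything that varies per request (timestamps, user IDs, retrieved snippets) belongs after the shared prefix, not inside it, or every call becomes a cache miss.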

Does Grok 4.1 Fast support non-reasoning mode?

Yes, but pricing is the same. Stick with the reasoning SKU unless you specifically need to disable CoT overhead.
