xAI frontier models
Grok 4.1 Fast pricing
This tier offers Grok’s 2M-token context window at $0.20 per million input tokens. The scenarios below show how far that budget stretches.
Last pricing check: Mar 13, 2026
Scenario planning
Realistic cost examples
Numbers use Grok 4.1 Fast pricing
Realtime customer desk
Agents lean on Grok 4.1 Fast for up-to-date answers blended with knowledge base snippets.
Per request: $0.0003
Per month: $16.32
Tokens sent: 51,360,000
Executive dashboard buddy
Leaders ask Grok for summaries across BI tools, and responses stay under $0.01 each with cache hits.
Per request: $0.0004
Per month: $3.66
Tokens sent: 11,900,000
Batch content cleanup
Marketing ops cleans subject lines and snippets at scale, using cached prompts to reach the $0.05/M cached-input rate.
Per request: $0.0002
Per month: $14.40
Tokens sent: 46,800,000
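To adapt these scenarios to your own traffic, the arithmetic can be sketched roughly as below. The rates are the ones quoted on this page ($0.20/M input, $0.50/M output, $0.05/M for cached input). The function name, its parameters, and the sample workload are hypothetical and are not the inputs behind the figures above.

```python
from dataclasses import dataclass

# USD per million tokens, as quoted on this page (verify against the pricing source).
INPUT_RATE = 0.20
OUTPUT_RATE = 0.50
CACHED_INPUT_RATE = 0.05  # assumed to apply only to the cached share of input tokens


@dataclass
class Estimate:
    per_request: float      # USD per request
    per_month: float        # USD per month
    tokens_per_month: int   # input + output tokens per month


def estimate_cost(input_tokens: int, output_tokens: int,
                  requests_per_month: int, cached_fraction: float = 0.0) -> Estimate:
    """Estimate cost for a workload where every request has the same shape."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    per_request = (
        uncached * INPUT_RATE
        + cached * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    ) / 1_000_000
    return Estimate(
        per_request=per_request,
        per_month=per_request * requests_per_month,
        tokens_per_month=(input_tokens + output_tokens) * requests_per_month,
    )


if __name__ == "__main__":
    # Hypothetical workload: 1,000 input tokens (half cached), 200 output tokens.
    e = estimate_cost(1000, 200, requests_per_month=50_000, cached_fraction=0.5)
    print(f"${e.per_request:.6f} per request, ${e.per_month:.2f} per month")
```

Raising `cached_fraction` is the main lever here: each cached input token is billed at a quarter of the uncached input rate under these assumptions.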
FAQs
What makes 4.1 Fast cheaper than 4.20?
It’s tuned for throughput rather than bleeding-edge reasoning. You still get the 2M-token context window, but the rate drops to $0.20 per million input tokens and $0.50 per million output tokens.
How do I ensure cache hits?
Keep system prompts and shared prefixes identical between calls. TokenTally’s tokenizer shows which portions stay fixed so you can maximize reuse.
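One way to act on this is to assemble every prompt with the fixed parts first and the variable part last, then check how much two prompts actually share. The sketch below is illustrative only: `SYSTEM_PROMPT`, `build_prompt`, and `shared_prefix_length` are hypothetical names, and the comparison counts characters rather than tokens, so treat the result as a rough proxy for what a cache might reuse.

```python
# Hypothetical fixed system prompt; keep it byte-for-byte identical across calls.
SYSTEM_PROMPT = "You are a support assistant. Answer using the provided snippets."


def build_prompt(shared_context: str, user_question: str) -> str:
    """Put fixed content first and the per-request question last."""
    return f"{SYSTEM_PROMPT}\n\n{shared_context}\n\nQuestion: {user_question}"


def shared_prefix_length(a: str, b: str) -> int:
    """Count the leading characters that two prompts have in common."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


if __name__ == "__main__":
    kb = "Refunds are processed within 5 business days."
    p1 = build_prompt(kb, "How long do refunds take?")
    p2 = build_prompt(kb, "Where is my order?")
    shared = shared_prefix_length(p1, p2)
    print(f"{shared} of {len(p1)} characters are shared with the next prompt")
```

If the shared length drops unexpectedly between calls, something in the prefix (a timestamp, a reordered snippet, trailing whitespace) is probably changing per request.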
Does Grok 4.1 Fast support non-reasoning mode?
Yes, at the same price. Stick with the reasoning SKU unless you specifically need to avoid chain-of-thought (CoT) overhead.
Pricing sources
- https://docs.x.ai/developers/models
Checked Mar 13, 2026