TokenTally

Comparison

Compare models against the exact same workload

These numbers use the same inputs you entered on the main calculator. Bookmark or share this URL to revisit the exact scenario later.

Use this table when you need to justify provider choices, show the savings from switching, or hand finance a shortlist of realistic model options.

Active model

GPT-4.1

OpenAI

400 prompt tokens + 250 completion tokens, 5,000 requests / month (488 words/request)

Per request

$0.027

Per month

$135.00

Per year

$1,620.00

Prompt tokens / month

2,000,000

Model rate card

Input: $30.00 / 1M tokens

Output: $60.00 / 1M tokens

Scenario breakdown

Workload mix: input-heavy

Input cost: $60.00/month

Output cost: $75.00/month
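
The figures above follow directly from the token counts and the rate card. Here is a minimal sketch of that arithmetic, assuming rates are quoted in USD per 1M tokens and that monthly cost is simply tokens times rate. The function name `workload_cost` and its parameters are illustrative, not part of the calculator.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def workload_cost(prompt_tokens, completion_tokens, requests_per_month,
                  input_rate, output_rate):
    """Return a cost breakdown in USD for one model.

    Rates are USD per 1M tokens, passed as strings so Decimal keeps them exact.
    """
    input_monthly = (Decimal(prompt_tokens) * requests_per_month
                     * Decimal(input_rate) / TOKENS_PER_MILLION)
    output_monthly = (Decimal(completion_tokens) * requests_per_month
                      * Decimal(output_rate) / TOKENS_PER_MILLION)
    per_month = input_monthly + output_monthly
    return {
        "input_monthly": input_monthly,
        "output_monthly": output_monthly,
        "per_month": per_month,
        "per_request": per_month / requests_per_month,
        "per_year": per_month * 12,
    }


# The GPT-4.1 scenario shown on this page.
costs = workload_cost(400, 250, 5_000, "30.00", "60.00")
for name, value in costs.items():
    print(f"{name}: ${value}")
```

Using `Decimal` rather than floats keeps small per-request amounts from picking up binary rounding noise.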

How to read this

Input-heavy workloads care more about prompt pricing. Output-heavy workloads care more about completion pricing.
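
One way to see which rate matters more is to compare each side's share of the bill rather than its share of the tokens. The sketch below is illustrative only: the page does not say how the workload-mix label is derived, and `token_share` and `cost_share` are hypothetical helpers.

```python
def token_share(prompt_tokens, completion_tokens):
    """Fraction of all tokens that are prompt (input) tokens."""
    return prompt_tokens / (prompt_tokens + completion_tokens)


def cost_share(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Fraction of the cost that comes from prompt (input) tokens.

    Rates only need to share a unit (here USD per 1M tokens).
    """
    input_cost = prompt_tokens * input_rate
    output_cost = completion_tokens * output_rate
    return input_cost / (input_cost + output_cost)


# The GPT-4.1 scenario on this page: 400 prompt + 250 completion tokens.
input_token_share = token_share(400, 250)
input_cost_share = cost_share(400, 250, 30.00, 60.00)
print(f"input share of tokens: {input_token_share:.1%}")
print(f"input share of cost:   {input_cost_share:.1%}")
```

In this scenario prompt tokens are the majority of the tokens but the minority of the cost, because the output rate is twice the input rate. That matches the $60.00 input and $75.00 output split shown above.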

Decision support

Qwen3.5-Flash (Global) is the lowest-cost option for this scenario

Switching from GPT-4.1 to Qwen3.5-Flash (Global) would save about $134.58/month, or roughly $1,615/year.
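
The savings figure can be reproduced by costing each model against the same workload and subtracting. The sketch below assumes the same tokens-times-rate arithmetic as the rest of the page and uses three rate cards copied from the comparison table. The names `monthly_cost` and `RATE_CARDS` are illustrative, and rounding to cents is a presentation choice made here.

```python
from decimal import Decimal, ROUND_HALF_UP

# (input rate, output rate) in USD per 1M tokens, taken from the table on this page.
RATE_CARDS = {
    "GPT-4.1": ("30.00", "60.00"),
    "Claude Haiku 4.5": ("1.00", "5.00"),
    "Qwen3.5-Flash (Global)": ("0.029", "0.287"),
}


def monthly_cost(prompt_tokens, completion_tokens, requests, input_rate, output_rate):
    """Monthly cost in USD for one model under a fixed workload."""
    tokens_in = Decimal(prompt_tokens) * requests
    tokens_out = Decimal(completion_tokens) * requests
    total = tokens_in * Decimal(input_rate) + tokens_out * Decimal(output_rate)
    return total / Decimal(1_000_000)


def to_cents(amount):
    """Round a Decimal amount to two places for display."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


selected = "GPT-4.1"
costs = {name: monthly_cost(400, 250, 5_000, *rates) for name, rates in RATE_CARDS.items()}
cheapest = min(costs, key=costs.get)

raw_savings = costs[selected] - costs[cheapest]
monthly_savings = to_cents(raw_savings)
yearly_savings = to_cents(raw_savings * 12)

print(f"Cheapest: {cheapest}")
print(f"Savings vs {selected}: ${monthly_savings}/month, ${yearly_savings}/year")
```

The yearly figure is computed from the unrounded monthly savings and only then rounded, so the two numbers stay consistent with each other.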

Model comparison

Cost to run this workload

Prices in USD. Input and output rates are per 1M tokens.

| Model | $/request | $/month | Input | Output | Savings vs selected | Provider |
|---|---|---|---|---|---|---|
| GPT-5.4 Pro | $0.057 | $285.00 | $30.00 | $180.00 | Higher cost | OpenAI |
| GPT-4.1 | $0.027 | $135.00 | $30.00 | $60.00 | Selected | OpenAI |
| Claude Opus 4.6 | $0.0083 | $41.25 | $5.00 | $25.00 | Save $93.75/mo | Anthropic |
| GPT-4.1 Mini | $0.0058 | $28.75 | $5.00 | $15.00 | Save $106.25/mo | OpenAI |
| GPT-4o | $0.0058 | $28.75 | $5.00 | $15.00 | Save $106.25/mo | OpenAI |
| Mistral Large 2 | $0.0056 | $28.00 | $4.00 | $16.00 | Save $107.00/mo | Mistral |
| Claude 3.7 Sonnet | $0.0049 | $24.75 | $3.00 | $15.00 | Save $110.25/mo | Anthropic |
| Claude Sonnet 4.6 | $0.0049 | $24.75 | $3.00 | $15.00 | Save $110.25/mo | Anthropic |
| Sonar Pro | $0.0049 | $24.75 | $3.00 | $15.00 | Save $110.25/mo | Perplexity |
| Grok 3 | $0.0049 | $24.75 | $3.00 | $15.00 | Save $110.25/mo | xAI |
| Grok 4 (0709) | $0.0049 | $24.75 | $3.00 | $15.00 | Save $110.25/mo | xAI |
| GPT-5.4 | $0.0048 | $23.75 | $2.50 | $15.00 | Save $111.25/mo | OpenAI |
| GPT-5.2 | $0.0042 | $21.00 | $1.75 | $14.00 | Save $114.00/mo | OpenAI |
| Gemini 1.5 Pro | $0.004 | $20.125 | $3.50 | $10.50 | Save $114.875/mo | Google |
| Gemini 3.1 Pro Preview | $0.0038 | $19.00 | $2.00 | $12.00 | Save $116.00/mo | Google |
| Gemini 2.5 Pro | $0.003 | $15.00 | $1.25 | $10.00 | Save $120.00/mo | Google |
| GPT-5 | $0.003 | $15.00 | $1.25 | $10.00 | Save $120.00/mo | OpenAI |
| GPT-5.1 | $0.003 | $15.00 | $1.25 | $10.00 | Save $120.00/mo | OpenAI |
| Mistral Small 3 | $0.0023 | $11.50 | $2.00 | $6.00 | Save $123.50/mo | Mistral |
| Grok 4.20 Beta (Reasoning) | $0.0023 | $11.50 | $2.00 | $6.00 | Save $123.50/mo | xAI |
| Claude Haiku 4.5 | $0.0017 | $8.25 | $1.00 | $5.00 | Save $126.75/mo | Anthropic |
| Gemini 3 Flash Preview | $0.001 | $4.75 | $0.50 | $3.00 | Save $130.25/mo | Google |
| DeepSeek Reasoner V3.2 | $0.0008 | $3.8375 | $0.55 | $2.19 | Save $131.1625/mo | DeepSeek |
| Gemini 2.5 Flash | $0.0007 | $3.725 | $0.30 | $2.50 | Save $131.275/mo | Google |
| Sonar | $0.0007 | $3.25 | $1.00 | $1.00 | Save $131.75/mo | Perplexity |
| Llama 3.1 70B Instruct (Fireworks) | $0.0006 | $3.10 | $0.80 | $1.20 | Save $131.90/mo | Meta |
| GPT-5 Mini | $0.0006 | $3.00 | $0.25 | $2.00 | Save $132.00/mo | OpenAI |
| Qwen3-Max (Global) | $0.0005 | $2.5105 | $0.359 | $1.434 | Save $132.4895/mo | Alibaba Cloud |
| Gemini 3.1 Flash-Lite Preview | $0.0005 | $2.375 | $0.25 | $1.50 | Save $132.625/mo | Google |
| Grok Code Fast 1 | $0.0005 | $2.275 | $0.20 | $1.50 | Save $132.725/mo | xAI |
| MiniMax M2 | $0.0004 | $2.10 | $0.30 | $1.20 | Save $132.90/mo | MiniMax |
| MiniMax M2.1 | $0.0004 | $2.10 | $0.30 | $1.20 | Save $132.90/mo | MiniMax |
| Claude 3.7 Haiku | $0.0004 | $2.0625 | $0.25 | $1.25 | Save $132.9375/mo | Anthropic |
| Gemini 1.5 Flash | $0.0004 | $2.0125 | $0.35 | $1.05 | Save $132.9875/mo | Google |
| DeepSeek Chat V3.2 | $0.0004 | $1.915 | $0.27 | $1.10 | Save $133.085/mo | DeepSeek |
| Grok 3 Mini | $0.0002 | $1.225 | $0.30 | $0.50 | Save $133.775/mo | xAI |
| Qwen3.5-Plus (Global) | $0.0002 | $1.09 | $0.115 | $0.688 | Save $133.91/mo | Alibaba Cloud |
| GPT-4o Mini | $0.0002 | $1.05 | $0.15 | $0.60 | Save $133.95/mo | OpenAI |
| Grok 4.1 Fast | $0.0002 | $1.025 | $0.20 | $0.50 | Save $133.975/mo | xAI |
| Gemini 2.5 Flash-Lite | $0.0001 | $0.70 | $0.10 | $0.40 | Save $134.30/mo | Google |
| GPT-5 Nano | $0.0001 | $0.60 | $0.05 | $0.40 | Save $134.40/mo | OpenAI |
| Llama 3.1 8B Instruct (Fireworks) | $0.0001 | $0.585 | $0.18 | $0.18 | Save $134.415/mo | Meta |
| Qwen3.5-Flash (Global) | $0.0001 | $0.4167 | $0.029 | $0.287 | Save $134.5833/mo | Alibaba Cloud |

How to use this view

Use it to justify a model choice, not just browse prices

  • Latency class hints at user experience. “Priority” tiers (Opus, GPT-5.4 Pro) come with higher SLAs than “economy” tiers (GPT-4o mini, Qwen Flash).
  • Context windows matter. If your prompts regularly exceed 200K tokens, pick models with 1M context (Claude 4.6, Gemini 2.5/3.x, Qwen3-Max) to avoid silent truncation.
  • Cache pricing can cut the cost of cached input tokens by 5–10×. Anthropic, OpenAI, and Alibaba list cache-read rates in each guide, so factor them in before committing.
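
To factor cache pricing into a comparison, one option is to blend the normal input rate with the cache-read rate according to how much of each prompt is served from cache. The sketch below is a simplification under stated assumptions: the cache-read rate and hit fraction are hypothetical values, not taken from any provider's rate card, and cache-write surcharges and cache expiry are ignored.

```python
def effective_input_rate(input_rate, cache_read_rate, cached_fraction):
    """Blended USD per 1M input tokens when part of the prompt is a cache hit.

    cached_fraction is the share of prompt tokens billed at the cache-read
    rate (0.0 means no caching, 1.0 means every prompt token is a cache hit).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return cached_fraction * cache_read_rate + (1.0 - cached_fraction) * input_rate


# Hypothetical example: $3.00 input rate, cache reads at one tenth of that,
# and 80% of prompt tokens served from cache.
blended = effective_input_rate(3.00, 0.30, 0.80)
reduction = 3.00 / blended
print(f"effective input rate: ${blended:.2f} per 1M tokens ({reduction:.1f}x cheaper)")
```

The blended reduction is smaller than the headline discount unless nearly all prompt tokens are cache hits, so it is worth estimating the hit fraction for your own prompts before relying on the cached rate.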

Decision checks

Click any model name to re-center the scenario around it. That makes it easy to compare a premium pick against a cheaper fallback and carry the numbers into a planning doc.

Need qualitative context? Each row links back to a long-form guide where we discuss latency expectations, tooling support, and policy gotchas.

For compliance reviews, keep the methodology page nearby—it documents every pricing citation and the date we last verified it.

Need to tweak the inputs?

Head back to the calculator, adjust the sliders, and reopen this view.

Sharing this comparison with finance or leadership? Pair it with the appropriate deep-dive guide so reviewers see assumptions, cache math, and citations alongside the raw totals.