TokenTally · prompt cost intelligence
Budget every AI feature before the bill arrives.
TokenTally ingests live pricing from OpenAI, Anthropic, Google, Meta, xAI, Alibaba, Perplexity, MiniMax, and more. We document the math, publish transparent scenarios, and keep finance + engineering on the same spreadsheet-free page.
Dataset refresh
Mar 13 2026
Models tracked
48 providers
Scenario templates
1,200+
Guides published
30+
Baseline GPT‑4.1 workload (auditable + editable)
A typical advanced assistant request uses 400 prompt tokens + 250 completion tokens. At 5,000 monthly requests, that’s 2M prompt tokens and 1.25M completion tokens. TokenTally keeps the math visible so finance, PM, and engineering can all gut-check it.
Per request
$0.027
Per month
$135
Annualized
$1,620
Total tokens / mo
3.25M
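The roll-up above can be reproduced in a few lines. This is a minimal sketch and not TokenTally's calculator code: the function name `monthly_workload` and its field names are hypothetical, and the per-request cost is taken as given from the card above instead of being derived from provider prices.

```python
def monthly_workload(prompt_tokens, completion_tokens, requests_per_month, cost_per_request):
    """Roll a per-request profile up to monthly and annual figures."""
    prompt_total = prompt_tokens * requests_per_month
    completion_total = completion_tokens * requests_per_month
    monthly_cost = cost_per_request * requests_per_month
    return {
        "prompt_tokens": prompt_total,
        "completion_tokens": completion_total,
        "total_tokens": prompt_total + completion_total,
        # Round to cents so floating-point noise does not leak into the report.
        "monthly_cost": round(monthly_cost, 2),
        "annual_cost": round(monthly_cost * 12, 2),
    }


# Baseline from the text: 400 prompt + 250 completion tokens, 5,000 requests per month,
# at the stated $0.027 per request.
baseline = monthly_workload(400, 250, 5000, 0.027)
print(baseline)
```

With the stated inputs, this yields 2M prompt tokens and 1.25M completion tokens (3.25M combined), $135 per month, and $1,620 per year.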
Need your own numbers? Paste real prompts into the calculator, toggle cache-hit assumptions, and share a permalink with your team.
Support & success
Tier-1 chat, CRM copilots, deflection bots, multilingual queues, and cache-aware macros.
Analytics & ops
KPI copilots, anomaly explainers, notebooks, and multi-agent ETL orchestrators.
Creative & RAG
Research assistants, localization sweeps, marketing briefers, and doc-heavy RAG flows.
What changed this month
- OpenAI GPT‑5.4 tiers. Added long-context surcharges (2× input / 1.5× output once you cross 272K tokens) with side-by-side comparisons to GPT‑4.1 and GPT‑4o Mini.
- Anthropic Claude 4.6 family. New cache pricing, 1M-token beta notes, and scenario presets for Sonnet vs Opus tradeoffs.
- Google Gemini 3.x previews. Documented Search/Maps grounding fees and audio token uplifts so you can defend forecasted spend.
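To see how a long-context surcharge like the GPT‑5.4 one changes a forecast, here is a rough sketch. It rests on two assumptions that should be checked against the provider's documentation: that the threshold is measured on prompt tokens, and that the multipliers apply to the whole request once the threshold is crossed. The function name and the per-million prices in the usage lines are placeholders, not real rates.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # tokens, from the changelog entry
INPUT_MULTIPLIER = 2.0            # 2x input past the threshold
OUTPUT_MULTIPLIER = 1.5           # 1.5x output past the threshold


def request_cost(prompt_tokens, completion_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request in dollars, with the long-context surcharge applied.

    Assumption: the surcharge is triggered by prompt size and applies to
    every token in the request, not only the tokens above the threshold.
    """
    long_context = prompt_tokens > LONG_CONTEXT_THRESHOLD
    in_mult = INPUT_MULTIPLIER if long_context else 1.0
    out_mult = OUTPUT_MULTIPLIER if long_context else 1.0
    input_cost = prompt_tokens / 1_000_000 * input_price_per_m * in_mult
    output_cost = completion_tokens / 1_000_000 * output_price_per_m * out_mult
    return input_cost + output_cost


# Placeholder prices: $1.00 per 1M input tokens, $10.00 per 1M output tokens.
below = request_cost(200_000, 1_000, 1.00, 10.00)  # under the threshold
above = request_cost(300_000, 1_000, 1.00, 10.00)  # over the threshold
print(f"below threshold: ${below:.3f}, above threshold: ${above:.3f}")
```

Under these assumptions the surcharge is a step change at the threshold, so keeping prompts just under it can matter more than small reductions elsewhere.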
How we verify
Every refresh is logged in the methodology page with citations to provider docs, console screenshots, or billing emails. We store evidence before UI changes ship so auditors can retrace the math.
TokenTally never stores pasted prompts from the calculator; token counts are computed in-browser to keep sensitive context private.
Deep dives worth bookmarking
Break down compliance copilots, code refactor agents, and analyst loops with transparent per-request math and cache-hit scenarios.
Claude 3.7 Sonnet token costs
Legal + strategy copilots
Model long-form briefs, policy reviews, and retrieval-heavy flows while staying under Anthropic’s long-context surcharges.
Gemini 2.5 Flash deployment playbook
1M-context assistants with Search grounding
Understand grounding fees, TPM caps, and when to upgrade to Pro vs. keep Flash for notification digests and analytics bots.
Who we are
TokenTally is an operator-led project documenting real LLM budgets. Read the About page for the roadmap and team values.
Policies
Privacy, Terms, and financial disclaimers live on dedicated pages (Privacy, Disclaimer). We spell out how data is handled before ads run.
Contact
Questions, partnerships, or pricing corrections? Email hello@tokentally.net or use the form on the Contact page.
Share or bookmark your scenario
Every calculator state is linkable: https://tokentally.net/?model=openai:gpt-4.1&prompt=400&completion=250&rpm=5000.
Send it to finance for approvals or embed it in launch docs so costs are never a surprise.
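If you generate scenario links from a script, the example URL's query string can be built and parsed with the standard library. This is a sketch based only on the four parameters visible in the example link (`model`, `prompt`, `completion`, `rpm`). The helper names are hypothetical, and reading `rpm` as requests per month is an assumption drawn from the 5,000-request baseline.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://tokentally.net/"


def build_permalink(model, prompt, completion, rpm):
    """Build a calculator link from a scenario's four parameters."""
    params = {"model": model, "prompt": prompt, "completion": completion, "rpm": rpm}
    # Keep ':' unescaped so the model id matches the example link's form.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


def parse_permalink(url):
    """Recover the scenario parameters from a calculator link."""
    query = parse_qs(urlsplit(url).query)
    return {
        "model": query["model"][0],
        "prompt": int(query["prompt"][0]),
        "completion": int(query["completion"][0]),
        "rpm": int(query["rpm"][0]),
    }


link = build_permalink("openai:gpt-4.1", 400, 250, 5000)
scenario = parse_permalink(link)
print(link)
print(scenario)
```

Parsing a link back into numbers makes it easy to check that a scenario attached to an approval request matches the figures quoted alongside it.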