Question 1

When do long-context rates kick in?

Accepted Answer

OpenAI charges double input / 1.5x output once a GPT-5.4 prompt exceeds 272K tokens. TokenTally keeps estimates under that line by default.

Question 2

Is cache pricing available?

Accepted Answer

$0.25/M tokens applies when the prompt prefix is cached via Responses/Assistants APIs. We expose that as the cache-hit number in the calculator.

Question 3

How do I treat reasoning tokens?

Accepted Answer

Reasoning tokens count toward output spend even though they don’t appear in responses. Budget with a headroom factor if you enable extra thinking time.

GPT-5.4 cost planner

Realistic cost examples