Tokens, Cost, and Budget Anxiety
Token math is the most useful spreadsheet skill in AI. Five minutes here saves five-figure surprises later.
The formula
monthly_cost = (avg_input_tokens × in_price + avg_output_tokens × out_price)
× requests_per_user × users × 30
/ 1,000,000
A realistic example
- 1,000 daily users
- 5 requests/user/day
- 800 input + 400 output tokens per request
- Model: Llama 3.3 70B on Groq (
$0 / $0— free tier) → $0 - Same workload on GPT-4o (
$2.50 / $10) → $510/month
That's the open-source-first thesis in one calculation.
Tactics to reduce cost
- Use the smallest model that passes your evals — not the biggest you can afford.
- Cache aggressively. Anthropic and Google both support prompt caching now.
- Shorten outputs. "Reply in 50 words" is real money.
- Truncate context. Don't pass the entire conversation forever — summarise older turns.
Try the Cost Calculator in LLMAtlas with your real numbers — it does this math live across 20+ models.