Tokens, Cost, and Budget Anxiety

Demystify pricing. Build a back-of-napkin cost model in 5 minutes.

Token math is the most useful spreadsheet skill in AI. Five minutes here saves five-figure surprises later.

The formula

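monthly_cost = (avg_input_tokens × in_price + avg_output_tokens × out_price)
               × requests_per_user × users × 30
               / 1,000,000

Here in_price and out_price are in USD per 1M tokens (hence the division by 1,000,000), requests_per_user is per day, and 30 is the number of days in a month.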
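
The same formula can be written as a small Python function, which is handy once you want to try more than one scenario. This is a minimal sketch: the function name, the `days` parameter, and the sample numbers in the usage line are illustrative assumptions, and prices are taken to be USD per 1M tokens.

```python
def monthly_cost(avg_input_tokens, avg_output_tokens, in_price, out_price,
                 requests_per_user, users, days=30):
    """Estimate monthly spend in USD.

    in_price and out_price are USD per 1M tokens (assumed unit);
    requests_per_user is the number of requests per user per day.
    """
    # Cost of one request, expressed in millionths of a dollar.
    per_request_micro_usd = (avg_input_tokens * in_price
                             + avg_output_tokens * out_price)
    # Scale by daily volume and days per month, then convert to dollars.
    return per_request_micro_usd * requests_per_user * users * days / 1_000_000


# Illustrative numbers only: 200 users, 10 requests/user/day,
# 1,000 input + 500 output tokens, priced at $1 / $2 per 1M tokens.
print(monthly_cost(1000, 500, 1.00, 2.00, requests_per_user=10, users=200))
# prints 120.0
```

Keeping the division by 1,000,000 as the last step mirrors the formula and keeps the intermediate numbers easy to check by hand.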

A realistic example

  • 1,000 daily users
  • 5 requests/user/day
  • 800 input + 400 output tokens per request
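  • Model: Llama 3.3 70B on Groq ($0 / $0 per 1M input / output tokens on the free tier) → $0/month
  • Same workload on GPT-4o ($2.50 / $10 per 1M input / output tokens) → (800 × 2.50 + 400 × 10) × 5 × 1,000 × 30 / 1,000,000 = $900/month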

That's the open-source-first thesis in one calculation.
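
To see where the money goes, the example workload can be split into input and output cost per model. The sketch below plugs the listed workload and prices into the formula; the variable names and the `breakdown` helper are illustrative, and the prices are the ones quoted in the list, which may not match current provider pricing.

```python
# Workload from the example: 1,000 daily users, 5 requests each,
# 800 input + 400 output tokens per request.
users = 1_000
requests_per_user_per_day = 5
input_tokens, output_tokens = 800, 400

# (input, output) prices in USD per 1M tokens, as quoted in the example.
prices = {
    "Llama 3.3 70B on Groq (free tier)": (0.00, 0.00),
    "GPT-4o": (2.50, 10.00),
}

requests_per_month = users * requests_per_user_per_day * 30  # 150,000


def breakdown(in_price, out_price):
    """Return (monthly input cost, monthly output cost) in USD."""
    input_cost = input_tokens * in_price * requests_per_month / 1_000_000
    output_cost = output_tokens * out_price * requests_per_month / 1_000_000
    return input_cost, output_cost


for model, (in_price, out_price) in prices.items():
    input_cost, output_cost = breakdown(in_price, out_price)
    total = input_cost + output_cost
    print(f"{model}: input ${input_cost:,.0f} + output ${output_cost:,.0f}"
          f" = ${total:,.0f}/month")
# Llama 3.3 70B on Groq (free tier): input $0 + output $0 = $0/month
# GPT-4o: input $300 + output $600 = $900/month
```

On the paid model, output tokens are a third of the volume but two thirds of the bill, which is why output length is worth watching.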

Tactics to reduce cost

  1. Use the smallest model that passes your evals — not the biggest you can afford.
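  2. Cache aggressively. Anthropic and Google both support prompt caching, which bills a repeated prompt prefix at a reduced rate.
  3. Shorten outputs. Output tokens usually cost more than input tokens (4× on GPT-4o above), so "Reply in 50 words" is real money.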
  4. Truncate context. Don't pass the entire conversation forever — summarise older turns.
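
Tactic 4 can be sketched in a few lines: keep the newest turns that fit a token budget and collapse everything older into one summary line. This is a rough illustration, not a production implementation. `estimate_tokens` uses an assumed heuristic of about 4 characters per token, and `summarise` is a hypothetical placeholder where a real system would call a small, cheap model.

```python
def estimate_tokens(text):
    # Assumption: roughly 4 characters per token for English text.
    # Use your provider's tokenizer for real numbers.
    return max(1, len(text) // 4)


def summarise(turns):
    # Hypothetical placeholder: a real system would ask a small model
    # to condense these turns into a short paragraph.
    return f"[Summary of {len(turns)} earlier turns]"


def trim_history(turns, budget_tokens):
    """Keep the newest turns that fit in budget_tokens; summarise the rest.

    The summary line itself is not counted against the budget here.
    """
    kept = []
    used = 0
    # Walk backwards from the most recent turn until the budget is spent.
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if used + cost > budget_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order

    older = turns[: len(turns) - len(kept)]
    if older:
        return [summarise(older)] + kept
    return kept


history = ["a" * 40, "b" * 40, "c" * 40, "d" * 40]  # about 10 tokens each
print(trim_history(history, budget_tokens=25))
# keeps the last two turns and replaces the first two with one summary line
```

Because input tokens are billed on every request, capping the history this way keeps per-request cost roughly flat instead of growing with conversation length.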

Try the Cost Calculator in LLMAtlas with your real numbers — it does this math live across 20+ models.

LLMAtlas — The Open Ecosystem Workspace for LLMs