⌘K

Foundations

Beginner

7 min read

Tokens, Cost, and Budget Anxiety

Demystify pricing. Build a back-of-napkin cost model in 5 minutes.

Tokens, Cost, and Budget Anxiety

Token math is the most useful spreadsheet skill in AI. Five minutes here saves five-figure surprises later.

The formula

monthly_cost = (avg_input_tokens × in_price + avg_output_tokens × out_price)
               × requests_per_user × users × 30
               / 1,000,000

A realistic example

1,000 daily users
5 requests/user/day
800 input + 400 output tokens per request
Model: Llama 3.3 70B on Groq ($0 / $0 — free tier) → $0
Same workload on GPT-4o ($2.50 / $10) → $510/month

That's the open-source-first thesis in one calculation.

Tactics to reduce cost

Use the smallest model that passes your evals — not the biggest you can afford.
Cache aggressively. Anthropic and Google both support prompt caching now.
Shorten outputs. "Reply in 50 words" is real money.
Truncate context. Don't pass the entire conversation forever — summarise older turns.

Try the Cost Calculator in LLMAtlas with your real numbers — it does this math live across 20+ models.

Next: The Open-Source AI Stack

Try it in the Playground

Apply what you learned with a free model.

Browse all lessons

9 lessons across 5 categories.

LLMAtlas — The Open Ecosystem Workspace for LLMs