The Modern Model Zoo
GPT, Claude, Llama, Gemini, DeepSeek, Mistral, Qwen — who's who in 2026.
The LLM landscape splits roughly into three camps. Knowing which model is which — and what it's optimised for — saves you money and shipping headaches.
The closed frontier (proprietary)
These labs control their weights and serve via API only:
- OpenAI — GPT-4.1, GPT-4o, o3-mini, o3. Generalist powerhouse, frontier reasoning, multimodal (text + image + audio + video). Premium price.
- Anthropic — Claude Opus 4, Claude Sonnet 4, Claude Haiku 4.5. Widely regarded as among the strongest on coding and long-context reasoning. Aligned with Constitutional AI.
- Google DeepMind — Gemini 2.5 Pro, Gemini 2.5 Flash. Very large native context (1M tokens), excellent multimodal, generous free tier via AI Studio.
- xAI — Grok 3, Grok 4. Real-time X data, looser content policies.
The open-weight champions
Weights are downloadable. You can self-host, fine-tune, or call a low-cost hosted API:
- Meta Llama — Llama 4 Scout (16-expert MoE, up to 10M context as announced), Llama 4 Maverick (128-expert MoE, 1M context). The default open foundation model.
- DeepSeek — DeepSeek V3, DeepSeek R1 (reasoning). Chinese lab, frontier-quality open weights, extraordinary value.
- Mistral — Mistral Large 2, Mistral Small 3.1, Codestral. The leading European alternative, with good code and multilingual performance.
- Alibaba Qwen — Qwen3 235B, Qwen3 32B, QwQ. Top multilingual + reasoning open models.
- Google Gemma — Gemma 3 27B, 12B, 4B. Open-weight Gemini cousins.
- Microsoft Phi — Phi-4. Tiny models trained on heavily curated synthetic data.
The reasoning specialists
A new category that emerged in late 2024 and went mainstream in 2025: models trained with extensive RL to think before answering, often exposing a chain-of-thought or a summary of it:
- OpenAI o3, DeepSeek R1, Qwen QwQ, Google Gemini Thinking
These are slower and more expensive per query, but blow past standard models on math, code, and multi-step problems.
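To make the "more expensive per query" point concrete, here is a minimal cost sketch. It assumes that thinking tokens are billed at the output-token rate, which is a common arrangement but worth checking against your provider's pricing page. The prices and token counts are hypothetical placeholders, not real quotes, and `query_cost` is an illustrative helper, not part of any SDK.

```python
def query_cost(
    input_tokens: int,
    output_tokens: int,
    thinking_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Return the cost in dollars of one query.

    Assumption: hidden or visible thinking tokens are billed like output tokens.
    """
    billed_output = output_tokens + thinking_tokens
    total = input_tokens * price_in_per_million + billed_output * price_out_per_million
    return total / 1_000_000


# Hypothetical prices (dollars per million tokens), chosen only for illustration.
PRICE_IN = 1.0
PRICE_OUT = 4.0

# Same prompt and same final answer length; only the thinking budget differs.
standard = query_cost(1_000, 500, 0, PRICE_IN, PRICE_OUT)
reasoning = query_cost(1_000, 500, 4_000, PRICE_IN, PRICE_OUT)

print(f"standard:  ${standard:.4f}")
print(f"reasoning: ${reasoning:.4f}")
print(f"ratio:     {reasoning / standard:.1f}x")
```

Even at identical per-token prices, the thinking tokens dominate the bill in this example, which is why reasoning models are best reserved for problems that actually need them.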
How to choose
A rough heuristic:
| Need | Default choice |
|---|---|
| Cheap general chat | Llama 3.3 70B or Gemini 2.5 Flash (free tier) |
| Best code | Claude Sonnet 4 or Qwen 2.5 Coder 32B (open weights) |
| Hard reasoning | DeepSeek R1 (open weights) or o3 (paid) |
| Long documents | Gemini 2.5 Pro (1M ctx) or Llama 4 Maverick |
| Self-hosted | Llama 4 Scout or DeepSeek V3 |
| Vision | Claude 4, Gemini 2.5, or Llama 3.2 Vision |
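If you want this heuristic in code, for example as a default in a routing layer, the table can be sketched as a plain lookup. The need keys, the `Candidate` type, and `pick_models` are hypothetical names invented for this example. The open-weights flags follow the camps described earlier in this lesson.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    open_weights: bool  # True if the weights are downloadable


# Default picks copied from the "How to choose" table.
# The dictionary keys are hypothetical labels, not an official taxonomy.
DEFAULTS: dict[str, list[Candidate]] = {
    "cheap_chat": [Candidate("Llama 3.3 70B", True), Candidate("Gemini 2.5 Flash", False)],
    "code": [Candidate("Claude Sonnet 4", False), Candidate("Qwen 2.5 Coder 32B", True)],
    "hard_reasoning": [Candidate("DeepSeek R1", True), Candidate("o3", False)],
    "long_documents": [Candidate("Gemini 2.5 Pro", False), Candidate("Llama 4 Maverick", True)],
    "self_hosted": [Candidate("Llama 4 Scout", True), Candidate("DeepSeek V3", True)],
    "vision": [
        Candidate("Claude 4", False),
        Candidate("Gemini 2.5", False),
        Candidate("Llama 3.2 Vision", True),
    ],
}


def pick_models(need: str, open_weights_only: bool = False) -> list[str]:
    """Return the default model names for a need, optionally open-weight only."""
    if need not in DEFAULTS:
        raise ValueError(f"unknown need {need!r}; expected one of {sorted(DEFAULTS)}")
    return [
        c.name
        for c in DEFAULTS[need]
        if c.open_weights or not open_weights_only
    ]


print(pick_models("hard_reasoning"))
print(pick_models("vision", open_weights_only=True))
```

Keeping the mapping as data rather than branching logic makes it easy to update when a new model release changes the defaults.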
You'll go deeper on model selection in Level 4. For now: there's no single "best" model. Match the model to the job.