Zero-shot, Few-shot & Chain-of-Thought

Three prompting modes, each with a sweet spot.

A prompt can include zero, one, or many examples. It can ask for a direct answer or for a step-by-step reasoning trace. These choices have names, and they strongly affect output quality.

Zero-shot

You just ask. No examples, no reasoning scaffold:

Classify the sentiment: "The pizza was cold but the staff was lovely."

Zero-shot works when the task is common in training data (sentiment, summarisation, translation, simple Q&A). It tends to fail when the task is unusual or requires a specific output format the model is unlikely to have seen.

Few-shot

You give 2–5 worked examples before the real one:

Classify the sentiment: "The pizza was burnt." → negative
Classify the sentiment: "Great service, fast delivery." → positive
Classify the sentiment: "Food was fine, nothing special." → neutral
Classify the sentiment: "The pizza was cold but the staff was lovely." →

Few-shot teaches by demonstration. It's the most reliable way to nail a specific output format. The examples also shape edge cases: showing a "neutral" example makes the model less prone to forcing every input into positive/negative.

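Quality beats quantity. Five diverse, high-quality examples will usually outperform fifty mediocre ones. Cover edge cases, and avoid skewed example sets (don't make all examples positive, or all short), because the model tends to copy whatever pattern the examples share.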
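The few-shot layout above can be assembled programmatically. The following is a minimal sketch, not a prescribed implementation: the function name `build_few_shot_prompt` and its parameters are hypothetical, and it only reproduces the line format used in the sentiment example.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    query: str,
    instruction: str = "Classify the sentiment:",
) -> str:
    """Format labelled examples, then the unlabelled query, one per line."""
    lines = [f'{instruction} "{text}" → {label}' for text, label in examples]
    # The last line ends with the arrow so the model completes the label.
    lines.append(f'{instruction} "{query}" →')
    return "\n".join(lines)


# The three demonstrations from the lesson, including a "neutral" case.
examples = [
    ("The pizza was burnt.", "negative"),
    ("Great service, fast delivery.", "positive"),
    ("Food was fine, nothing special.", "neutral"),
]
prompt = build_few_shot_prompt(
    examples, "The pizza was cold but the staff was lovely."
)
print(prompt)
```

Passing an empty example list yields a prompt with no demonstrations, so the same helper can be used to compare zero-shot and few-shot variants of one task.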

Chain-of-thought (CoT)

Ask the model to show its work:

Q: A bat and a ball cost $1.10 together. The bat costs $1 more than the ball.
   How much does the ball cost? Think step by step.
A: Let the ball cost x. Then the bat costs x + 1.
   Total: x + (x + 1) = 1.10 → 2x = 0.10 → x = $0.05.

For non-reasoning models, CoT often substantially improves accuracy on math, logic, and multi-step problems. The intermediate tokens act as a scratchpad: the model conditions each new step on its own previous reasoning.

For reasoning models (DeepSeek R1, o3, QwQ), CoT happens internally and you usually shouldn't ask for it explicitly. They've been RL-trained to do it; prompting "think step by step" can sometimes hurt.
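One way to apply this distinction in code is to add the explicit cue only when the target is a non-reasoning model. This is a rough sketch under assumptions: `build_question_prompt` and the `reasoning_model` flag are hypothetical names, and deciding which models count as reasoning models is left to the caller.

```python
COT_CUE = "Think step by step."


def build_question_prompt(question: str, reasoning_model: bool) -> str:
    """Append an explicit chain-of-thought cue only for non-reasoning models."""
    if reasoning_model:
        # Reasoning models already reason internally, so the cue is omitted.
        return f"Q: {question}\nA:"
    return f"Q: {question} {COT_CUE}\nA:"


question = (
    "A bat and a ball cost $1.10 together. The bat costs $1 more than "
    "the ball. How much does the ball cost?"
)
print(build_question_prompt(question, reasoning_model=False))
print(build_question_prompt(question, reasoning_model=True))
```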

Combining them

The most powerful pattern is few-shot + CoT: give 2–3 examples where each example shows the reasoning explicitly. The model learns both the format and the thinking style.

Q: ... A: First, ... Then, ... So the answer is ...
Q: ... A:

This combination is a workhorse of high-stakes prompting. It is also a common way for labs such as Anthropic, OpenAI, and Google to demonstrate complex agent behaviours in research.
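The few-shot + CoT template can be sketched as a small builder in which each demonstration carries its reasoning as well as its answer. The names `WorkedExample` and `build_few_shot_cot_prompt` are hypothetical, and the pen-and-notebook example is made up for illustration; only the bat-and-ball example comes from the text above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class WorkedExample:
    question: str
    reasoning: str  # the explicit intermediate steps
    answer: str


def build_few_shot_cot_prompt(
    examples: Sequence[WorkedExample], question: str
) -> str:
    """Render worked examples with their reasoning, then the open question."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning} So the answer is {ex.answer}."
        for ex in examples
    ]
    # The final block stops at "A:" so the model continues in the same style.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


worked = [
    WorkedExample(
        question="A bat and a ball cost $1.10 together. The bat costs $1 "
        "more than the ball. How much does the ball cost?",
        reasoning="Let the ball cost x. Then the bat costs x + 1. "
        "Total: x + (x + 1) = 1.10, so 2x = 0.10 and x = 0.05.",
        answer="$0.05",
    ),
    WorkedExample(
        question="A pen and a notebook cost $3.00 together. The notebook "
        "costs $2 more than the pen. How much does the pen cost?",
        reasoning="Let the pen cost x. Then the notebook costs x + 2. "
        "Total: x + (x + 2) = 3.00, so 2x = 1.00 and x = 0.50.",
        answer="$0.50",
    ),
]
cot_prompt = build_few_shot_cot_prompt(
    worked,
    "A cup and a saucer cost $5.00 together. The cup costs $3 more than "
    "the saucer. How much does the saucer cost?",
)
print(cot_prompt)
```

Keeping the reasoning and the answer in separate fields makes it easier to hold the "So the answer is ..." ending consistent across examples, which is the part of the format the model is meant to imitate.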

Diminishing returns

After about 5 examples, each new example helps less. Past 10, the added token cost often outweighs the gain. If you need many examples, you probably want fine-tuning, not prompting.
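A simple guard against this is to cap both the number of examples and their total size before building the prompt. The sketch below is illustrative only: `estimate_tokens` uses a rough four-characters-per-token heuristic, which is an assumption rather than a real tokenizer, and the default limits are placeholders, not recommendations.

```python
from typing import List, Sequence


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token for English text.
    # A real tokenizer would give a more accurate count.
    return max(1, len(text) // 4)


def select_examples(
    examples: Sequence[str], max_examples: int = 5, token_budget: int = 200
) -> List[str]:
    """Keep examples in order until either the count or the budget is hit."""
    chosen: List[str] = []
    used = 0
    for example in examples:
        cost = estimate_tokens(example)
        if len(chosen) >= max_examples or used + cost > token_budget:
            break
        chosen.append(example)
        used += cost
    return chosen


candidates = ["x" * 40] * 10  # ten dummy examples, about 10 tokens each
print(len(select_examples(candidates)))                   # capped by count
print(len(select_examples(candidates, token_budget=25)))  # capped by budget
```

Because examples are taken in order, the caller should put the most diverse or most important demonstrations first.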

Knowledge Check

Q1. Why does chain-of-thought improve performance on math problems for non-reasoning models?

Q2. When does few-shot prompting help most?

Q3. For reasoning models like DeepSeek R1, what's the recommendation about CoT?

LLMAtlas — The Open Ecosystem Workspace for LLMs