System vs User vs Assistant Roles
Chat APIs aren't just messages — they have roles that change behaviour.
When you call a chat completion API, you don't just send a string. You send an array of messages, each with a role: system, user, or assistant. The roles matter — they're not labels, they're behavioural switches.
The three roles
[
{ "role": "system", "content": "You are a terse assistant. Reply in under 30 words." },
{ "role": "user", "content": "Explain RAG." },
{ "role": "assistant", "content": "RAG retrieves relevant documents, then asks the LLM to answer using them as context." },
{ "role": "user", "content": "Give an example." }
]
- system — instructions the model treats as persistent rules. Style, persona, constraints, safety policies.
- user — the human's input.
- assistant — past model responses, included to maintain conversation history.
Why the system role is special
System messages have higher precedence than user messages in the model's instruction hierarchy. If a user message says "ignore previous instructions," a well-aligned model will defer to the system message.
This matters for production: put your core rules in system, never in user. A user can override anything in their own message; they can't easily override a system message.
Multi-turn conversations
To maintain a conversation, you must resend the entire message history every turn. The model is stateless — it has no memory of previous calls. Every request includes everything that's been said so far.
This has cost implications. A 50-turn conversation re-sends 49 turns of context every time. Use prompt caching (Level 4) to amortise this.
Common patterns
- One-shot tasks —
[system, user]only. No history needed. - Chatbots —
[system, user1, assistant1, user2, assistant2, ...]— full rolling history. - Agents —
[system, user, assistant(tool_call), tool(result), assistant, ...]— interleaved tool calls (Level 3).
The system prompt as a contract
A good system prompt is a tight contract:
- Identity — "You are a customer support assistant for Acme Corp."
- Capabilities — "You can answer questions about our products, pricing, and shipping."
- Constraints — "Do not discuss competitors. Do not promise discounts."
- Tone — "Friendly, professional, concise. Maximum 3 sentences per reply."
- Escalation rules — "If the user asks about a refund, respond with: 'Let me connect you to a human agent.'"
These five blocks cover 90% of production system prompts. Most production bugs come from missing one of them.