What is Chain-of-Thought Prompting?
Chain-of-thought (CoT) prompting is the technique of asking a language model to produce step-by-step reasoning before its final answer, rather than jumping straight to a conclusion. Models given room to "think out loud" reach correct answers on math, logic, and multi-step reasoning problems at much higher rates than models prompted to answer immediately.
How chain-of-thought works
The technique is structurally simple: append a phrase such as "Let's think step by step" or "Think through this carefully before answering" to a prompt, and the model produces intermediate reasoning before its conclusion. Wei et al. (Google, 2022) formalized the approach using few-shot worked examples, and Kojima et al. (2022) showed that the trigger phrase alone often suffices. Many modern models are post-trained to reason step by step by default on hard problems.
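The append-a-trigger pattern described above can be sketched in a few lines. This is a minimal illustration, not any library's API: the function names and the "Q:/A:" layout are assumptions chosen for the example, and the resulting string would be passed to whatever model client is in use.

```python
# Minimal sketch of zero-shot CoT prompt construction.
# The helper names and the "Q:/A:" layout are illustrative assumptions.

COT_TRIGGER = "Let's think step by step."


def direct_prompt(question: str) -> str:
    """Prompt that asks for the answer immediately."""
    return f"Q: {question}\nA:"


def zero_shot_cot_prompt(question: str, trigger: str = COT_TRIGGER) -> str:
    """Same question, with a reasoning trigger placed at the start of the answer."""
    return f"Q: {question}\nA: {trigger}"


question = "A shop sells pens at 3 for $2. How much do 12 pens cost?"
print(direct_prompt(question))
print(zero_shot_cot_prompt(question))
```

The only difference between the two prompts is the trailing trigger, which is what makes zero-shot CoT cheap to try against an existing prompt.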
Variants:
- Zero-shot CoT: just ask for step-by-step reasoning, no examples
- Few-shot CoT: show the model worked examples, then ask the new question
- Self-consistency CoT: sample multiple reasoning chains, take the majority answer
- Tree-of-thoughts: explore multiple reasoning branches in parallel, prune low-quality ones
- Reasoning models: OpenAI o1, Claude with extended thinking, and DeepSeek-R1 perform CoT internally before producing visible output
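Of the variants above, self-consistency is the easiest to show in code: sample several chains, extract each final answer, and take the majority. The sketch below assumes each chain ends with a line of the form "Answer: <value>" and that `sample_chain` is a hypothetical callable wrapping a model call at nonzero temperature; here it is replaced by a canned stub so the example runs offline.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(chain: str) -> Optional[str]:
    """Return the value after a trailing 'Answer:' marker, or None if absent."""
    match = re.search(r"Answer:\s*(.+?)\s*$", chain.strip())
    return match.group(1) if match else None


def self_consistent_answer(
    sample_chain: Callable[[str], str], prompt: str, n: int = 5
) -> Optional[str]:
    """Sample n reasoning chains and return the most common final answer."""
    answers = [extract_answer(sample_chain(prompt)) for _ in range(n)]
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None


# Stand-in for a real model call (assumption): three canned chains, one wrong.
canned = iter([
    "12 / 3 = 4 groups, and 4 * 2 = 8.\nAnswer: 8",
    "12 * 2 = 24.\nAnswer: 24",
    "Four groups of three pens at $2 each.\nAnswer: 8",
])


def fake_sampler(prompt: str) -> str:
    return next(canned)


print(self_consistent_answer(fake_sampler, "How much do 12 pens cost?", n=3))  # prints 8
```

Voting on the extracted answer rather than on the full chain is the point of the design: chains that reason differently but agree on the result reinforce each other.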
Why CoT matters
Reasoning-heavy benchmarks such as GSM8K (grade-school math) and BIG-Bench Hard saw gains of roughly 10 to 50 percentage points from CoT prompting when the technique was introduced. More recently, reasoning models that generate long internal chains of thought before answering have produced capability jumps that scaling alone had not delivered.
Security implications
Chain-of-thought introduces specific attack surfaces that don't exist for direct-answer models:
- Reasoning-step manipulation. Adversarial prompts that target the intermediate reasoning, not just the final answer, can cause the model to convince itself of wrong conclusions through guided reasoning chains.
- Hidden-reasoning information leakage. Reasoning models that produce an internal chain of thought can place sensitive information in reasoning steps that are stripped from the final response but still persist in logs and traces.
- Reasoning-content jailbreaks. Asking the model to "reason about" a harmful task can bypass refusal training that targets direct requests: "Walk me through how someone might..." tends to succeed more often than "tell me how to..."
- Token-budget exhaustion. Reasoning models with long chains can be coerced into spending their reasoning budget on attacker-chosen problems, a denial-of-service pattern that also inflates latency and cost.
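One possible mitigation for the token-budget exhaustion item above is to cap reasoning tokens both per request and per caller. The sketch below is an assumption-laden illustration: `ReasoningBudget`, its field names, and the cap values are all hypothetical, and the granted number would be passed to whatever token-limit parameter the model API in use exposes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReasoningBudget:
    """Hypothetical tracker that caps reasoning tokens per request and per caller."""

    per_request_cap: int = 2000
    per_caller_cap: int = 20000
    spent: Dict[str, int] = field(default_factory=dict)

    def grant(self, caller: str, requested: int) -> int:
        """Return how many reasoning tokens this request may use (0 means refuse)."""
        remaining = self.per_caller_cap - self.spent.get(caller, 0)
        return max(0, min(requested, self.per_request_cap, remaining))

    def record(self, caller: str, used: int) -> None:
        """Add the tokens actually consumed to the caller's running total."""
        self.spent[caller] = self.spent.get(caller, 0) + used


budget = ReasoningBudget(per_request_cap=2000, per_caller_cap=5000)
for _ in range(4):
    allowed = budget.grant("user-1", requested=10000)
    budget.record("user-1", allowed)
    print(allowed)  # prints 2000, 2000, 1000, 0 on successive iterations
```

A real deployment would also need the totals to reset over a time window; that is omitted here to keep the sketch short.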
Practical takeaways
- For complex tasks, CoT improves quality but increases latency and cost (more tokens generated)
- Hide internal reasoning from end users when it might leak system prompts or sensitive context, but log it for audit
- When adversarially testing deployments that use CoT, probe both direct-answer and step-by-step phrasings, since the two can trigger refusals differently
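The hide-but-log takeaway above can be sketched as a small post-processing step. This assumes, purely for illustration, that the raw model output wraps its reasoning in `<reasoning>...</reasoning>` tags; real models and APIs expose reasoning in different ways, and the logger name and function names here are hypothetical.

```python
import logging
import re
from typing import Tuple

# Handler and storage configuration for the audit log is left to the deployment.
audit_log = logging.getLogger("cot_audit")

# Assumed output format: reasoning wrapped in <reasoning> tags.
REASONING_BLOCK = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)


def split_reasoning(raw: str) -> Tuple[str, str]:
    """Separate tagged reasoning from the user-visible remainder."""
    reasoning = "\n".join(part.strip() for part in REASONING_BLOCK.findall(raw))
    visible = REASONING_BLOCK.sub("", raw).strip()
    return reasoning, visible


def respond(raw: str, request_id: str) -> str:
    """Log the reasoning for audit and return only the visible answer."""
    reasoning, visible = split_reasoning(raw)
    audit_log.info("request=%s reasoning=%s", request_id, reasoning)
    return visible


raw_output = "<reasoning>The system prompt says X, so...</reasoning>\nThe answer is 8."
print(respond(raw_output, request_id="req-001"))  # prints: The answer is 8.
```

Because the stripped reasoning still lands in the audit log, that log inherits the sensitivity of anything the model reasoned about and should be access-controlled accordingly.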