Chain-of-Thought (CoT)
A prompting technique that encourages the model to show its reasoning step by step before arriving at a final answer, significantly improving performance on complex reasoning, math, and logic tasks.
Chain-of-thought prompting asks a model to "think out loud": to explicitly write out intermediate reasoning steps rather than jumping directly to an answer. Instead of simply asking "What is 23 times 47?", a chain-of-thought prompt might say "Solve this step by step: What is 23 times 47?" The model then breaks the problem into parts (23 × 40 = 920, 23 × 7 = 161, 920 + 161 = 1081), making arithmetic slips less likely and the final answer easier to verify.
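The difference between the two prompts can be sketched in a few lines. This is a minimal illustration, not a fixed template: the `cot_prompt` helper name and the instruction wording are assumptions for the example.

```python
def cot_prompt(question: str) -> str:
    """Wrap a question in a step-by-step instruction before sending it to a model."""
    return f"Solve this step by step, showing your work:\n{question}"

# Direct prompt vs. chain-of-thought prompt for the same question:
print("What is 23 times 47?")
print(cot_prompt("What is 23 times 47?"))

# The intermediate steps from the worked example can be checked directly:
partial_products = [23 * 40, 23 * 7]       # 920 and 161
assert sum(partial_products) == 23 * 47 == 1081
```

Because each intermediate step is explicit, a reader (or a verifier script like the assertion above) can audit the reasoning rather than trusting a bare final number.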
This technique was formally introduced by Google researchers in 2022 and has become one of the most impactful prompting innovations. On complex reasoning benchmarks, chain-of-thought prompting can improve accuracy by 10-40% compared to direct prompting. The gains are especially dramatic for math word problems, multi-step logic, code debugging, and any task where intermediate reasoning is necessary. The improvement tends to be larger for more capable models — smaller models sometimes generate plausible-sounding but incorrect reasoning chains.
Modern frontier models have internalized chain-of-thought reasoning to varying degrees. OpenAI's o1 and o3 models explicitly use extended "thinking" before responding, producing responses that are more accurate on hard tasks. Anthropic's Claude models respond well to "think step by step" instructions. Some models are trained with chain-of-thought data, making them naturally more methodical even without explicit prompting.
Variations of chain-of-thought include tree-of-thought (exploring multiple reasoning paths and selecting the best), self-consistency (generating multiple chains and voting on the answer), and least-to-most prompting (decomposing complex problems into simpler subproblems). The tradeoff is that chain-of-thought increases output length and therefore cost and latency. For simple tasks, the overhead is not worth it, but for anything requiring multi-step reasoning, it is often essential for reliable performance.
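Of these variations, self-consistency is the simplest to sketch: sample several independent reasoning chains and take a majority vote over their final answers. The snippet below is a minimal illustration under stated assumptions; `self_consistency` and `fake_generate` are hypothetical names, and the stub stands in for a real model call sampled at temperature > 0.

```python
from collections import Counter
from itertools import cycle

def self_consistency(generate, question, n=5):
    """Sample n reasoning chains and return the majority-vote answer.

    `generate` is assumed to return a (chain, answer) pair; with a real
    model it would sample at temperature > 0 so the chains differ.
    """
    answers = [generate(question)[1] for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Deterministic stub: four chains reach 1081, one slips to 1071.
_stub_answers = cycle(["1081", "1081", "1071", "1081", "1081"])

def fake_generate(question):
    return ("...intermediate reasoning...", next(_stub_answers))

print(self_consistency(fake_generate, "What is 23 times 47?"))  # 1081
```

The vote filters out occasional faulty chains at the cost of n model calls, which is the same cost-versus-reliability tradeoff described above.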