Advanced Reasoning Techniques
Learn how to use advanced reasoning prompting techniques to guide an AI through complex problem-solving.
LLMs are trained to predict the next token, a process that enables them to generate text efficiently and fluently. For many tasks, this behavior is an advantage. For problems that require multi-step reasoning, it can become a limitation because the model may generate a plausible but incorrect conclusion. Consider the following classic riddle:
Riddle: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?
A typical fast, intuitive response from the model is $0.10. This seems correct at first glance, but it is incorrect. If the ball cost $0.10, the bat would cost $1.10, and the total would be $1.20. This simple problem highlights an important point: for complex reasoning, we need to prompt the model to work through the steps explicitly. Next, we will look at how to do that in practice.
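The arithmetic is easy to verify directly. The sketch below (plain Python, no model involved) encodes the riddle's two constraints and checks both the intuitive answer and the correct one:

```python
def satisfies_riddle(ball: float, bat: float) -> bool:
    """Check both constraints: the total is $1.10 and the bat costs $1.00 more."""
    total_ok = abs((bat + ball) - 1.10) < 1e-9
    difference_ok = abs((bat - ball) - 1.00) < 1e-9
    return total_ok and difference_ok

# Intuitive answer: ball = $0.10. The bat must then cost $1.10,
# which makes the total $1.20 and violates the first constraint.
print(satisfies_riddle(0.10, 1.10))  # False

# Correct answer: ball = $0.05, bat = $1.05.
print(satisfies_riddle(0.05, 1.05))  # True
```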
Chain-of-thought (CoT) prompting
The single most important breakthrough in LLM reasoning is a technique called chain-of-thought (CoT) prompting. It is a foundational technique that many advanced methods extend or complement.
Chain-of-thought is a prompting technique that encourages an LLM to generate a series of intermediate, coherent reasoning steps that lead to the final answer to a multi-step problem. Instead of producing a final answer directly, the model first externalizes its thinking process. This mimics the human approach of breaking down a complex problem into smaller, more manageable parts, which significantly reduces the likelihood of reasoning errors.
This technique is so effective because it aligns the model’s generation process with a logical reasoning path. Each step in the chain provides context for the next, guiding the model toward a correct conclusion. There are two primary ways to elicit this behavior from a model:
Zero-shot CoT
Few-shot CoT
Zero-shot CoT
The simplest way to implement CoT is through a zero-shot approach. This method relies on the fact that modern, large-scale models have been trained on vast amounts of text that includes step-by-step problem-solving. As a result, they have a latent, built-in capability for reasoning that can be unlocked with a simple trigger phrase.
The technique is straightforward: after stating the problem, we add a phrase that prompts the model to show its reasoning, such as “Let’s think step by step.”
Let’s apply this directly to our introductory problem.
Prompt: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost? Let's think step by step.
Output:
1. Let B be the cost of the bat and C be the cost of the ball.
2. The problem gives us two pieces of information that we can turn into equations:
   a) The total cost is $1.10, so: B + C = 1.10
   b) The bat costs $1.00 more than the ball, so: B = C + 1.00
3. Now I can solve this system of equations. I will substitute the second equation into the first one.
4. This gives me: (C + 1.00) + C = 1.10
5. Combining the terms with C, I get: 2C + 1.00 = 1.10
6. Subtracting 1.00 from both sides gives: 2C = 0.10
7. Finally, dividing by 2 gives: C = 0.05

Therefore, the ball costs $0.05.
By adding that one simple phrase, we transformed the prompt from a simple question into a request for a reasoned process, guiding the model away from its incorrect intuitive guess and toward the correct answer.
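In code, zero-shot CoT amounts to appending the trigger phrase to the question before sending it to the model. The sketch below is a minimal illustration; `call_model` is a hypothetical stand-in for whatever LLM client you actually use, not a real API:

```python
COT_TRIGGER = "Let's think step by step."

def zero_shot_cot_prompt(question: str) -> str:
    """Append the chain-of-thought trigger phrase to a question."""
    return f"{question.strip()} {COT_TRIGGER}"

prompt = zero_shot_cot_prompt(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)
# response = call_model(prompt)  # hypothetical LLM client call
print(prompt)
```

The trigger phrase works as a suffix on almost any question, which is what makes zero-shot CoT so cheap to apply: no examples need to be written or maintained.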
Few-shot CoT
While zero-shot CoT is effective, it depends on the model having learned how to break down that type of problem during training. For novel or more complex tasks, few-shot CoT provides a more reliable way to guide the model’s reasoning.
In this technique, we provide one or more examples within the prompt that demonstrate the desired reasoning pattern. This gives the model a clear blueprint to follow. It ...
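To make the blueprint concrete, a few-shot CoT prompt can be assembled from worked (question, reasoning, answer) triples followed by the new question. The layout and the worked example below are illustrative assumptions, not a prescribed format:

```python
def few_shot_cot_prompt(examples, question):
    """Build a few-shot CoT prompt from worked examples.

    Each example is a (question, reasoning, answer) triple whose
    reasoning demonstrates the step-by-step pattern to imitate.
    """
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")  # the model continues from here
    return "\n\n".join(parts)

# Hypothetical worked example demonstrating the reasoning pattern.
examples = [
    ("A notebook and a pen cost $3.00 in total. The notebook costs "
     "$2.00 more than the pen. How much does the pen cost?",
     "Let P be the pen's cost. Then P + (P + 2.00) = 3.00, "
     "so 2P = 1.00 and P = 0.50.",
     "$0.50"),
]
prompt = few_shot_cot_prompt(
    examples,
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?",
)
print(prompt)
```

Ending the prompt with a bare `A:` invites the model to continue in the same question-reasoning-answer pattern as the examples before it.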