Extended Thinking and Prompt Injection Defense
Explore how to enhance Claude's reasoning with extended thinking and safeguard your AI systems from prompt injection attacks. This lesson helps you implement adaptive effort levels, identify injection types, and apply three key defenses: XML tagging, tool result validation, and code-level enforcement.
We'll cover the following...
- Extended thinking: How it works
- Sizing the thinking effort
- When extended thinking does not help
- Prompt injection: What it is
- Defense 1: Wrap user content in XML tags
- Defense 2: Label and validate the tool result content
- Defense 3: Rely on code-level enforcement, not Claude's judgment
- Complete code
- Exercise: Identify the injection and the defense
- What’s next?
This lesson covers two topics that are each tested heavily on the exam and that interact in practice. Extended thinking gives Claude a larger internal reasoning space for complex tasks; prompt injection is an attack that tries to exploit that reasoning space (or the system prompt) to redirect Claude’s behavior. Understanding both together helps you build agents that reason more accurately and are harder to manipulate. By the end of this lesson, we will be able to:
Enable adaptive thinking and choose the right effort level for a given task complexity
Distinguish tasks where extended thinking helps from tasks where it adds cost without benefit
Identify the two categories of prompt injection: direct and indirect
Apply three defenses that reduce the injection attack surface
Extended thinking: How it works
When extended thinking is enabled, Claude performs a separate internal reasoning phase before generating the visible response. This reasoning is not constrained by the output format instructions; Claude uses it to work through the problem in natural language before committing to an answer.
The internal reasoning is returned in thinking blocks in response.content. These blocks appear before the text block that contains the actual response:
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=2048,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    system=SYSTEM,
    messages=messages,
)

for block in response.content:
    if block.type == "thinking":
        internal_reasoning = block.thinking
    elif block.type == "text":
        visible_response = block.text
thinking={"type": "adaptive"}: Enables adaptive thinking. Claude decides on its own whether to reason internally before responding. This is the Claude 4.x API; Claude 3.7 used thinking={"type": "enabled", "budget_tokens": N} with an explicit token cap instead.
output_config={"effort": "high"}: Controls how deeply Claude reasons. Values are "low", "medium", and "high". Higher effort produces more thorough internal reasoning, with higher cost and latency. Omit output_config to let the model ...
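The two parameters described above can be assembled in one place so that the effort choice is explicit per task. The following is a minimal sketch, not the lesson's own implementation: the helper name build_thinking_kwargs and the complexity labels "simple", "moderate", and "complex" are hypothetical. Only the thinking and output_config shapes and the three effort values come from the lesson text. The helper builds a plain dictionary and does not call the API.

```python
from typing import Any, Dict, Optional

# Hypothetical mapping from a coarse task label to an effort value.
# The labels on the left are illustrative assumptions; the values on
# the right are the three effort levels named in the lesson.
EFFORT_BY_COMPLEXITY = {
    "simple": "low",
    "moderate": "medium",
    "complex": "high",
}


def build_thinking_kwargs(complexity: Optional[str] = None) -> Dict[str, Any]:
    """Return keyword arguments to merge into a messages.create call.

    Passing None leaves output_config out entirely, matching the
    lesson's note that the parameter can be omitted.
    """
    kwargs: Dict[str, Any] = {"thinking": {"type": "adaptive"}}
    if complexity is None:
        return kwargs
    if complexity not in EFFORT_BY_COMPLEXITY:
        raise ValueError(f"unknown complexity label: {complexity!r}")
    kwargs["output_config"] = {"effort": EFFORT_BY_COMPLEXITY[complexity]}
    return kwargs
```

A call site could then pass these through with client.messages.create(model=..., max_tokens=..., messages=..., **build_thinking_kwargs("complex")). Keeping the mapping in one table makes it easier to review which task types pay the extra cost and latency of higher effort.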