Debugging and Observability
Explore how to diagnose and resolve common failures in LangGraph workflows by using state inspection, checkpoint history, and LangSmith observability tools. Understand how to systematically track execution paths, distinguish failure categories, and apply effective debugging strategies to build reliable and maintainable AI agents with graph orchestration.
We'll cover the following...
- The four categories of graph failure
- Three debugging tools
- What we are building
- State design
- Build it step by step
- Debugging approach 1: state inspection
- Debugging approach 2: checkpoint history
- Debugging approach 3: LangSmith
- Fixing the bug
- A systematic debugging checklist
- Complete executable code
- Exercise
- Solution
- Terms introduced in this lesson
In a traditional Python function, a failure produces a traceback that points to the exact line. You fix the line and move on. In a graph workflow, the failure is rarely that direct.
A routing function sends execution to the wrong branch, but the error shows up two nodes later when a handler tries to read a field that was never written. A model returns an action label with a capitalised first letter instead of lowercase, but the failure appears as an unrecognised node name in a routing function. A node silently writes an empty string to a field, and the downstream quality check passes incorrectly because the check was only verifying that the field existed, not that it had meaningful content.
The challenge is not that LangGraph hides failures. It is that the failure’s cause and the failure’s symptom are often separated by one or more nodes. To debug effectively, we need to see the execution path the graph actually took, not just the final state.
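To make that cause/symptom gap concrete, here is a minimal plain-Python sketch that simulates graph execution with dicts rather than using LangGraph itself. All node and field names here are invented for illustration: a classifier writes a mis-cased label (the cause), routing silently falls back, and the failure only surfaces two nodes later when a downstream node reads a field that was never written.

```python
# Hypothetical pipeline simulating cause/symptom separation across "nodes".

def classify(state):
    # BUG (the cause): the model returned "Billing" instead of "billing".
    return {**state, "category": "Billing"}

def route(state):
    # The router only knows lowercase labels; anything else falls back.
    return state["category"] if state["category"] in {"billing", "refund"} else "fallback"

def handle_billing(state):
    return {**state, "answer": f"Billing help for: {state['question']}"}

def handle_fallback(state):
    # The fallback path never writes "answer".
    return state

def summarize(state):
    # The symptom: a KeyError here, two nodes after the real bug.
    return {**state, "summary": state["answer"][:50]}

nodes = {"billing": handle_billing, "refund": handle_billing, "fallback": handle_fallback}

state = classify({"question": "Why was I charged twice?"})
state = nodes[route(state)](state)
try:
    state = summarize(state)
except KeyError as exc:
    print(f"Failure surfaced in summarize, but the cause was in classify: missing {exc}")
```

A traceback from this run points at `summarize`, yet nothing in `summarize` is wrong; the fix belongs in `classify`. That distance between cause and symptom is exactly what the debugging tools below are for.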
The four categories of graph failure
Most problems in LangGraph workflows fall into one of four categories. Knowing which category a failure belongs to narrows the search considerably. The table below outlines the four categories of graph workflow failure, describing what can go wrong in each category and how those failures typically appear during execution.
| Category | What fails | How it presents |
| --- | --- | --- |
| Routing failure | A routing function returns an unexpected or wrong node name | Unknown node name error, or execution visibly taking the wrong branch, with the error often surfacing one or more nodes later |
| State failure | A node reads a field that was never written, or reads a stale value | `KeyError`, `None` where a value was expected, or a downstream check passing on empty or stale content |
| Model output failure | The model returns content in an unexpected format or with unexpected values | JSON parse error, failed validation, or routing to the fallback path more often than expected |
| Tool failure | A tool node receives bad input or the external call fails | Exception from the tool function, or empty or malformed tool output |
Identifying the category first tells us where to look. Routing failures live in the routing function and the classifier node. State failures live in the state schema and the node that was supposed to write the missing field. Model output failures live in the extraction node and the prompt. Tool failures live in the tool function and the node that calls it.
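Since routing failures so often come from cosmetic mismatches in model output, such as the capitalised label described earlier, a common defensive pattern is to normalise the label inside the routing function before comparing it to node names. A minimal sketch; `route_by_category` and the label set are illustrative, not part of the lesson's workflow:

```python
# Hypothetical routing function that normalises the model's label before
# matching it to node names, so "Billing" or " REFUND " still routes
# correctly instead of falling through as an unknown node name.
VALID_ROUTES = {"billing", "refund", "general"}

def route_by_category(state: dict) -> str:
    label = state.get("category", "").strip().lower()
    return label if label in VALID_ROUTES else "fallback"

print(route_by_category({"category": "Billing"}))   # "billing"
print(route_by_category({"category": "unknown!"}))  # "fallback"
print(route_by_category({}))                        # "fallback"
```

Normalising at the routing boundary keeps one category of failure (model output) from masquerading as another (routing), which makes the table above much easier to apply.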
Three debugging tools
We have three debugging tools available in LangGraph, in order from least setup to most:
- **State inspection** requires nothing beyond reading the dict returned by `app.invoke`. It tells us the final state of every field. It does not tell us which nodes ran in which order, or what each field's value was at intermediate steps.
- **Checkpoint history** requires the graph to be compiled with a checkpointer. Calling `app.get_state_history(config)` returns every saved snapshot in reverse chronological order, one per node execution. Each snapshot shows the complete state after that node ran. This is the most powerful built-in debugging tool.
- **LangSmith** is an external observability platform for LangChain and LangGraph workflows. Once configured with two environment variables, every invocation is automatically traced. LangSmith shows each node as a timed step, displays inputs and outputs for every step, surfaces errors with full context, and lets us compare runs side by side. It requires a LangSmith account, but the setup is two lines.
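For reference, the two-line LangSmith setup is a pair of environment variables. The names below follow the current LangSmith documentation; older releases used `LANGCHAIN_TRACING_V2` and `LANGCHAIN_API_KEY` instead, so check the docs for your installed version:

```shell
# Enable automatic tracing of every invocation to LangSmith.
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="<your-api-key>"
```

No code changes are needed after this: any `app.invoke` call in the same environment is traced automatically.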
What we are building
We will ...