
Debugging and Observability

Explore how to diagnose and resolve common failures in LangGraph workflows by using state inspection, checkpoint history, and LangSmith observability tools. Understand how to systematically track execution paths, distinguish failure categories, and apply effective debugging strategies to build reliable and maintainable AI agents with graph orchestration.

In a traditional Python function, a failure produces a traceback that points to the exact line. You fix the line and move on. In a graph workflow, the failure is rarely that direct.

A routing function sends execution to the wrong branch, but the error shows up two nodes later when a handler tries to read a field that was never written. A model returns an action label with a capitalised first letter instead of lowercase, but the failure appears as an unrecognised node name in a routing function. A node silently writes an empty string to a field, and the downstream quality check passes incorrectly because the check was only verifying that the field existed, not that it had meaningful content.

The challenge is not that LangGraph hides failures. It is that the failure’s cause and the failure’s symptom are often separated by one or more nodes. To debug effectively, we need to see the execution path the graph actually took, not just the final state.
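The capitalisation example above can be reproduced with a small routing-function sketch. The names (route_after_classify, the "summarize" and "escalate" labels, the "action" state field) are hypothetical, chosen only to illustrate the pattern:

```python
def route_after_classify(state: dict) -> str:
    # Hypothetical routing function: the classifier node is expected to
    # have written a lowercase action label into state["action"].
    action = state.get("action", "")
    if action in ("summarize", "escalate"):
        return action
    # Without normalisation, a model output like "Summarize" falls through
    # to the fallback branch -- the symptom appears here, steps away from
    # the prompt that caused it.
    return "fallback"


def route_normalised(state: dict) -> str:
    # Normalising at the boundary removes the whole class of failure.
    action = state.get("action", "").strip().lower()
    return action if action in ("summarize", "escalate") else "fallback"
```

Normalising once, at the point where model output enters the graph, keeps every downstream comparison exact instead of scattering defensive checks across nodes.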

The four categories of graph failure

Most problems in LangGraph workflows fall into one of four categories, and knowing which category a failure belongs to narrows the search considerably. The table below describes what goes wrong in each category and how it typically appears during execution.

| Category | What fails | How it presents |
| --- | --- | --- |
| Routing failure | A routing function returns an unexpected or wrong node name | Invalid return value error, or the wrong handler runs silently |
| State failure | A node reads a field that was never written, or reads a stale value | KeyError, TypeError, or logically wrong output from a downstream node |
| Model output failure | The model returns content in an unexpected format or with unexpected values | JSON parse error, failed validation, or routing to the fallback path more often than expected |
| Tool failure | A tool node receives bad input or the external call fails | Exception from the tool function, empty tool_result, or a silent default that propagates downstream |

Identifying the category first tells us where to look. Routing failures live in the routing function and the classifier node. State failures live in the state schema and the node that was supposed to write the missing field. Model output failures live in the extraction node and the prompt. Tool failures live in the tool function and the node that calls it.
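The empty-string state failure mentioned earlier is worth making concrete. A minimal sketch, assuming a hypothetical "summary" field and two versions of a quality-check node's predicate:

```python
def quality_check_weak(state: dict) -> bool:
    # Verifies only that the field exists -- an empty string passes,
    # and the bad value propagates downstream unnoticed.
    return "summary" in state


def quality_check_strict(state: dict) -> bool:
    # Verifies the field holds meaningful content, not just that some
    # node wrote *something* to it.
    summary = state.get("summary")
    return isinstance(summary, str) and len(summary.strip()) > 0
```

The strict version turns a silent state failure into a visible one at the node where it occurs, which is exactly where we want the symptom to surface.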

Three debugging tools

We have three debugging tools available in LangGraph, listed in order from least setup required to most:

  • State inspection requires nothing beyond reading the dict returned by app.invoke. It tells us the final state of every field. It does not tell us which nodes ran in which order, or what each field’s value was at intermediate steps.

  • Checkpoint history requires the graph to be compiled with a checkpointer. Calling app.get_state_history(config) returns every saved snapshot in reverse chronological order — one per node execution. Each snapshot shows the complete state after that node ran. This is the most powerful built-in debugging tool.

  • LangSmith is an external observability platform for LangChain and LangGraph workflows. Once configured with two environment variables, every invocation is automatically traced. LangSmith shows each node as a timed step, displays inputs and outputs for every step, surfaces errors with full context, and lets us compare runs side by side. It requires a LangSmith account, but the setup is two lines.
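Checkpoint history becomes most useful when we scan it for the step where a field first went wrong. In a real workflow the snapshots come from app.get_state_history(config) (newest first) and expose the state via snapshot.values; in the sketch below, plain dicts with assumed "step" and "values" keys stand in for those snapshots so the example runs on its own:

```python
# Stand-in for the list returned by app.get_state_history(config):
# one snapshot per node execution, newest first.
history_newest_first = [
    {"step": 3, "values": {"action": "Summarize", "summary": ""}},
    {"step": 2, "values": {"action": "Summarize"}},
    {"step": 1, "values": {}},
]


def first_bad_step(history, field, is_valid):
    # Replay the history oldest-to-newest and report the first snapshot
    # where the field exists but fails validation -- that is the node
    # that wrote the bad value, not merely the node that crashed on it.
    for snapshot in reversed(history):
        value = snapshot["values"].get(field)
        if value is not None and not is_valid(value):
            return snapshot["step"]
    return None


# The mis-capitalised action label first appears at step 2, even though
# the symptom (a fallback route or a failed check) shows up later.
bad_step = first_bad_step(history_newest_first, "action", lambda v: v.islower())
```

Replaying oldest-to-newest is the key move: the last snapshot only shows the symptom, while the first invalid write pinpoints the cause.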

What we are building

We will ...