State Design for Scalable Workflows

Explore how to design and organize state in LangGraph workflows to handle complex data cleanly and clearly. This lesson teaches you to group state fields by role, assign ownership to nodes, and avoid common pitfalls, enabling scalable and easy-to-debug AI agent workflows.

In the first two lessons, state was small. A handful of fields, all obvious, all easy to track. But consider what happens when a real workflow needs to carry more information.

A document review assistant, for example, needs to know the original text, the document type, a word count, retrieved policy context, a generated summary, a quality verdict, a retry counter, and possibly an error message. That is eight fields before we have even thought carefully about what might come next.

If all eight land in one flat schema with abbreviated names like ctx, ok, and tries, the graph becomes hard to read. When something fails at runtime, there is no quick way to tell which node owns which field, whether ok means quality passed or approval passed, or what tries is counting. The state itself becomes the thing we have to debug.

This lesson is about preventing that problem before it starts. We will look at how to name fields clearly, group them by purpose, and assign each one to exactly one node. These habits keep state readable as the workflow grows, and they make bugs easier to find when they do appear.

What state actually is in LangGraph

State in LangGraph is typically a plain Python dictionary that conforms to a TypedDict schema. Every node in the graph reads from it and may return a partial update to it. By default, LangGraph merges each node’s return value into the running state key by key, so fields written by earlier nodes remain available to every later node unless overwritten.
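
The merge behavior described above can be imitated with plain dictionaries. This is a minimal sketch, not LangGraph's implementation: ReviewState, count_words, summarize, and apply_update are hypothetical names, and the merge is assumed to be a simple per-key overwrite.

```python
from typing import TypedDict


class ReviewState(TypedDict, total=False):
    # Hypothetical fields, for illustration only.
    document_text: str
    word_count: int
    summary: str


def count_words(state: ReviewState) -> ReviewState:
    # A node returns only the fields it changes, not the whole state.
    return {"word_count": len(state["document_text"].split())}


def summarize(state: ReviewState) -> ReviewState:
    # Later nodes can still read fields written by earlier nodes.
    return {"summary": f"{state['word_count']} words reviewed"}


def apply_update(state: ReviewState, update: ReviewState) -> ReviewState:
    # Stand-in for the merge step: keys in the update overwrite, others persist.
    return {**state, **update}


state: ReviewState = {"document_text": "Refunds are issued within 30 days"}
for node in (count_words, summarize):
    state = apply_update(state, node(state))

print(state)
```

After both nodes run, the state still holds document_text alongside the two fields the nodes added, which is the property the rest of this lesson relies on.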

The schema defines the contract. Every field that any node might read or write must be declared in the TypedDict before the graph runs. Fields cannot be added at runtime. This is not a limitation; it is the mechanism that makes state inspectable and predictable.
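
One way to make that contract visible during development is to compare a node's update against the declared fields. The sketch below uses only the standard library. ReviewState and undeclared_keys are hypothetical names, and the check is an illustration of the idea, not a description of how LangGraph itself treats undeclared keys.

```python
from typing import TypedDict, get_type_hints


class ReviewState(TypedDict):
    # Hypothetical schema, for illustration only.
    document_text: str
    word_count: int
    summary: str


def undeclared_keys(update: dict, schema: type) -> set:
    # Keys in a node's update that the schema never declared.
    return set(update) - set(get_type_hints(schema))


good_update = {"word_count": 120}
bad_update = {"word_count": 120, "wordcount": 120}  # typo in the second key

print(undeclared_keys(good_update, ReviewState))
print(undeclared_keys(bad_update, ReviewState))
```

A helper like this could run in unit tests for each node, so that a misspelled field name surfaces before the graph is ever invoked.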

Because the schema is fixed and shared, the design choices we make up front ripple through the entire workflow.

The cost of unorganized state

Here is a state schema that works but creates problems:

from typing import TypedDict

class ReportStateV1(TypedDict):
    text: str
    doc_type: str
    count: int
    ctx: str
    output: str
    ok: bool
    tries: int

V1 state schema with abbreviated, unorganized field names
  • Lines 3–10: Seven fields, all flat, all at the same level.

This schema is not technically wrong. The graph will run. But try answering these questions just by reading it: which node writes ctx? Does ok mean the summary passed quality review, or that the document was approved for publishing? Is count a word count, a retry count, or something else entirely?

None of those questions have answers in the schema itself. To find out, we have to read every node function, trace which fields each one touches, and build a mental model that the code does not provide. That cost multiplies every time someone new reads the code, or every time we return to it after a few weeks away.

Organizing fields by role

A cleaner approach assigns every field to one of four roles: input, control, context, or output. Each role answers a different question about the data. The following table describes each role and what belongs in it.

| Role | What goes here | Written by |
|---|---|---|
| Input | Data provided at invocation time; never modified by nodes | The caller via app.invoke(...) |
| Control | Routing decisions and loop counters | Classifier nodes, quality evaluators |
| Context | Enrichment data fetched or computed during execution | Retrieval and preprocessing nodes |
| Output | The content the workflow is producing | Generation and finalization nodes |
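
As a rough illustration of the four roles, the seven V1 fields could be renamed and grouped as follows. The class name ReportStateByRole, the new field names, the FIELD_ROLES lookup, and the role chosen for each field are all assumptions made for this sketch; for example, document_type would belong under control if a classifier node wrote it.

```python
from typing import TypedDict


class ReportStateByRole(TypedDict):
    # Input: provided by the caller, never modified by nodes.
    document_text: str   # was: text
    document_type: str   # was: doc_type

    # Control: routing decisions and loop counters.
    quality_passed: bool  # was: ok
    retry_count: int      # was: tries

    # Context: enrichment fetched or computed during execution.
    word_count: int       # was: count
    policy_context: str   # was: ctx

    # Output: the content the workflow is producing.
    summary: str          # was: output


# Hypothetical lookup from field name to role, usable in reviews and tests.
FIELD_ROLES = {
    "document_text": "input",
    "document_type": "input",
    "quality_passed": "control",
    "retry_count": "control",
    "word_count": "context",
    "policy_context": "context",
    "summary": "output",
}

# Every declared field should appear in the lookup with exactly one role.
missing = set(ReportStateByRole.__annotations__) - set(FIELD_ROLES)
print(missing)
```

With names like these, the questions raised about V1 (what count measures, what ok refers to) can be answered from the schema alone.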

Organizing by ...