...


Designing Agent Workflows with Llama Stack

Learn how to implement powerful multi-agent workflows in Llama Stack, using real-world orchestration patterns like routing, chaining, parallel execution, and evaluator-optimizer loops.

As we build more capable AI systems, we often need agents to collaborate, specialize, and monitor each other, just like teams of humans working together. In this lesson, we’ll build on our understanding of individual agents and show how to construct robust multi-agent workflows using Llama Stack’s flexible SDK and session model. We will implement a few popular agent workflows inspired by industry best practices, adapted for Llama Stack. Each example will include end-to-end logic and use structured monitoring so we can trace how the agents collaborate.

Pattern 1: Prompt chaining

Prompt chaining is the simplest and most reliable agent workflow. It sequences multiple LLM calls, with each step building directly on the previous one. Instead of trying to solve a problem in a single prompt, we decompose the task and pass intermediate outputs forward through a session.

This approach avoids complex orchestration and is especially effective when:

  • The overall task can be broken into well-scoped subtasks

  • Each subtask has a clear prompt format

  • Deterministic flow is more important than flexibility
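To make the mechanics concrete, here is a minimal sketch of prompt chaining with a single Llama Stack agent session. It assumes the llama-stack-client Python SDK; the base URL, model identifier, subtask prompts, and the run_step helper are placeholders for illustration, and exact constructor arguments or response attributes may differ between SDK releases.

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

# Placeholder endpoint and model; point these at your own Llama Stack deployment.
client = LlamaStackClient(base_url="http://localhost:8321")

agent = Agent(
    client,
    model="meta-llama/Llama-3.1-8B-Instruct",
    instructions="You are a precise assistant that completes one subtask at a time.",
)
session_id = agent.create_session("prompt-chaining-demo")


def run_step(prompt: str) -> str:
    """Run one chained step in the shared session and return the reply text."""
    turn = agent.create_turn(
        session_id=session_id,
        messages=[{"role": "user", "content": prompt}],
        stream=False,
    )
    return turn.output_message.content


# Each call feeds the previous step's output forward as the next prompt's input.
notes = run_step("List three key points about retrieval-augmented generation.")
draft = run_step(f"Turn these points into a short paragraph:\n\n{notes}")
polished = run_step(f"Tighten the wording of this paragraph:\n\n{draft}")
print(polished)

Because every step reuses the same session, each turn also has access to the earlier context, which keeps the chain deterministic and easy to trace.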


Let’s imagine we’re building a multi-step writing assistant. The user provides a long-form technical paragraph. Our goal is to:

  1. Summarize it. ...