...


Designing Agent Workflows with Llama Stack

Learn how to implement powerful multi-agent workflows in Llama Stack, using real-world orchestration patterns like routing, chaining, parallel execution, and evaluator-optimizer loops.

As we build more capable AI systems, we often need agents to collaborate, specialize, and monitor each other, just like teams of humans working together. In this lesson, we’ll build on our understanding of individual agents and show how to construct robust multi-agent workflows using Llama Stack’s flexible SDK and session model. We will implement a few popular agent workflows inspired by industry best practices, adapted for Llama Stack. Each example will include end-to-end logic and use structured monitoring so we can trace how the agents collaborate.

Pattern 1: Prompt chaining

Prompt chaining is the simplest and most reliable agent workflow. It sequences multiple LLM calls, with each step building directly on the previous one. Instead of trying to solve a problem in a single prompt, we decompose the task and pass intermediate outputs forward through a session.

This approach avoids complex orchestration and is especially effective when:

  • The overall task can be broken into well-scoped subtasks

  • Each subtask has a clear prompt format

  • Deterministic flow is more important than flexibility
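To make the mechanics concrete, here is a minimal sketch of prompt chaining with a single Llama Stack agent session. It assumes the llama-stack-client Python SDK; the base URL, model identifier, subtask prompts, and the run_step helper are placeholders for illustration, and exact constructor arguments or response attributes may differ between SDK releases.

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

# Placeholder endpoint and model; point these at your own Llama Stack deployment.
client = LlamaStackClient(base_url="http://localhost:8321")

agent = Agent(
    client,
    model="meta-llama/Llama-3.1-8B-Instruct",
    instructions="You are a precise assistant that completes one subtask at a time.",
)
session_id = agent.create_session("prompt-chaining-demo")


def run_step(prompt: str) -> str:
    """Run one chained step in the shared session and return the reply text."""
    turn = agent.create_turn(
        session_id=session_id,
        messages=[{"role": "user", "content": prompt}],
        stream=False,
    )
    return turn.output_message.content


# Each call feeds the previous step's output forward as the next prompt's input.
notes = run_step("List three key points about retrieval-augmented generation.")
draft = run_step(f"Turn these points into a short paragraph:\n\n{notes}")
polished = run_step(f"Tighten the wording of this paragraph:\n\n{draft}")
print(polished)

Because every step reuses the same session, each turn also has access to the earlier context, which keeps the chain deterministic and easy to trace.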


Let’s imagine we’re building a multi-step writing assistant. The user provides a long-form technical paragraph. Our goal is to:

  1. Summarize it. ...