Introduction to Agents
In this section, we define the concept of Llama Stack agents and explain the benefits of using agents to build complex AI applications.
So far, we’ve been using the Inference API to send prompts and receive responses. This is powerful, but it’s also limited: every interaction is a single, stateless query. There is no persistence, no memory, and no external actions; we are relying on pure language modeling to solve every problem.
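For reference, this single-query pattern looks roughly like the sketch below. It assumes the llama-stack-client Python SDK and a locally running Llama Stack server; the base URL and model ID are placeholders, not fixed values.

```python
# A single stateless query against the Inference API: one prompt in, one reply out.
# Assumes: llama-stack-client is installed, a Llama Stack server is running locally,
# and the model ID below is registered on that server (both are placeholders).
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize what Llama Stack is."}],
)

# The response carries no session state; the next call starts from scratch.
print(response.completion_message.content)
```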
But real-world AI applications often need more. They need to reason across multiple steps, retrieve relevant information, invoke tools, and guard against harmful output, all while maintaining conversational state.
That’s where agents come in.
Llama Stack agents are self-contained systems that bring together all of these capabilities under a single abstraction. They’re the foundation for building intelligent assistants, domain-specific copilots, document-based Q&A systems, and more.
Why use agents?
Agents allow us to move beyond basic prompt engineering and into composable workflows. Instead of writing a long system prompt that tries to teach the model how to use tools or recall memory, we configure these behaviors explicitly.
Here are a few problems that agents solve:
You want your assistant to look things up in a document before answering.
You need to enforce input/output moderation before generating a reply.
You want to let the assistant call APIs or execute code.
You need to maintain a session across multiple user turns.
You want to observe and debug the internal steps of reasoning.
The agent system in Llama Stack makes all of these possible, with a consistent interface and clear semantics. The developer can see every inference step, tool call, and shield result.
What is an agent in Llama Stack?
An agent in Llama Stack is a structured orchestration loop that wraps around the base inference API and adds:
Persistent session state
Access to memory (e.g. via vector databases)
Ability to use tools (such as search or code execution)
Safety shields (for filtering input/output)
Multi-step reasoning (with feedback loops between tools and inference)
You define an agent once and use it to process multiple user turns, just like a real conversation. The agent “remembers” the session, uses tools when appropriate, and applies safety checks before producing output. ...
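To make this concrete, here is a minimal sketch of what defining and running an agent can look like with the llama-stack-client Python SDK. Treat it as a sketch rather than the definitive API: the Agent constructor arguments have shifted between SDK versions, and the model ID, shield ID, toolgroup, and server URL are placeholders that must match what is configured on your server.

```python
# Sketch of a Llama Stack agent, assuming a recent llama-stack-client Python SDK.
# Placeholders/assumptions: the server URL, model ID, "llama_guard" shield ID, and
# the builtin::websearch toolgroup all depend on how your Llama Stack server is set up.
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger

client = LlamaStackClient(base_url="http://localhost:8321")

agent = Agent(
    client,
    model="meta-llama/Llama-3.1-8B-Instruct",
    instructions="You are a helpful assistant. Use the search tool when needed.",
    tools=["builtin::websearch"],        # tool use (requires a configured search provider)
    input_shields=["llama_guard"],       # safety check on user input
    output_shields=["llama_guard"],      # safety check on model output
    enable_session_persistence=True,     # keep session state across turns
)

# One session spans multiple user turns; the agent carries the conversation state.
session_id = agent.create_session("intro-demo")

for question in ["What is Llama Stack?", "How do its agents use tools?"]:
    turn = agent.create_turn(
        messages=[{"role": "user", "content": question}],
        session_id=session_id,
        stream=True,
    )
    # Each internal step (inference, tool call, shield check) surfaces in the event stream.
    for log in EventLogger().log(turn):
        log.print()
```

Notice that the behaviors we listed above are configured explicitly on the agent rather than described in a long system prompt: the session holds the memory, the toolgroup provides external actions, and the shields enforce moderation on both input and output.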