Implementing Safety Guardrails
Learn how to use Llama Stack’s Safety API to filter potentially harmful content. Register and apply safety shields to agents, protecting both input and output through a structured, provider-based moderation system.
Generative AI models are powerful, but not without risk. They may produce harmful, offensive, biased, or unsafe content, especially when prompted with adversarial or ambiguous inputs. In many production applications, this is unacceptable.
That’s why Llama Stack includes a built-in Safety API and a system of configurable shields that allow developers to enforce safety guardrails at multiple points in the interaction pipeline.
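At its simplest, the Safety API lets you run a registered shield directly against a message and inspect the result. The sketch below is a minimal illustration, assuming the llama_stack_client Python SDK, a Llama Stack server reachable at http://localhost:8321, and a shield already registered under the ID llama_guard (registration is shown in the next example); adjust these assumptions to your deployment.

```python
from llama_stack_client import LlamaStackClient

# Assumes a locally running Llama Stack distribution; the port is illustrative.
client = LlamaStackClient(base_url="http://localhost:8321")

# Run a single user message through a shield that has already been registered.
result = client.safety.run_shield(
    shield_id="llama_guard",  # assumed shield ID; must match your registration
    messages=[{"role": "user", "content": "How do I pick a lock?"}],
    params={},
)

# If the shield flags the message, the response carries a violation object
# with a user-facing message; otherwise violation is empty.
if result.violation:
    print("Blocked:", result.violation.user_message)
else:
    print("Message passed the shield.")
```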
In this lesson, we’ll learn how to register and apply a shield like llama_guard, attach it to an agent, and observe how unsafe content is intercepted before it can be processed or returned. These tools help us build more trustworthy, responsible applications.
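As a preview of that flow, here is a hedged sketch of registering a Llama Guard shield and attaching it to an agent as both an input and an output shield. The base URL, model identifiers, and the AgentConfig-based constructor reflect one version of the llama_stack_client SDK; newer releases accept the same fields as keyword arguments to Agent directly, so treat the exact names as assumptions to verify against your installed version.

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.types.agent_create_params import AgentConfig

client = LlamaStackClient(base_url="http://localhost:8321")

# Register a shield backed by a Llama Guard model. The provider_shield_id is
# an example identifier; use whichever guard model your safety provider serves.
client.shields.register(
    shield_id="llama_guard",
    provider_shield_id="meta-llama/Llama-Guard-3-8B",
)

# Attach the shield on both sides of the agent: input_shields screen user
# prompts before the model sees them, output_shields screen model responses
# before they are returned to the caller.
agent_config = AgentConfig(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model id
    instructions="You are a helpful assistant.",
    input_shields=["llama_guard"],
    output_shields=["llama_guard"],
    enable_session_persistence=False,
)
agent = Agent(client, agent_config)
session_id = agent.create_session("safety-demo")

# A risky prompt should be intercepted by the input shield before the model
# is ever invoked; the logged turn shows the shield's refusal instead of a
# model-generated answer.
turn = agent.create_turn(
    messages=[{"role": "user", "content": "Tell me how to break into someone's email account."}],
    session_id=session_id,
)
for log in EventLogger().log(turn):
    log.print()
```

With a safe prompt, the same agent responds normally; the shields only add a moderation check at each boundary rather than changing the agent's behavior.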
Why safety matters
Even well-designed prompts and helpful models can produce unsafe or inappropriate outputs under the right conditions. Consider the following risks:
Toxicity: Hate speech, slurs, or personal attacks
Violence: Descriptions or encouragement of harm
Self-harm: Responses to mental health questions that are inaccurate or unsafe ...