Monitoring and Telemetry
Learn how to monitor the behavior of agents and APIs in Llama Stack using its built-in telemetry system. Explore structured logs, metrics, and traces that help us debug, analyze, and optimize AI applications.
As we build more complex systems that incorporate agents, tools, safety measures, and retrieval mechanisms, understanding their internal workings becomes increasingly difficult. An agent might retrieve the wrong document, call a tool incorrectly, or fail to complete a turn. We need visibility not just into the final output, but into every intermediate step.
That’s where telemetry comes in.
Llama Stack provides a built-in telemetry system that emits structured events throughout the execution of agents and APIs. These include logs, spans, and metrics. With telemetry enabled, we can:
Trace multi-step workflows like agent turns.
Debug tool calls and safety violations.
Measure inference latency and tool usage.
Store structured traces locally or forward them to observability platforms.
This lesson will show how to configure telemetry, interpret its output, and use it to improve our applications.
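As a first taste of what that looks like in practice, here is a minimal sketch that lists recent traces from a running Llama Stack server using the Python client. The server address, the telemetry method names (query_traces, get_span_tree), and the shape of the returned objects are assumptions based on one version of the llama-stack-client package and may differ in yours; treat this as a sketch rather than a definitive API reference.

```python
# Minimal sketch: inspecting stored traces through the Llama Stack telemetry API.
# Assumptions: a Llama Stack server is running locally (port 8321 here; adjust to
# your setup), telemetry is enabled, and this client version exposes
# client.telemetry.query_traces / get_span_tree with the shapes used below.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Each trace corresponds to one top-level operation, such as an agent turn,
# and is identified by a trace_id with a root span.
traces = client.telemetry.query_traces(limit=5)

for trace in traces:  # depending on the client version, this may be `traces.data`
    print(f"trace {trace.trace_id}: started {trace.start_time}")

    # The span tree breaks the trace into its intermediate steps, such as
    # inference calls, tool executions, and safety shield checks.
    span_tree = client.telemetry.get_span_tree(span_id=trace.root_span_id)
    print(span_tree)
```

Even this small example reflects the core idea of the lesson: every agent turn leaves behind a structured record that we can query after the fact instead of guessing what happened.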
Why telemetry matters
In traditional software systems, observability is critical: we need to know what the system did, how long it took, and where it failed. GenAI applications are no different. Without visibility into inference steps, tool calls, and memory ...