
Monitoring and Telemetry

Explore how to configure and use telemetry in Llama Stack to gain visibility into agent workflows, tool calls, and safety checks. Learn to interpret telemetry data, trace multi-step processes, and debug and evaluate AI applications for improved reliability and performance.

As we build more complex systems incorporating agents, tools, safety measures, and retrieval mechanisms, understanding the internal workings becomes increasingly challenging. An agent might retrieve an incorrect document, call a tool improperly, or fail to complete a turn. Visibility is crucial, not just into the final output, but into every intermediate step.

That’s where telemetry comes in.

Llama Stack provides a built-in telemetry system that emits structured events throughout the execution of agents and APIs. These include logs, spans, and metrics. With telemetry enabled, we can:

  • Trace multi-step workflows like agent turns.

  • Debug tool calls and safety violations.

  • Measure inference latency and tool usage.

  • Store structured traces locally or forward them to observability platforms (see the query sketch after this list).
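As a concrete example, once a server is running with telemetry enabled, we can inspect the recorded traces programmatically. The snippet below is a minimal sketch rather than a definitive recipe: it assumes a local server at http://localhost:8321 and uses the `query_traces` and `get_span_tree` methods of the `llama-stack-client` Python SDK's telemetry resource; method names, parameters, and response shapes vary between versions, so adjust it to the client we have installed.

```python
# Minimal sketch: inspect recorded traces through the client SDK.
# Assumes a local Llama Stack server at http://localhost:8321 with a
# telemetry provider configured; API details may vary by version.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# List recent traces. One trace typically covers one agent turn or API request.
traces = client.telemetry.query_traces(limit=5)
# Some client versions wrap the results in a `.data` field.
traces = getattr(traces, "data", traces)

for t in traces:
    print(t.trace_id, t.root_span_id, t.start_time)

# Drill into one trace: the span tree shows each nested step
# (inference calls, tool calls, safety shields) with its own timing.
span_tree = client.telemetry.get_span_tree(span_id=traces[0].root_span_id)
print(span_tree)
```

Each span in the tree corresponds to one step of the turn, so a slow inference call, a misbehaving tool invocation, or a failed safety check shows up as an individual node with its own timing and attributes.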

This lesson will show how to configure telemetry, interpret its output, and use it to improve our applications.

Why telemetry matters

In traditional software systems, observability is critical: we need to know what the system did, how long it took, and where it failed. GenAI applications are no different. Without visibility into inference steps, tool calls, and memory ...