
Loop Control, Exit Conditions, and System Behavior

Explore how the Eureka-like reward learning system integrates components into a cohesive workflow. Understand loop assembly, agent orchestration, and how exit conditions control iteration. Learn practical methods for running the system on Google Colab or inspecting pre-run outputs, and gain skills to interpret logs, leaderboards, and rollouts for system evaluation and debugging.

In previous lessons, we implemented each component of the Eureka-like system in isolation:

  • Setup and environment initialization

  • Reward generation and evaluation

  • Selection and reflection

  • Feedback and exit control

  • Iteration state management

In this lesson, we step back and look at how the components fit together. The goal is not to introduce new agents or tools. Instead, we’ll trace where execution starts, how control flows across agents, how iterations repeat, and how outputs accumulate over time.

Where the full system is assembled

In our implementation, there are two assembly points:

  • agent.py → the workflow entrypoint ADK runs

  • reward_loop.py → the loop factory that builds the iterative reward evolution loop

Before diving into code, it’s important to read these files with the right mindset. These files answer questions like:

  • Which agent runs first?

  • Which agents repeat?

  • What happens when the exit condition is met?

They do not answer how rewards are generated, how PPO works, or how scoring is computed. All of that logic lives elsewhere.

Entrypoint: agent.py

When we run the project with ADK, execution begins from the file below:

"""ADK entrypoint for `adk run app/eureka_loop`"""
from google.adk.agents import SequentialAgent

from .agents.setup_agent import SetupAgent
from .agents.reward_loop import build_reward_loop

root_agent = SequentialAgent(
    name="EurekaRewardEvolutionPipeline",
    sub_agents=[
        SetupAgent(name="SetupAgent"),
        build_reward_loop(),
    ],
    description="Initializes env/task state and runs an Eureka-like reward evolution loop."
)

Let’s read this in two steps:

Step A: The orchestration primitive is SequentialAgent

from google.adk.agents import SequentialAgent

We import SequentialAgent because the top-level workflow is strictly ordered. There’s no branching here. The flow is linear: run setup, then enter the loop.
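To make "strictly ordered" concrete, here is a minimal plain-Python sketch of sequential orchestration. It is a toy stand-in rather than ADK's implementation: `run_sequential`, `setup_step`, and `loop_step` are hypothetical names introduced only for illustration, and the shared `state` dict is an assumption about how steps might pass information along.

```python
from typing import Callable, Dict, List

State = Dict[str, object]
Step = Callable[[State], None]


def run_sequential(steps: List[Step], state: State) -> State:
    """Run every step exactly once, in list order, over one shared state."""
    for step in steps:
        step(state)
    return state


def setup_step(state: State) -> None:
    # Hypothetical stand-in for setup: mark the environment as ready.
    state["env_ready"] = True
    state.setdefault("trace", []).append("setup")


def loop_step(state: State) -> None:
    # Hypothetical stand-in for the loop: it relies on setup having run first.
    if not state.get("env_ready"):
        raise RuntimeError("loop started before setup")
    state["trace"].append("reward_loop")


final_state = run_sequential([setup_step, loop_step], {})
print(final_state["trace"])  # ['setup', 'reward_loop']
```

Swapping the two steps in the list would raise the `RuntimeError`, which is the point of a linear plan: the order of the list is the order of execution.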

Step B: root_agent is the runnable pipeline ADK executes

root_agent = SequentialAgent(
    name="EurekaRewardEvolutionPipeline",
    sub_agents=[
        SetupAgent(name="SetupAgent"),
        build_reward_loop(),
    ],
    ...
)

This sub_agents=[ ... ] list is the entire top-level execution plan:

  1. Run SetupAgent once.

  2. Then, run whatever build_reward_loop() returns (the iterative loop). ... ...
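The two-step plan above can be sketched in plain Python as "set up once, then iterate until an exit condition fires or an iteration cap is reached." This is only an illustration under stated assumptions: `run_pipeline`, the `target_score` threshold, the iteration cap, and the placeholder scoring rule are all hypothetical and are not taken from `reward_loop.py`.

```python
def run_pipeline(max_iterations: int, target_score: float) -> dict:
    """Toy model of the top-level plan: one setup pass, then a bounded loop."""
    state = {"setup_runs": 0, "iteration": 0, "best_score": 0.0}

    # Step 1: setup runs exactly once, before any iteration.
    state["setup_runs"] += 1

    # Step 2: the loop body repeats until an exit condition is met.
    for _ in range(max_iterations):
        state["iteration"] += 1

        # Placeholder score (assumption): improves by 0.25 per iteration.
        candidate_score = 0.25 * state["iteration"]
        state["best_score"] = max(state["best_score"], candidate_score)

        if state["best_score"] >= target_score:
            state["exit_reason"] = "target_reached"
            break
    else:
        # The for-else branch runs only when the loop finished without a break.
        state["exit_reason"] = "max_iterations"

    return state


early_exit = run_pipeline(max_iterations=10, target_score=0.75)
capped = run_pipeline(max_iterations=2, target_score=0.75)
print(early_exit["iteration"], early_exit["exit_reason"])  # 3 target_reached
print(capped["iteration"], capped["exit_reason"])          # 2 max_iterations
```

In both runs `setup_runs` stays at 1, while `iteration` depends on which exit path fired first. That mirrors the split between the one-time setup step and the repeating loop.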