Assembling the Web Agent with Google ADK

Explore how to assemble a multimodal web agent with Google ADK by integrating browser session management, visual grounding tools, and orchestrating agent workflows. Understand operational modes including strict, single-agent, and multi-agent setups, and learn how to configure system prompts, loop agents, and fallback mechanisms to build reliable autonomous web systems.

In the previous lessons, we built a persistent browser session manager and a suite of multimodal web tools. However, tools alone cannot complete a task; they require an intelligent agent to orchestrate them.

In this final lesson, we dissect src/adk_agent/web_agent.py chunk by chunk. This file acts as the "brain" of the application: it resolves the LLM model, smooths over constructor differences between ADK versions, and uses a factory function to build the root agent based on your environment configuration. Because agentic web systems are complex, web_agent.py supports three distinct operational modes:

  1. Strict mode: A heavily constrained, single-agent loop optimized for precise 1-to-7 numeric action commands.

  2. Single agent mode: A WebVoyager-style single agent using conversational tool calling.

  3. Multi-agent workflow: A delegated hierarchy where a Coordinator manages a Planner, a Vision agent, and a Browser agent.

The agent architecture flow

Before reading the code, let's visualize the control flow. The factory function checks the environment variables and builds the corresponding agent structure, ultimately wrapping the active agents in a LoopAgent to ensure the ReAct loop runs continuously until the task finishes.

The control flow of the Google ADK web agent. The build_root_agent() factory evaluates environment configurations to launch either a Strict, Single, or Multi-Agent system, ensuring continuous task execution by wrapping the active agents in a LoopAgent
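
To make the dispatch concrete before reading the real file, here is a minimal sketch of a factory that picks one of the three modes and wraps the result in a loop. It is not the lesson's build_root_agent(): the StubAgent and StubLoop classes, the WEB_AGENT_MODE variable name, the agent names, and the choice of single-agent mode as the default are all assumptions made for illustration, and nothing here imports Google ADK.

```python
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Stand-in for an ADK agent (hypothetical, not the google.adk class)."""
    name: str
    sub_agents: list = field(default_factory=list)


@dataclass
class StubLoop:
    """Stand-in for a loop wrapper that would rerun its sub-agents."""
    name: str
    sub_agents: list


def build_root_agent_sketch(env: dict) -> StubLoop:
    """Pick a mode from the environment mapping and wrap it in a loop."""
    # WEB_AGENT_MODE is a hypothetical variable name used only in this sketch.
    mode = env.get("WEB_AGENT_MODE", "single").strip().lower()
    if mode == "strict":
        active = [StubAgent("strict_agent")]
    elif mode == "multi":
        # A coordinator delegates to planner, vision, and browser agents.
        workers = [StubAgent("planner"), StubAgent("vision"), StubAgent("browser")]
        active = [StubAgent("coordinator", sub_agents=workers)]
    else:
        # Assumed default: anything else falls back to the single agent.
        active = [StubAgent("single_agent")]
    # Whatever the mode, the active agents end up inside the same loop wrapper.
    return StubLoop(name="root_loop", sub_agents=active)


root = build_root_agent_sketch({"WEB_AGENT_MODE": "multi"})
print(root.sub_agents[0].name, [a.name for a in root.sub_agents[0].sub_agents])
# coordinator ['planner', 'vision', 'browser']
```

Passing the environment as a plain dictionary (for example, dict(os.environ)) keeps the sketch easy to test without touching real environment variables.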

Imports and helper functions

We begin by importing the tools we built in the previous lesson and defining helper functions to abstract away version differences in the Google ADK library.

"""Google ADK multi-agent builder for a multimodal web agent."""

from __future__ import annotations

import os

from dotenv import load_dotenv
from google.adk.agents import Agent, LoopAgent

from src.tools.web_tools import (
    action_answer,
    action_back,
    action_click,
    action_google,
    action_input,
    action_scroll,
    action_wait,
    analyze_screenshot_with_vlm,
    capture_observation,
    capture_multimodal_observation,
    click_element,
    close_browser,
    log_step,
    navigate,
    go_back,
    go_google,
    reset_browser_task_state,
    select_flight_dates,
    search_web,
    scroll_by,
    type_text_element,
    verify_task_completion,
    wait,
)


def _create_agent(agent_cls, *, name: str, model: str, instruction: str, tools=None, sub_agents=None):
    """Create ADK agent while tolerating minor API differences between ADK versions."""
    kwargs = {
        "name": name,
        "model": model,
        "instruction": instruction,
    }
    if tools is not None:
        kwargs["tools"] = tools
    if sub_agents is not None:
        kwargs["sub_agents"] = sub_agents

    try:
        return agent_cls(**kwargs)
    except TypeError:
        # Fallback for ADK variants that do not support sub_agents in ctor.
        kwargs.pop("sub_agents", None)
        return agent_cls(**kwargs)


def _create_workflow_agent(agent_cls, *, name: str, sub_agents, **extra):
    """Create ADK workflow agent while tolerating ctor differences."""
    kwargs = {
        "name": name,
        "sub_agents": sub_agents,
        **extra,
    }
    try:
        return agent_cls(**kwargs)
    except TypeError:
        # Some variants may use children/agents instead of sub_agents.
        kwargs.pop("sub_agents", None)
        for field_name in ("agents", "children", "steps"):
            try:
                return agent_cls(name=name, **{field_name: sub_agents})
            except TypeError:
                continue
        raise


def _resolve_llm_model() -> str:
    """Resolve LLM model for ADK agents.

    Uses OpenAI only by default (reliable tool/function calling). Override with ``ADK_MODEL``
    if you need another LiteLLM-supported id (advanced).
    """
    explicit = os.getenv("ADK_MODEL", "").strip()
    if explicit:
        return explicit
    model = os.getenv("ADK_OPENAI_MODEL", "openai/gpt-5-mini").strip()
    return model or "openai/gpt-5-mini"
ADK compatibility wrappers, tool imports, and model resolution (web_agent.py)
  • Lines 3–34: We import standard libraries (os), environment loaders (load_dotenv), ADK base classes (Agent, LoopAgent), and the exhaustive list of web tools we created previously in src/tools/web_tools.py.

  • Lines 37–54: _create_agent packs the name, model, and instruction into a kwargs dictionary. It conditionally adds tools and sub_agents to the arguments only if they are provided, keeping the instantiation clean. It attempts to instantiate the agent_cls with these arguments. If a TypeError occurs (often due to minor ADK version mismatches regarding sub-agent initialization), it pops sub_agents out of the dictionary and tries again as a safe fallback.

  • Lines 57–74: _create_workflow_agent prepares the arguments for workflow agents such as LoopAgent. If the standard instantiation raises a TypeError, it retries with each known alternative parameter name ("agents", "children", "steps") so that the same code works across ADK versions. The retry passes only the name and the sub-agent list, so any extra keyword arguments are not forwarded; if every alternative also fails, the original TypeError is re-raised. ...