Assembling the Web Agent with Google ADK
Explore how to build a multimodal web agent leveraging the Google Agent Development Kit. Understand the architecture and orchestration of strict, single-agent, and multi-agent modes while integrating visual grounding and continuous interaction loops. This lesson guides you through assembling the core web agent, managing toolsets, and handling agent workflows for reliable and adaptive web navigation.
In the previous lessons, we built a persistent browser session manager and a suite of multimodal web tools. However, tools alone cannot complete a task; they require an intelligent agent to orchestrate them.
In this final lesson, we dissect src/adk_agent/web_agent.py chunk by chunk. This file acts as the "brain" of the application. It resolves which LLM to use, smooths over API differences between ADK versions, and uses a factory function to build the root agent based on your environment configuration. Because agentic web systems are complex, web_agent.py is designed to support three distinct operational modes:
Strict mode: A heavily constrained, single-agent loop optimized for precise numeric action commands, numbered one through seven.
Single-agent mode: A WebVoyager-style single agent that uses conversational tool calling.
Multi-agent workflow: A delegated hierarchy in which a Coordinator manages a Planner, a Vision agent, and a Browser agent.
The agent architecture flow
Before reading the code, let's visualize the control flow. The factory function checks the environment variables and builds the corresponding agent structure, ultimately wrapping the active agents in a LoopAgent so the ReAct loop runs continuously until the task finishes.
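The selection logic described above can be sketched roughly as follows. This is only an illustration, not the actual factory in web_agent.py: the environment variable name WEB_AGENT_MODE, the function name build_root_agent, the agent names, and the max_iterations default are all assumptions, and the Agent and LoopAgent classes here are plain stand-ins so that the snippet runs without ADK installed.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """Stand-in for an LLM agent (not the real ADK class)."""
    name: str
    sub_agents: List["Agent"] = field(default_factory=list)


@dataclass
class LoopAgent:
    """Stand-in for a workflow agent that reruns its sub-agents in a loop."""
    name: str
    sub_agents: List[Agent]
    max_iterations: int = 15  # assumed value, for illustration only


def build_root_agent(mode: Optional[str] = None) -> LoopAgent:
    """Pick an agent structure from the hypothetical WEB_AGENT_MODE variable."""
    mode = (mode or os.getenv("WEB_AGENT_MODE", "single")).strip().lower()
    if mode == "strict":
        # Strict mode: one tightly constrained agent.
        active = Agent(name="strict_web_agent")
    elif mode == "multi":
        # Multi-agent workflow: a coordinator delegating to three specialists.
        active = Agent(
            name="coordinator",
            sub_agents=[Agent(name="planner"), Agent(name="vision"), Agent(name="browser")],
        )
    else:
        # Single-agent mode: one conversational tool-calling agent.
        active = Agent(name="web_agent")
    # Whatever the mode, the active agent is wrapped in a loop.
    return LoopAgent(name="web_agent_loop", sub_agents=[active])


root = build_root_agent("multi")
print(root.name, [a.name for a in root.sub_agents[0].sub_agents])
```

Whichever branch is taken, the caller receives the same kind of object, a loop wrapping the active agent, so the code that runs the agent does not need to know which mode was selected.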
Imports and helper functions
We begin by importing the tools we built in the previous lesson and defining helper functions that abstract away version differences in the Google ADK library.
"""Google ADK multi-agent builder for a multimodal web agent."""from __future__ import annotationsimport osfrom dotenv import load_dotenvfrom google.adk.agents import Agent, LoopAgentfrom src.tools.web_tools import (action_answer,action_back,action_click,action_google,action_input,action_scroll,action_wait,analyze_screenshot_with_vlm,capture_observation,capture_multimodal_observation,click_element,close_browser,log_step,navigate,go_back,go_google,reset_browser_task_state,select_flight_dates,search_web,scroll_by,type_text_element,verify_task_completion,wait,)def _create_agent(agent_cls, *, name: str, model: str, instruction: str, tools=None, sub_agents=None):"""Create ADK agent while tolerating minor API differences between ADK versions."""kwargs = {"name": name,"model": model,"instruction": instruction,}if tools is not None:kwargs["tools"] = toolsif sub_agents is not None:kwargs["sub_agents"] = sub_agentstry:return agent_cls(**kwargs)except TypeError:# Fallback for ADK variants that do not support sub_agents in ctor.kwargs.pop("sub_agents", None)return agent_cls(**kwargs)def _create_workflow_agent(agent_cls, *, name: str, sub_agents, **extra):"""Create ADK workflow agent while tolerating ctor differences."""kwargs = {"name": name,"sub_agents": sub_agents,**extra,}try:return agent_cls(**kwargs)except TypeError:# Some variants may use children/agents instead of sub_agents.kwargs.pop("sub_agents", None)for field_name in ("agents", "children", "steps"):try:return agent_cls(name=name, **{field_name: sub_agents})except TypeError:continueraisedef _resolve_llm_model() -> str:"""Resolve LLM model for ADK agents.Uses OpenAI only by default (reliable tool/function calling). Override with ``ADK_MODEL``if you need another LiteLLM-supported id (advanced)."""explicit = os.getenv("ADK_MODEL", "").strip()if explicit:return explicitmodel = os.getenv("ADK_OPENAI_MODEL", "openai/gpt-5-mini").strip()return model or "openai/gpt-5-mini"
Lines 3–34: We import standard libraries (os), environment loaders (load_dotenv), ADK base classes (Agent, LoopAgent), and the exhaustive list of web tools we created previously in src/tools/web_tools.py.
Lines 37–54: _create_agent packs name, model, and instruction into a kwargs dictionary. It conditionally adds tools and sub_agents only if they are provided, which keeps instantiation clean. It then attempts to instantiate agent_cls with these arguments. If a TypeError occurs, often due to minor ADK version mismatches around sub-agent initialization, it removes sub_agents from the dictionary and tries again as a safe fallback.
Lines 57–74: _create_workflow_agent prepares arguments specifically for workflow agents, such as loops. If standard instantiation fails, it iterates through known alternative parameter names ("agents", "children", "steps") to maintain backward and forward compatibility with the ADK framework across versions. ...
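To see the two fallbacks in action without installing ADK, the helpers can be exercised against small stand-in classes. The sketch below repeats the helper logic in condensed form; LegacyAgent and LegacyLoop are hypothetical classes invented for this illustration and do not correspond to any real ADK release.

```python
def _create_agent(agent_cls, *, name, model, instruction, tools=None, sub_agents=None):
    # Condensed copy of the helper from the listing.
    kwargs = {"name": name, "model": model, "instruction": instruction}
    if tools is not None:
        kwargs["tools"] = tools
    if sub_agents is not None:
        kwargs["sub_agents"] = sub_agents
    try:
        return agent_cls(**kwargs)
    except TypeError:
        # Retry without sub_agents for constructors that reject it.
        kwargs.pop("sub_agents", None)
        return agent_cls(**kwargs)


def _create_workflow_agent(agent_cls, *, name, sub_agents, **extra):
    # Condensed copy of the helper from the listing.
    kwargs = {"name": name, "sub_agents": sub_agents, **extra}
    try:
        return agent_cls(**kwargs)
    except TypeError:
        kwargs.pop("sub_agents", None)
        for field_name in ("agents", "children", "steps"):
            try:
                return agent_cls(name=name, **{field_name: sub_agents})
            except TypeError:
                continue
        raise  # re-raises the original TypeError if no alternative worked


class LegacyAgent:
    """Hypothetical agent class whose constructor has no sub_agents parameter."""

    def __init__(self, name, model, instruction, tools=None):
        self.name = name
        self.tools = tools or []


class LegacyLoop:
    """Hypothetical workflow class that calls its children 'agents'."""

    def __init__(self, name, agents):
        self.name = name
        self.agents = agents


# First attempt fails on sub_agents; the retry without it succeeds.
worker = _create_agent(
    LegacyAgent,
    name="browser",
    model="openai/gpt-5-mini",
    instruction="Act on the page.",
    sub_agents=["ignored"],
)

# First attempt fails on sub_agents/max_iterations; the 'agents' retry succeeds.
loop = _create_workflow_agent(LegacyLoop, name="loop", sub_agents=[worker], max_iterations=5)

print(worker.name, loop.agents[0] is worker)  # prints: browser True
```

One consequence worth noticing in this sketch: on the fallback path the retry passes only name and the sub-agent list, so extra keyword arguments such as max_iterations are silently dropped rather than forwarded.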