Assembling the Web Agent with Google ADK
Explore how to assemble a multimodal web agent with Google ADK by integrating browser session management and visual grounding tools and by orchestrating agent workflows. Understand the three operational modes (strict, single-agent, and multi-agent), and learn how to configure system prompts, loop agents, and fallback mechanisms to build reliable autonomous web systems.
In the previous lessons, we built a persistent browser session manager and a suite of multimodal web tools. However, tools alone cannot complete a task; they require an intelligent agent to orchestrate them.
In this final lesson, we dissect src/adk_agent/web_agent.py chunk by chunk. This file acts as the "brain" of the application. It resolves which LLM to use, tolerates constructor differences between ADK versions, and uses a factory function to build the root agent based on your environment configuration. Because agentic web systems are complex and different tasks call for different levels of control, web_agent.py is designed to support three distinct operational modes:
Strict mode: A heavily constrained, single-agent loop optimized for precise 1-to-7 numeric action commands.
Single agent mode: A WebVoyager-style single agent using conversational tool calling.
Multi-agent workflow: A delegated hierarchy where a Coordinator manages a Planner, a Vision agent, and a Browser agent.
The agent architecture flow
Before reading the code, let's visualize the control flow. The factory function checks the environment variables and builds the corresponding agent structure, ultimately wrapping the active agents in a LoopAgent to ensure the ReAct loop runs continuously until the task finishes.
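The control flow described above can be sketched in plain Python before we look at the real file. This is a minimal illustration, not the lesson's factory: the environment variable names (ADK_STRICT_MODE, ADK_SINGLE_AGENT), the choice of multi-agent as the default, and the agent labels are all assumptions made for the sketch, and the agent tree is returned as plain data rather than real ADK objects.

```python
"""Sketch of mode selection. Env var names and agent labels are hypothetical."""
from __future__ import annotations

from typing import Mapping


def select_mode(env: Mapping[str, str]) -> str:
    """Map environment flags to one of the three operational modes.

    ADK_STRICT_MODE and ADK_SINGLE_AGENT are placeholder names, and falling
    back to the multi-agent workflow is an assumption of this sketch.
    """

    def truthy(key: str) -> bool:
        return env.get(key, "").strip().lower() in {"1", "true", "yes"}

    if truthy("ADK_STRICT_MODE"):
        return "strict"
    if truthy("ADK_SINGLE_AGENT"):
        return "single"
    return "multi"


def describe_structure(mode: str) -> dict:
    """Return a plain-data outline of the agent tree for a given mode.

    Every mode is wrapped in a loop, mirroring the LoopAgent wrapper
    described in the text.
    """
    inner = {
        "strict": ["strict_agent"],
        "single": ["web_agent"],
        "multi": [{"coordinator": ["planner", "vision", "browser"]}],
    }[mode]
    return {"loop": inner}


mode = select_mode({"ADK_SINGLE_AGENT": "true"})
outline = describe_structure(mode)
```

Keeping the mode decision in a small pure function like this makes it easy to test without launching a browser or calling a model.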
Imports and helper functions
We begin by importing the tools we built in the previous lesson and defining helper functions to abstract away version differences in the Google ADK library.
"""Google ADK multi-agent builder for a multimodal web agent."""from __future__ import annotationsimport osfrom dotenv import load_dotenvfrom google.adk.agents import Agent, LoopAgentfrom src.tools.web_tools import (action_answer,action_back,action_click,action_google,action_input,action_scroll,action_wait,analyze_screenshot_with_vlm,capture_observation,capture_multimodal_observation,click_element,close_browser,log_step,navigate,go_back,go_google,reset_browser_task_state,select_flight_dates,search_web,scroll_by,type_text_element,verify_task_completion,wait,)def _create_agent(agent_cls, *, name: str, model: str, instruction: str, tools=None, sub_agents=None):"""Create ADK agent while tolerating minor API differences between ADK versions."""kwargs = {"name": name,"model": model,"instruction": instruction,}if tools is not None:kwargs["tools"] = toolsif sub_agents is not None:kwargs["sub_agents"] = sub_agentstry:return agent_cls(**kwargs)except TypeError:# Fallback for ADK variants that do not support sub_agents in ctor.kwargs.pop("sub_agents", None)return agent_cls(**kwargs)def _create_workflow_agent(agent_cls, *, name: str, sub_agents, **extra):"""Create ADK workflow agent while tolerating ctor differences."""kwargs = {"name": name,"sub_agents": sub_agents,**extra,}try:return agent_cls(**kwargs)except TypeError:# Some variants may use children/agents instead of sub_agents.kwargs.pop("sub_agents", None)for field_name in ("agents", "children", "steps"):try:return agent_cls(name=name, **{field_name: sub_agents})except TypeError:continueraisedef _resolve_llm_model() -> str:"""Resolve LLM model for ADK agents.Uses OpenAI only by default (reliable tool/function calling). Override with ``ADK_MODEL``if you need another LiteLLM-supported id (advanced)."""explicit = os.getenv("ADK_MODEL", "").strip()if explicit:return explicitmodel = os.getenv("ADK_OPENAI_MODEL", "openai/gpt-5-mini").strip()return model or "openai/gpt-5-mini"
Lines 3–34: We import the standard library module os, the environment loader load_dotenv, the ADK base classes Agent and LoopAgent, and the full set of web tools we created previously in src/tools/web_tools.py.
Lines 37–54: _create_agent packs name, model, and instruction into a kwargs dictionary. It adds tools and sub_agents only if they are provided, then attempts to instantiate agent_cls with those arguments. If a TypeError occurs (often because an ADK version does not accept sub_agents in the constructor), it pops sub_agents out of the dictionary and tries again as a safe fallback.
Lines 57–74: _create_workflow_agent prepares the arguments for workflow agents (such as loops). If the standard instantiation fails, it tries the known alternative parameter names ("agents", "children", "steps") in turn to stay compatible across ADK versions. The fallback call passes only name and the sub-agent list, so any extra keyword arguments are dropped on that path. If none of the alternatives works, the original TypeError is re-raised. ...
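To see the two fallback paths in action without installing ADK, we can run condensed copies of the helpers against stand-in classes. LegacyAgent and ChildrenLoop below are hypothetical classes invented for this sketch; they are not real ADK types, and they exist only to trigger the TypeError branches.

```python
from __future__ import annotations


def _create_agent(agent_cls, *, name, model, instruction, tools=None, sub_agents=None):
    """Condensed copy of the lesson's helper: retry without sub_agents on TypeError."""
    kwargs = {"name": name, "model": model, "instruction": instruction}
    if tools is not None:
        kwargs["tools"] = tools
    if sub_agents is not None:
        kwargs["sub_agents"] = sub_agents
    try:
        return agent_cls(**kwargs)
    except TypeError:
        kwargs.pop("sub_agents", None)
        return agent_cls(**kwargs)


def _create_workflow_agent(agent_cls, *, name, sub_agents, **extra):
    """Condensed copy of the lesson's helper: try alternative parameter names."""
    kwargs = {"name": name, "sub_agents": sub_agents, **extra}
    try:
        return agent_cls(**kwargs)
    except TypeError:
        kwargs.pop("sub_agents", None)
        for field_name in ("agents", "children", "steps"):
            try:
                return agent_cls(name=name, **{field_name: sub_agents})
            except TypeError:
                continue
        raise  # re-raise the original TypeError if no alternative worked


class LegacyAgent:
    """Hypothetical stand-in whose constructor has no sub_agents parameter."""

    def __init__(self, *, name, model, instruction, tools=None):
        self.name = name
        self.model = model
        self.instruction = instruction
        self.tools = tools or []


class ChildrenLoop:
    """Hypothetical stand-in that calls its sub-agent list 'children'."""

    def __init__(self, *, name, children, max_iterations=None):
        self.name = name
        self.children = children
        self.max_iterations = max_iterations


# First attempt fails on the unknown sub_agents keyword; the retry succeeds.
worker = _create_agent(
    LegacyAgent,
    name="browser",
    model="openai/gpt-5-mini",
    instruction="Act on the page.",
    sub_agents=["dropped on retry"],
)

# "sub_agents" and "agents" are rejected; "children" is accepted.
loop = _create_workflow_agent(
    ChildrenLoop, name="loop", sub_agents=[worker], max_iterations=15
)
```

Two things are worth noticing when you run this. First, the retry in _create_agent silently discards the sub-agents, so the resulting agent has no children. Second, loop.max_iterations ends up as None, because the fallback call in _create_workflow_agent does not forward the extra keyword arguments. Both helpers also catch any TypeError, including one raised for an unrelated reason inside a constructor, which is a trade-off of this tolerant style.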