Agent Architecture and Control Loops
Explore the core architecture of AI agents by understanding models, tools, and instructions that drive perception, reasoning, and action. Learn how control loops enable continuous decision-making and how memory systems support multi-turn interactions. Discover orchestration patterns for single and multi-agent coordination, helping you design effective autonomous AI systems.
Modern AI agents are not monolithic programs. They consist of modular, configurable components that coordinate to implement perception, reasoning, and action. The core architecture includes three components: model, tools, and instructions.
The model
The model is the agent's central reasoning engine, responsible for interpreting input, planning, and decision-making. In modern agent architectures, this component is typically a large language model.
Model selection depends on:
Task complexity: Some tasks are straightforward, such as extracting values from text or generating summaries. For these, smaller models such as Mistral or Gemma may be sufficient. More complex tasks that require planning, judgment, or creativity tend to benefit from more capable models like GPT-4 or Claude Opus.
Latency requirements: In scenarios where users expect fast responses, such as live chats or interactive tools, we choose models that are optimized for low-latency performance. This helps maintain a smooth user experience.
Cost constraints: Running large models repeatedly can be expensive. When building agents that handle frequent requests, we consider cost per query, and may opt for more efficient models when possible. In some designs, we mix models by using smaller ones for routine steps and calling larger models only when needed.
Context window size: Agents that need to handle long conversations or documents require models that support extended context windows. Models like Claude 3 Sonnet and Gemini 1.5 are designed for these situations, and help the agent retain more relevant information.
The chosen model determines how effectively the agent can interpret inputs, reason through ambiguity, and plan multi-step actions.
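The trade-offs above can be captured in a simple routing function that picks a model tier per request. This is a minimal sketch: the model names and decision rules are illustrative placeholders, not recommendations.

```python
# Hypothetical model router: choose a model tier from task traits.
# Tier names and thresholds are illustrative, not real model IDs.
def select_model(task_complexity: str,
                 needs_long_context: bool = False,
                 latency_sensitive: bool = False) -> str:
    if needs_long_context:
        return "long-context-model"     # extended context window required
    if task_complexity == "complex":
        return "large-capable-model"    # planning, judgment, creativity
    if latency_sensitive:
        return "small-fast-model"       # low latency for interactive use
    return "small-fast-model"           # default to the cheaper tier

print(select_model("complex"))                          # large-capable-model
print(select_model("simple", latency_sensitive=True))   # small-fast-model
```

A mixed design like the one described above would call this router on every step, reserving the large model for the few steps that genuinely need it.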
Tools
Tools extend the agent’s capabilities beyond internal reasoning. While the model decides what to do, tools allow the agent to interact with external systems.
Tools may include:
APIs: To enable the agent to interact with external services such as weather systems, payment gateways, or messaging platforms.
Databases: To allow retrieval and storage of structured information such as user profiles, transaction history, or system records.
Search engines: To provide access to up-to-date or domain-specific information beyond the model’s training data.
Calculators: To perform precise numerical computations that require accuracy beyond language-based estimation.
File systems: To read, write, and manage documents or structured files as part of larger workflows.
For example, an agent answering a weather query may call a weather API rather than relying on its internal knowledge.
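One common way to wire tools to the model is a registry that maps tool names to functions; the model emits a tool name plus arguments, and the agent dispatches the call. The sketch below stubs out the weather API and uses a deliberately restricted calculator; all names are assumptions for illustration.

```python
# Minimal tool registry: the model picks a tool name, the agent dispatches it.
TOOLS = {}

def tool(fn):
    """Register a function as an agent tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> str:
    # Stub standing in for a real weather API call.
    return f"Sunny in {city}"

@tool
def calculator(expression: str) -> float:
    # Precise arithmetic the model should not estimate in text.
    # Demo only: real agents should use a proper sandboxed evaluator.
    return eval(expression, {"__builtins__": {}})

def dispatch(tool_name: str, **kwargs):
    """Execute the tool the model selected and return its observation."""
    return TOOLS[tool_name](**kwargs)

print(dispatch("get_weather", city="Paris"))      # Sunny in Paris
print(dispatch("calculator", expression="2 * 21"))  # 42
```

The observation returned by `dispatch` is fed back into the model's context, which is what allows the agent to reason about the tool result rather than its internal knowledge.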
Instructions
Instructions define how the agent should behave, including role, constraints, and decision policies. These may include:
A natural language prompt that sets the task.
A system message that defines the agent’s role and personality.
A set of examples that demonstrate how to handle different scenarios.
Constraints, such as ethical boundaries or formatting rules.
Task definitions, such as “summarize,” “extract entities,” or “use tool A if X is true.”
Carefully crafting these instructions through natural-language prompts and system messages is known as prompt engineering, a critical skill in designing effective LLM-powered agents.
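In practice, these instruction elements are often assembled into a single prompt at runtime. The sketch below shows one plausible layout; the role text, rules, and examples are placeholders, and real systems typically pass the system message and examples as separate structured messages rather than one string.

```python
# Sketch of assembling agent instructions into one prompt string.
# Role, rules, and examples here are illustrative placeholders.
def build_prompt(role: str, constraints: list[str],
                 examples: list[tuple[str, str]], task: str) -> str:
    lines = [f"System: {role}"]
    lines += [f"Rule: {c}" for c in constraints]
    for user, reply in examples:              # few-shot demonstrations
        lines += [f"User: {user}", f"Assistant: {reply}"]
    lines.append(f"Task: {task}")
    return "\n".join(lines)

prompt = build_prompt(
    role="You are a concise support agent.",
    constraints=["Answer in one sentence.", "Never reveal internal IDs."],
    examples=[("Reset my password?", "Use the 'Forgot password' link.")],
    task="Summarize the user's last three messages.",
)
print(prompt)
```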
The agent control loop
Model, tools, and instructions define the core components, but an agent operates through a control loop that coordinates their interaction.
In this architecture, user input flows into the model, which may retrieve memory, call tools, and generate actions. The system then updates memory and continues operating as needed.
An agent operates as a continuous cycle:
Perceive: Receive input from the user or environment.
Recall: Retrieve relevant context or stored knowledge.
Reason: Decide the next action using the model.
Act: Execute a response or call a tool.
Store: Update memory based on the outcome.
This loop differentiates agents from single model invocations. Instead of generating a single response, an agent evaluates outcomes, updates its decisions, and continues toward its objective. This iterative structure enables autonomy. Each cycle refines context, improves decision quality, and maintains continuity over time.
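The five stages above can be sketched as a single step function. This is a deliberately minimal illustration: `reason` is a scripted stand-in for an LLM call, and memory is a plain list rather than a real store.

```python
# Minimal perceive-recall-reason-act-store loop with a stubbed "model".
memory: list[str] = []

def perceive(raw_input: str) -> str:
    return raw_input.strip()                 # normalize incoming input

def recall(query: str, k: int = 3) -> list[str]:
    return memory[-k:]                       # naive: last k stored items

def reason(observation: str, context: list[str]) -> str:
    # Stand-in for the model deciding the next action.
    return f"respond:{observation}"

def act(decision: str) -> str:
    kind, _, payload = decision.partition(":")
    return f"Echo: {payload}" if kind == "respond" else payload

def store(observation: str, outcome: str) -> None:
    memory.append(f"{observation} -> {outcome}")

def agent_step(raw_input: str) -> str:
    obs = perceive(raw_input)
    ctx = recall(obs)
    decision = reason(obs, ctx)
    outcome = act(decision)
    store(obs, outcome)                      # outcome feeds future recalls
    return outcome

print(agent_step("hello"))   # Echo: hello
```

Running `agent_step` repeatedly is what turns a single model invocation into an agent: each cycle leaves something in memory that the next cycle can recall.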
Memory systems
Agents require memory to operate effectively across multiple interactions and over time. Without memory, an agent behaves like a stateless model, responding only to immediate input.
Agents typically rely on three forms of memory:
Short-term memory: This form of memory holds recent context, typically within a single session or interaction. It allows the agent to keep track of what was just said or done. For example, in a multi-turn conversation, the agent remembers what the user asked two messages ago.
Long-term memory: This memory persists across sessions. It stores important information that the agent might need to refer back to in the future. For example, users’ preferences or settings, or notes about past decisions or outcomes.
External memory: This refers to the agent’s ability to look up relevant information from outside sources, such as documents, knowledge bases, or APIs.
External memory systems typically use semantic retrieval. Data is encoded as vector embeddings and stored in a vector database. On new input, the agent retrieves relevant context from the database before processing. This integration allows agents to maintain continuity, personalize behavior, and improve decision-making over time.
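A toy version of that retrieve-then-process step can be shown with bag-of-words vectors and cosine similarity standing in for learned embeddings and a vector database; real systems use trained embedding models, but the ranking mechanics are the same.

```python
# Toy semantic retrieval: bag-of-words vectors + cosine similarity
# stand in for learned embeddings and a vector database.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

corpus = ["reset your password via email",
          "billing happens monthly",
          "agents use tools to call APIs"]
index = [(doc, embed(doc)) for doc in corpus]   # the "vector database"

def retrieve(query: str, k: int = 1) -> list[str]:
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("how do I reset my password"))
```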
Memory is not an isolated module. It is integrated into the perceive–recall–reason–act–store loop and affects both current decisions and future behavior.
Orchestration patterns
As tasks grow more complex, a single reasoning step is often insufficient. Agents must coordinate tool usage, planning, and decision-making across multiple steps. This coordination logic is known as orchestration.
Single-agent orchestration patterns
Modern agentic systems commonly use the following orchestration patterns:
Tool-calling loop: The agent reasons about whether a tool is needed, invokes it, observes the result, and continues reasoning.
ReAct (reason and act): The agent alternates between reasoning steps and tool execution in an iterative loop, refining its decisions dynamically.
Plan-and-execute: The agent first generates a structured multi-step plan and then executes each step sequentially.
These patterns allow agents to move beyond single-step responses and handle structured workflows.
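A ReAct-style loop can be sketched as alternating decide/act steps until the model emits a final answer. Here `decide` is a scripted stand-in for the reasoning model and `lookup` a stub tool; the bounded step count is a common safeguard against runaway loops.

```python
# Sketch of a ReAct-style loop: alternate reasoning and tool use
# until the "model" emits a final answer. decide() is a scripted
# stand-in for an LLM; lookup() is a stub tool.
def decide(question: str, observations: list[str]) -> tuple[str, str]:
    """Return (action, argument); 'finish' ends the loop."""
    if not observations:
        return ("lookup", question)          # first: gather evidence
    return ("finish", observations[-1])      # then: answer from evidence

def lookup(query: str) -> str:
    return {"capital of France": "Paris"}.get(query, "unknown")

def react(question: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):               # bound guards against runaways
        action, arg = decide(question, observations)
        if action == "finish":
            return arg
        observations.append(lookup(arg))     # act, feed result back in
    return "gave up"

print(react("capital of France"))  # Paris
```

A plan-and-execute agent differs mainly in ordering: it would produce the full list of `(action, argument)` steps up front and then run them in sequence.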
Multi-agent coordination
In more complex systems, orchestration involves multiple specialized agents coordinating tasks. Common coordination patterns include:
Manager–worker pattern: A central agent decomposes tasks and delegates them to specialized worker agents.
Decentralized handoff pattern: Agents transfer control among themselves based on context and expertise.
Multi-agent architectures improve modularity and specialization but introduce coordination complexity. This orchestration layer determines how intelligence is structured across time, tasks, and agents.
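The manager-worker pattern can be illustrated with a manager that runs a fixed two-step plan over stubbed workers. The roles and outputs below are invented for illustration; a real manager would generate the plan with an LLM and pass richer messages between agents.

```python
# Manager-worker sketch: a manager decomposes a task and delegates
# subtasks to specialized workers. Worker logic is stubbed.
def research_worker(topic: str) -> str:
    return f"notes on {topic}"

def writing_worker(notes: str) -> str:
    return f"draft based on {notes}"

WORKERS = {"research": research_worker, "write": writing_worker}

def manager(task: str) -> str:
    # Fixed two-step plan; a real manager would plan dynamically.
    plan = [("research", task), ("write", None)]
    result = None
    for role, arg in plan:
        # None means "use the previous worker's output as input".
        result = WORKERS[role](arg if arg is not None else result)
    return result

print(manager("agent memory"))
```

In a decentralized handoff pattern there is no `manager`; each worker would decide for itself which peer to pass `result` to next.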
Note: For a deeper exploration of agent orchestration patterns and multi-agent design strategies, refer to our dedicated course on Agentic Design Patterns.
Choosing between single-agent and multi-agent architectures
Not all problems require multiple agents. The architectural choice depends on task scope, required specialization, and coordination complexity. A single-agent architecture is sufficient when:
The task scope is narrow
Tool usage is limited
Decision-making remains centralized
Coordination overhead would add unnecessary complexity
A multi-agent architecture becomes beneficial when:
Tasks require specialized expertise
Workflows are modular and decomposable
Oversight, arbitration, or role separation is necessary
Scalability demands distributed coordination
Multi-agent systems improve modularity and separation of concerns but introduce additional communication and synchronization overhead.
Framework support
Several frameworks provide abstractions for implementing these orchestration patterns in production systems. Examples include LangChain, LangGraph, CrewAI, and AutoGen.
These frameworks standardize agent loops, tool integration, memory management, and multi-agent coordination, which reduces development overhead and supports rapid experimentation. Together, architectural components, memory systems, and orchestration patterns define the core structure of modern agent-based systems. The next lesson covers how to make these systems safe, reliable, and production-ready.