
System Design: LLM-Powered Customer Support Bot

Define the requirements and resource estimates for an LLM-powered customer support bot. Learn how functional requirements like RAG-based generation and context-aware dialogue combine with non-functional requirements for low latency and cost efficiency. This foundation prepares learners to design scalable, production-ready conversational AI systems.

Traditional rule-based chatbots break down when user input falls outside predefined scripts, often returning fallback or nonspecific responses. This degrades the user experience and increases reliance on human escalation, which raises operational costs.

An LLM-powered customer support bot addresses this gap by leveraging large language models to interpret intent, maintain conversational context across multiple exchanges, and generate natural responses. Modern production architectures go further by combining LLMs with retrieval-augmented generation (RAG), a technique in which the model retrieves relevant information from an external data source and uses it to generate more accurate responses. RAG grounds every response in company-specific knowledge bases and real-time data rather than relying on the model’s static training data, which significantly reduces hallucination rates and improves response accuracy. In this chapter, we design such a system.
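To make the RAG idea concrete, here is a minimal sketch of the retrieve-then-generate flow. The knowledge base, the keyword-overlap scoring, and the prompt template are all illustrative placeholders, and the final LLM call is omitted; a production system would use embedding-based retrieval and a real model endpoint.

```python
# Minimal RAG sketch: retrieve relevant knowledge-base snippets, then
# assemble a prompt that grounds the LLM's answer in those snippets.
# The documents and scoring below are toy placeholders, not a real KB.

KNOWLEDGE_BASE = [
    "Refunds are issued within 5-7 business days of receiving the return.",
    "Orders can be cancelled free of charge within 24 hours of purchase.",
    "Standard shipping takes 3-5 business days within the continental US.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query.
    A real system would use embedding similarity instead."""
    query_terms = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Constrain the model to the retrieved context rather than
    its static training data."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{joined}\n\n"
        f"Customer question: {query}\n"
    )

context = retrieve("How long do refunds take?", KNOWLEDGE_BASE)
prompt = build_prompt("How long do refunds take?", context)
# `prompt` would now be sent to the LLM endpoint of choice.
```

Because the model is instructed to answer only from the retrieved context, an out-of-scope question yields retrieved snippets with no relevant content, which the model can surface as "I don't know" instead of hallucinating.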

This lesson establishes three things for the LLM-powered customer support bot: its functional requirements, its non-functional requirements, and its resource estimates.

Let’s start with the functional requirements.

Functional requirements

The following functional requirements define the system’s core behavior:

  • Dialogue management: The system must maintain a multi-turn conversation history so that follow-up questions like “What about the other item?” are interpreted correctly within the ongoing session rather than treated as isolated queries.

  • Natural language understanding: The system must interpret user intent, extract key entities such as order IDs and product names, and disambiguate vague queries using contextual cues from the conversation.

  • Response generation: The system must fetch relevant documents from a company’s knowledge base or ...