Implementing a RAG Server Using MCP
Learn how to build a RAG server with MCP, LangChain, and ChromaDB, allowing an agent to answer questions from a knowledge base.
We have successfully taught our agent to use tools that interact with live web services, but what happens when the information it needs isn’t on the public web? LLMs have vast general knowledge, yet they are completely unaware of your company’s internal documents, project plans, or proprietary data. This lesson tackles that fundamental challenge by building a complete retrieval-augmented generation (RAG) pipeline, a system that allows an agent to find and read from specific documents, and exposing this powerful capability as a self-contained MCP server.
Giving our agent a knowledge base
An agent’s ability to use external tools is a powerful starting point, but it’s fundamentally limited by the public nature of most APIs. If we asked it about our company’s internal vacation policy or the procedure for requesting new hardware, it would fail, because this private knowledge wasn’t part of its training data. To address this limitation, we must move beyond basic tool invocation and equip our agent with a dedicated private library: a searchable knowledge base that it can read from to answer questions with greater accuracy and context-awareness. To understand how we can achieve this, let’s consider a more advanced use case.
Scenario: Building a corporate document assistant
Imagine a new employee joins our company and has dozens of questions: “What is the policy on remote work?”, “How many vacation days do I get?”, “What are the company’s core values?” Answering these manually consumes valuable time from HR and team leads. The company has a comprehensive employee_handbook.txt file for this purpose. Here is a glimpse of what the employee handbook file looks like:
Q: What are the official working hours?
A: Our standard working hours are 9:00 AM to 5:00 PM from Monday to Friday, with a one-hour lunch break.
Q: Can I work remotely full-time?
A: Yes, our company supports fully remote roles where applicable. Your manager will confirm if your position qualifies for full-time remote work. Hybrid options are also available.
Q: How do I apply for annual leave?
A: Leave applications can be submitted through the HR portal. Please apply at least 3 days in advance and wait for manager approval.
Q: Is there a dress code while working remotely?
A: No formal dress code is required for remote work. However, employees are encouraged to dress appropriately for video meetings.
Q: How often are team meetings held?
A: Most teams conduct a weekly sync-up meeting. Your manager will share the schedule and meeting format.
Q: How are public holidays observed in remote roles?
A: Public holidays are observed as per your location or local calendar, based on your region and employment agreement.
Q: What should I do if I experience internet issues during work?
A: Notify your manager or team as soon as possible via email or messaging app. Try to reconnect or switch to a backup connection if available.
Q: Are flexible working hours allowed?
A: Yes, flexible working hours are permitted as long as you remain aligned with your team’s availability and complete your assigned tasks.
Q: How do I receive my monthly salary slip?
A: Salary slips are automatically generated and shared via the employee portal at the end of each month.
Q: What tools should I use for remote collaboration?
A: We recommend Slack or Teams for communication, Zoom or Google Meet for video calls, and Trello, Notion, or Confluence for project tracking and documentation.
Q: Can I shift my work hours temporarily due to personal reasons?
A: You may adjust your work hours temporarily with prior approval from your manager, provided it doesn’t affect team coordination or deadlines.
Q: What’s the policy on attending virtual meetings?
A: Employees are expected to attend scheduled meetings on time, actively participate, and turn on cameras when appropriate.
Q: Where can I find the official leave calendar?
A: The official leave and holiday calendar is available on the company’s HR portal under the "Time Off" section.
Q: How is performance reviewed while working remotely?
A: Performance reviews are conducted semi-annually. They are based on goal completion, collaboration, communication, and feedback from peers and managers.
Q: Are there any wellness breaks during the day?
A: Employees are encouraged to take short breaks throughout the day to rest and recharge. Managers may also schedule virtual coffee chats or non-work check-ins.
Q: How do I stay connected with my team remotely?
A: Frequent communication through chat apps, regular check-ins, and project tracking tools help keep everyone aligned and connected.
Q: What’s the company’s stance on overtime?
A: Employees are encouraged to maintain work-life balance. Overtime should only occur when necessary and must be approved by the manager in advance.
Q: How do I raise a technical support request?
A: You can log a ticket via the internal Helpdesk system, or email the IT support team with a description of your issue.
Q: Can I work from a different city or country temporarily?
A: Temporary relocation requests should be discussed with your manager and HR. They depend on time zone overlap, legal considerations, and team needs.
Q: How do I keep track of my tasks while remote?
A: Use your team’s preferred task management tool. Common choices include Jira, Asana, or a shared Google Sheet for simple tracking.
Q: How do I report time off or sick days?
A: Inform your manager directly and log your absence in the HR portal. For sick leave, a medical note may be requested for absences longer than 2 days.
Q: Is there training for remote work tools?
A: Yes, onboarding sessions include tool training, and additional guides are available in the internal knowledge base.
Q: Are my work hours tracked?
A: Work hours are not tracked minute-by-minute, but employees are expected to fulfill their responsibilities and attend key meetings.
Q: What should I do if I miss a meeting?
A: Notify your team, review the meeting notes or recording if available, and follow up on any assigned action items.
Q: Are team-building activities offered remotely?
A: Yes, virtual team-building events such as games, quizzes, and casual chats are held regularly to encourage collaboration and connection.
Q: What’s the best way to ask for feedback remotely?
A: You can schedule one-on-one time with your manager or ask for written feedback on completed projects.
Q: How do I update my personal information in records?
A: Log in to the HR portal and navigate to your profile settings. You can update your address, contact number, or emergency contact info there.
Q: Can I split my workday into two parts?
A: Split shifts may be allowed if they suit your role and team’s coordination. Please discuss with your manager before making any changes.
Q: How do I receive company-wide announcements while remote?
A: Important updates are shared through official email, the HR portal, and sometimes through a pinned message in the company-wide chat channel.
Q: What are some tips to stay productive at home?
A: Maintain a consistent routine, use a dedicated workspace, avoid multitasking, and schedule regular breaks to stay focused and energized.
Expecting new hires to read the handbook cover to cover is unrealistic. Our objective is to build an intelligent assistant that acts as an expert on this handbook. When an employee asks a question, the agent must not answer from its general, pretrained knowledge. Instead, it must find the most relevant section of the handbook and use only that information to construct its answer. The technique for this is retrieval-augmented generation (RAG), which keeps responses grounded in official company policy and makes them verifiable against the source text.
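To make "the most relevant section" concrete, one option is to treat each question-and-answer pair in the handbook as its own section. The snippet below is a minimal sketch of that idea using only Python's standard library. The function name split_into_sections and the two-entry sample string are illustrative assumptions, and a real pipeline would more likely rely on a document loader and text splitter than on a hand-written regular expression.

```python
import re

# Assumption: a two-entry sample in the handbook's "Q: ... A: ..." layout,
# copied from the excerpt above. A real run would read employee_handbook.txt.
sample = (
    "Q: What are the official working hours?\n"
    "A: Our standard working hours are 9:00 AM to 5:00 PM from Monday to Friday, "
    "with a one-hour lunch break.\n"
    "Q: How do I apply for annual leave?\n"
    "A: Leave applications can be submitted through the HR portal. "
    "Please apply at least 3 days in advance and wait for manager approval.\n"
)

def split_into_sections(text):
    """Split handbook text into one dict per Q/A pair (hypothetical helper)."""
    # Capture the text after "Q:" up to "A:", then the answer up to the next
    # "Q:" or the end of the string. DOTALL lets answers span several lines.
    pattern = re.compile(r"Q:\s*(.*?)\s*A:\s*(.*?)\s*(?=Q:|\Z)", re.DOTALL)
    return [{"question": q, "answer": a} for q, a in pattern.findall(text)]

sections = split_into_sections(sample)
for section in sections:
    print(section["question"], "->", section["answer"])
```

Keeping each pair together means a retrieved section always carries both the question and its official answer, which is what the agent needs to respond without falling back on general knowledge.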
A quick look at the RAG workflow
Fundamentally, retrieval-augmented generation (RAG) is a technique that makes LLM responses more reliable and fact-based by connecting the model to an external knowledge source. Instead of letting the LLM answer from its generalized pretrained knowledge, a RAG system first fetches relevant information and provides it to the LLM as direct context for generating a response. Because the model answers from the supplied text rather than from memory, this process reduces hallucinations and keeps the answers grounded in the specific data we provide.
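The retrieve-then-augment idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: word overlap stands in for the embedding-based similarity search that a vector store such as ChromaDB would perform, the names HANDBOOK_CHUNKS, retrieve, and build_prompt are hypothetical, and the final call to an LLM is left out.

```python
import re

# Assumption: a tiny in-memory knowledge base with three entries copied
# from the handbook excerpt. A real system would store many more chunks.
HANDBOOK_CHUNKS = [
    "Q: What are the official working hours? A: Our standard working hours are "
    "9:00 AM to 5:00 PM from Monday to Friday, with a one-hour lunch break.",
    "Q: How do I apply for annual leave? A: Leave applications can be submitted "
    "through the HR portal. Please apply at least 3 days in advance and wait for "
    "manager approval.",
    "Q: How do I raise a technical support request? A: You can log a ticket via "
    "the internal Helpdesk system, or email the IT support team with a "
    "description of your issue.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, k=1):
    """Retrieval step: rank chunks by shared words with the question.

    Word overlap is a stand-in for vector similarity, not a recommendation.
    """
    question_words = tokenize(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Augmentation step: place the retrieved text in front of the question."""
    context = "\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

question = "How do I apply for annual leave?"
top_chunks = retrieve(question, HANDBOOK_CHUNKS)
prompt = build_prompt(question, top_chunks)
print(prompt)  # This string is what would be sent to the LLM.
```

The instruction at the top of the prompt is what restricts the model to the retrieved handbook text, and swapping the overlap score for an embedding similarity search changes only the retrieve function.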
For our document assistant, this workflow consists of two primary stages that happen ...