Developers often waste time bouncing between monitoring screens. When something breaks, they hop from CloudWatch to X-Ray, from raw logs to alarms, losing focus and slowing down. This scattered data makes figuring out what's wrong difficult and inflates the dreaded Mean Time To Resolution (MTTR).
The solution isn't more dashboards; it's intelligence in your terminal.
The Model Context Protocol (MCP) servers for CloudWatch and Application Signals are local agents that surface rich, detailed operational data directly in your developer tools (such as your IDE or CLI) through plain natural-language questions.
This newsletter walks you through the Model Context Protocol (MCP) servers, the AI bridge directly connecting CloudWatch and Application Signals data to your CLI. We’ll specifically discuss:
What CloudWatch and Application Signals MCP servers are and how they differ.
The core mechanism behind natural language queries transforming into deep AWS insights.
How to deploy a sample application using Amazon Q CLI.
A hands-on demonstration of rapid incident triage using Amazon Q CLI and MCP servers.
The MCP is an open protocol that standardizes how AI applications access external tools. AWS uses specialized MCP servers to interface with its observability services:
| MCP Server | Primary Focus | Primary Use |
| --- | --- | --- |
| CloudWatch MCP Server | Resource-level metrics, logs, and alarms (e.g., CPU, memory, log streams) | Analyzing raw telemetry, complex log patterns (via Logs Insights), and general infrastructure health |
| Application Signals MCP Server | Service-level health, SLO compliance, and distributed tracing (via X-Ray) | Understanding application performance, identifying bottlenecks, and root-cause analysis against business goals |
The workflow highlights how the MCP acts as a powerful translation layer, eliminating manual API structuring and data correlation.
Natural language query (Client): The user enters a descriptive prompt in the Amazon Q CLI (e.g., “What is the health status of my PaymentService?”).
Intelligent routing (Amazon Q): Amazon Q identifies the query’s intent (monitoring service health) and selects the appropriate tool, the Application Signals MCP server. This routing decision is based on the request’s semantic meaning.
API translation and execution (MCP server):
The MCP server receives the request and executes a sequence of highly optimized, built-in AWS API calls (e.g., applicationsignals:ListServiceLevelObjectives and xray:GetTraceSummaries).
This translation is critical as it converts a single, vague question into multiple, precise, and structured API requests. These handle the required parameters like time ranges, namespaces, and dimensions automatically.
Data correlation and return (MCP server and LLM):
The MCP server collects the raw JSON responses from AWS (e.g., metric data, trace IDs).
It then passes this structured data, along with its defined tool schema, to the underlying large language model (LLM).
The LLM synthesizes the recent errors found in the traces and generates a human-readable summary with key metrics and actionable findings.
In the illustration above, the GetMetrics function uses the applicationsignals:ListServiceLevelObjectives API, and the GetTraces function uses the xray:GetTraceSummaries API.
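To make the translation step concrete, here is a hypothetical sketch (not the server's actual code) of how one vague question about a service's health could expand into multiple fully parameterized API requests, with time ranges filled in automatically. The function name and parameter shapes are illustrative:

```python
from datetime import datetime, timedelta, timezone


def translate_query(service_name: str, window_minutes: int = 60) -> list[dict]:
    """Expand one natural-language question into precise, structured API requests."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=window_minutes)
    return [
        {
            # Maps to applicationsignals:ListServiceLevelObjectives
            "operation": "ListServiceLevelObjectives",
            "params": {"KeyAttributes": {"Name": service_name}},
        },
        {
            # Maps to xray:GetTraceSummaries
            "operation": "GetTraceSummaries",
            "params": {
                "StartTime": start,
                "EndTime": end,
                "FilterExpression": f'service("{service_name}")',
            },
        },
    ]


requests = translate_query("PaymentService", window_minutes=5)
```

The key point is that the user never supplies time ranges, filter expressions, or operation names; the translation layer derives all of them from intent.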
Before diving into the hands-on work, let's prepare an environment that ensures the Model Context Protocol (MCP) servers can seamlessly connect CloudWatch and Application Signals data to your CLI. You'll configure AWS credentials, create an MCP configuration file, and enable the required MCP servers.
Note: Make sure that the Amazon Q Developer CLI is already installed in your local system.
The MCP servers need permission to access AWS services such as CloudWatch, Lambda, and the Cloud Control API. To do this, you’ll set up AWS credentials through a named profile.
Note: The AWS Command-Line Interface (CLI) is required for this setup. If it is not already installed, please refer to the official AWS documentation for installation instructions.
To configure the AWS credentials, execute the following command:
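The standard AWS CLI command for this is shown below. The profile name is illustrative; omit the --profile flag to configure the default profile instead.

```shell
# Create (or update) a named AWS credentials profile
aws configure --profile mcp-demo
```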
You will be prompted to enter an access key ID, secret access key, default region, and output format. For the region, you can safely enter us-east-1.
Note: Need help creating keys? Follow this step-by-step guide on generating AWS access keys.
Ensure that your AWS CLI credentials have the permissions for Lambda, IAM, CloudWatch, Application Signals, and the Cloud Control API (for deployment).
Now that your credentials are ready, the next step is to enable the MCP servers. These act as the communication bridge between your local environment and AWS observability services.
Before enabling them, you’ll need to create a configuration file named mcp.json.
The mcp.json file is a local configuration file that defines which MCP servers should be active in your environment. Each server corresponds to a specific AWS service integration, for example, CloudWatch, Application Signals, and Cloud Control API.
This file allows the Amazon Q Developer CLI to automatically discover and connect to these services, making observability data accessible directly from your command line. Without this configuration, the MCP servers remain disabled and can’t stream data to your CLI tools.
Start by creating the ~/.aws/amazonq directory using this command:
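The directory can be created with:

```shell
# Create the Amazon Q configuration directory (no error if it already exists)
mkdir -p ~/.aws/amazonq
```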
Note: Ensure that nano is installed on your terminal before running the command below. You can install it using your package manager (sudo apt install nano for Ubuntu/Debian or brew install nano for macOS).
Next, create the mcp.json file using the command given below:
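Assuming nano as the editor (any text editor works):

```shell
# Create and open the MCP configuration file for editing
nano ~/.aws/amazonq/mcp.json
```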
This command will create the file and open it for editing.
Paste the configurations given below in your terminal and save the changes to the mcp.json file:
```json
{
  "mcpServers": [
    { "name": "cloudwatch", "enabled": true },
    { "name": "application-signals", "enabled": true },
    { "name": "cloud-control-api", "enabled": true }
  ]
}
```
The mcp.json file is the configuration blueprint that tells the Amazon Q Developer CLI which Model Context Protocol (MCP) servers to activate.
Each entry represents a specific AWS integration.
CloudWatch: Streams monitoring metrics and logs.
Application Signals: Provides telemetry data for applications.
Cloud Control API: Enables dynamic control and management of AWS resources.
By setting "enabled": true for each service, you ensure that your local CLI environment can automatically fetch and interact with real-time AWS observability data.
Before we deploy and monitor an application, let’s first create a simple AWS Lambda function that mimics a real-world payment processing service. This function intentionally introduces random delays and simulated faults to generate meaningful observability data.
The core logic injects two types of behaviors:
Critical faults (10%): Represents dependency failures or unexpected crashes.
High latency (20%): Simulates slow external service calls or database queries.
This variability ensures that CloudWatch and Application Signals can capture a realistic range of performance metrics, logs, and errors.
Create a file named lambda_function.py locally. Add the following Python code to the file:
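A minimal implementation consistent with the behavior described above is sketched below. The handler structure, variable names, and log wording are illustrative, except the "Payment processing failed: Critical dependency unavailable" error message, which is the one we'll search for in the logs later. The line references in the walkthrough that follows correspond to this listing.

```python
import json
import logging
import random
import time

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Simulated payment service with injected faults and latency."""
    amount = event.get("amount", 0)
    # Critical faults (10%): dependency failures or crashes
    if random.random() < 0.10:
        msg = "Payment processing failed: Critical dependency unavailable"
        logger.error(msg)
        # Surface the fault to Lambda as an unhandled error
        raise RuntimeError(msg)


    # High latency (20%): slow external calls or database queries
    if random.random() < 0.20:
        # Artificial delay between 2 and 4 seconds to emulate
        # slow downstream dependencies
        delay = random.uniform(2, 4)
        logger.warning("High latency injected: %.2f seconds", delay)
        time.sleep(delay)

    # Normal processing flow: return a successful 200 response
    logger.info("Payment of %s processed successfully", amount)
    body = json.dumps({"status": "success", "amount": amount})
    return {"statusCode": 200, "body": body}
```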
Lines 13–18: Simulate a 10% failure rate to represent service crashes or dependency faults.
Lines 21–27: Add artificial latency between 2–4 seconds to emulate slow responses.
Lines 29–32: Handle the normal processing flow and return a successful 200 response.
We'll start by deploying our application with the Amazon Q CLI. First, enable Amazon Q's chat mode using the following command:
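The chat mode is started with:

```shell
# Start an interactive Amazon Q Developer CLI chat session
q chat
```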
Once the chat mode is active, you can interact with Amazon Q to automate deployment and configuration tasks.
deploy the PaymentServiceLambda function with the python3.13 runtime to AWS using the local lambda_function.py file, ensuring Application Signals and X-Ray tracing are enabled for it. Also create the necessary IAM role, and make sure no LabBoundaryPolicy restricts Application Signals or X-Ray tracing.
Amazon Q uses the Cloud Control API MCP server to carry out this deployment, creating the Lambda function, its IAM role, and the tracing configuration described in the prompt.
Leave your Q CLI chat session open. In a separate local terminal, run the invocation script for five minutes.
```shell
export LAMBDA_NAME=PaymentServiceLambda

# Run 300 invocations over 5 minutes (one per second, in the background).
# AWS CLI v2 needs --cli-binary-format to accept a raw JSON payload.
for i in {1..300}; do
  aws lambda invoke \
    --function-name "$LAMBDA_NAME" \
    --cli-binary-format raw-in-base64-out \
    --payload '{"amount": 100}' \
    response.json &
  sleep 1
done

echo "Traffic generation finished. Data is now available for Q."
```
Return to your Amazon Q CLI chat session. The service is now failing, and we’ll triage it in four commands.
The first step in any incident is to quantify the user impact by checking the service-level objectives (SLOs).
Prompt: what is the current health and SLO status for the PaymentServiceLambda?
Output:
The Application Signals MCP server processes this request, correlating performance and configuration data from multiple sources, and instantly returns a diagnosis with the following results.
Recent performance: The last 5 minutes show 40 total invocations with 10 total errors, resulting in a critical error rate of 25%. The max duration is 3,000ms (3 seconds), which suggests timeouts are occurring (matching your simulated delay).
SLO failure: Under current SLO status, the service is flagged with a critical failure. The current error rate of 25% is 250x higher than the 0.1% target, and the high latency is causing timeouts.
This step confirms a major business failure (SLO breach) instantly, leveraging the Application Signals MCP to provide the exact metrics (rate and duration) and the severity (critical failure) needed to prioritize the incident. This immediately directs the developer away from general infrastructure checks toward the application code.
Note: The invocation counts, error rates, and duration metrics displayed are dynamically generated by the code's random fault injection and will fluctuate with each test run. However, the diagnostic pattern remains constant.
Now that the failure is confirmed, we use distributed tracing to look inside the application and find the cause of the poor performance.
Prompt: query the sampled traces for PaymentServiceLambda faults in the last 5 minutes and summarize the errors.
Output:
The Application Signals MCP server analyzes 40+ sampled traces from the PaymentServiceLambda and provides a detailed performance summary.
Trace summary: The query results show HasFault: true = 0 and HasError: true = 6, confirming that no Lambda service-level faults occurred. Instead, about 15% of traces recorded application-level timeout errors: the function logic ran but exceeded the configured timeout threshold.
Trace duration analysis: Error traces lasted between 3.008s and 4.004s, matching the Lambda’s 3-second timeout configuration. This pattern directly aligns with the simulated delay intentionally introduced in the code, revealing the reason behind the timeouts.
Performance insights: Most successful traces completed within 10–100 ms, while error traces took 3+ seconds, resulting in an overall 15% error rate and an SLO breach. The longer execution times are tied to application-level latency, not infrastructure issues.
The MCP server correlates X-Ray data with function execution details, pinpointing the root cause. This is a timeout triggered by the interaction between the Lambda’s configuration limit and the simulated delay in the code.
To confirm the exact nature of the faults identified in the traces, we pivot to the raw logs using the CloudWatch MCP Server.
Prompt: search the logs for PaymentServiceLambda for all 'Payment processing failed' messages and provide a summary.
Output:
The CloudWatch MCP server successfully executed a complex Logs Insights query and synthesized the findings as mentioned below.
Total events and pattern: A total of 12 errors were found, all sharing the consistent message, “Payment processing failed: Critical dependency unavailable” at the ERROR log level.
Chronology: The errors were heavily concentrated in the peak error period (13:14-13:16 UTC), showing 9 failures in a short span.
Analysis: The tool confirms that the simulation aligns with the expected 10% fault injection and notes that the errors are spread across multiple log streams. This indicates that the failure occurred across different Lambda execution environments.
The CloudWatch MCP converts natural language into a precise Logs Insights query, eliminating manual console filtering. It provides the exact error message and timeline breakdown needed for the developer to replicate and fix the specific dependency issue.
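For reference, the generated query is roughly equivalent to a hand-written Logs Insights query like the one below (the exact fields and limit are illustrative):

```
fields @timestamp, @message, @logStream
| filter @message like /Payment processing failed/
| sort @timestamp desc
| limit 50
```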
The final step is to ensure that this isolated service failure is not part of a larger infrastructure problem (like an entire region having database issues).
Prompt: are there any active alarms on PaymentServiceLambda lambda function?
Output:
The CloudWatch MCP server responds to the infrastructure health check by querying the account’s registered metric and composite alarms. The response indicates that no alarms are currently active for the PaymentServiceLambda function, which means there are no automated alerts set up to detect performance issues or failures. This highlights a gap in proactive monitoring, as the function could experience problems without triggering any notifications.
This result confirms a quick “all-clear” on the wider environment. It’s a crucial finding that allows the developer to confidently focus remediation efforts purely on the Lambda function’s code and configuration. This eliminates the need to waste time investigating external dependencies.
The power of the Model Context Protocol (MCP) servers is clear. You just performed a full, four-step incident triage, from high-level business impact to low-level log analysis, in minutes, without ever leaving your terminal or switching context.
This integration transforms observability from a manual, console-based task into an intelligent, conversational workflow. It brings DevOps into a new era, where your AI assistant becomes the fastest first responder to production issues.
Ultimately, the Model Context Protocol (MCP) servers deliver a tangible business advantage. This system results in faster incident triage because it reduces Mean Time To Resolution (MTTR). By putting all relevant contextual data directly at the developer’s fingertips, the solution fosters enhanced team collaboration. It also enables engineers to make smarter, faster decisions about application health and remediation.
For more on this topic, explore the following cloud labs: