Search⌘ K
AI Features

Introduction to Bedrock Guardrails

Explore how to implement and configure Amazon Bedrock Guardrails to secure generative AI applications. Understand content filters, denied topics, word filters, and PII redaction for protecting users and maintaining compliance. Learn testing workflows to ensure safety policies are effective before production deployment.

Generative AI applications often face a core trade-off: the same open-ended generation and task planning that make foundation models useful can also make outputs harder to control. A model that drafts legal summaries can also generate unsupported case citations. A chatbot that handles customer queries can also expose sensitive data if the application does not defend against prompt injection or unsafe retrieval. Guardrails for Amazon Bedrock help address these risks by providing a managed policy enforcement layer that evaluates requests and responses during the model invocation flow. When enabled for an invocation, guardrails evaluate both inputs and outputs against configured policies for content filters, denied topics, sensitive information, and contextual grounding.

A useful analogy is an airport security checkpoint. When guardrails are enabled, each passenger (or prompt) is screened before boarding or reaching the model. Each piece of luggage, or response, is inspected before leaving the terminal or reaching the user. The checkpoint is enforced outside the prompt itself, making it harder to bypass the prompt solely by changing its wording.

When a rule is violated, Guardrails can take one of three actions depending on your configuration. It can block the content entirely and return a predefined message to the user. It can mask sensitive information by replacing detected entities with placeholders while allowing the rest of the content through. Or it can log the violation for audit without interrupting the interaction. Because enforcement occurs server-side within the AWS infrastructure, these policies remain effective even against sophisticated prompt-injection attacks that attempt to override system instructions.

Note: Guardrails are the AWS-recommended approach for implementing safety measures across all Bedrock integration patterns, including Agents, Knowledge Bases, and direct model invocations via the Converse API.

The full feature set spans four policy modules that work together to provide defense-in-depth. Content filters classify harmful content across six categories. Denied topics block entire subject areas using natural language descriptions. Word filters catch specific prohibited terms. PII redaction detects and protects sensitive data patterns. Each module can be configured independently, and the following sections break down how to use them effectively.

The diagram below illustrates how these four modules are enforced at both the input and output stages of every model invocation.

Guardrails application to input and output
Guardrails application to input and output

With this architecture in mind, the next step is understanding how each policy module works and how to configure it for your application’s risk profile.

Content filters and sensitivity thresholds

Bedrock Guardrails evaluate content against six harm categories, each targeting a distinct class of unsafe material: ...