Introduction to Bedrock Guardrails

Explore how to implement and configure Amazon Bedrock Guardrails to secure generative AI applications. Understand content filters, denied topics, word filters, and PII redaction for protecting users and maintaining compliance. Learn testing workflows to ensure safety policies are effective before production deployment.

We'll cover the following...

Content filters and sensitivity thresholds
- Threshold levels and the false positive trade-off
Denied topics and word filters
- Writing effective denied topic descriptions
- Word filters for exact-match blocking
PII redaction for data protection
Applying Guardrails and testing with the API
- Integration points
- Testing with the ApplyGuardrail API
Conclusion

Generative AI applications often face a core trade-off: the same open-ended generation and task planning that make foundation models useful can also make outputs harder to control. A model that drafts legal summaries can also generate unsupported case citations. A chatbot that handles customer queries can also expose sensitive data if the application does not defend against prompt injection or unsafe retrieval. Guardrails for Amazon Bedrock help address these risks by providing a managed policy enforcement layer that evaluates requests and responses during the model invocation flow. When enabled for an invocation, guardrails evaluate both inputs and outputs against configured policies for content filters, denied topics, sensitive information, and contextual grounding.

A useful analogy is an airport security checkpoint. When guardrails are enabled, each passenger (or prompt) is screened before boarding or reaching the model. Each piece of luggage, or response, is inspected before leaving the terminal or reaching the user. The checkpoint is enforced outside the prompt itself, making it harder to bypass the prompt solely by changing its wording.

When a rule is violated, Guardrails can take one of three actions depending on your configuration. It can block the content entirely and return a predefined message to the user. It can mask sensitive information by replacing detected entities with placeholders while allowing the rest of the content through. Or it can log the violation for audit without interrupting the interaction. Because enforcement occurs server-side within the AWS infrastructure, these policies remain effective even against sophisticated prompt-injection attacks that attempt to override system instructions.

Note: Guardrails are the AWS-recommended approach for implementing safety measures across all Bedrock integration patterns, including Agents, Knowledge Bases, and direct model invocations via the Converse API.

The full feature set spans four policy modules that work together to provide defense-in-depth. Content filters classify harmful content across six categories. Denied topics block entire subject areas using natural language descriptions. Word filters catch specific prohibited terms. PII redaction detects and protects sensitive data patterns. Each module can be configured independently, and the following sections break down how to use them effectively.

The diagram below illustrates how these four modules are enforced at both the input and output stages of every model invocation.

1.Introduction

2.Prompt Engineering and Model Selection

Cloud Lab

Cloud Lab

3.Customizing Models and Knowledge Retrieval

Cloud Lab

Cloud Lab

4.Building AI Agents with Amazon Bedrock

Cloud Lab

Cloud Lab

Cloud Lab

Cloud Lab

Cloud Lab

5.Integrating Bedrock with the AWS Ecosystem

Cloud Lab

Cloud Lab

Cloud Lab

6.Amazon Bedrock AgentCore and Production Agent Pipelines

Cloud Lab

7.Security and Responsible AI in Bedrock

Cloud Lab

Cloud Lab

8.Conclusion

Introduction to Bedrock Guardrails

Content filters and sensitivity thresholds