Amazon Bedrock is an AWS service that lets you use generative AI models without the complexity of training, hosting, and scaling them yourself. With Bedrock, you can focus on integrating these models into your applications and workflows.
In this Cloud Lab, you will configure two Amazon S3 buckets and use them with Amazon Bedrock to run a model evaluation job. One bucket will hold the input dataset, and the other will store the evaluation results. You will evaluate the performance of a generative AI model using Amazon Bedrock’s question-answer task type. The evaluation will focus on the accuracy metric, which measures how well the model’s responses align with the expected answers. For this task type, accuracy is calculated as the F1 score, the harmonic mean of precision (how much of the model’s response is correct) and recall (how much of the expected answer the response covers).
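To make the F1 score concrete, here is a minimal Python sketch of a token-overlap F1 between a model response and an expected answer. It is an illustration only: the function name `token_f1` is hypothetical, and the lowercase-and-split-on-whitespace tokenization is an assumption, so the exact normalization Amazon Bedrock applies may differ.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a model response and the expected answer.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Count tokens shared by both sides, respecting duplicates.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)  # share of the response that is correct
    recall = overlap / len(ref_tokens)      # share of the expected answer covered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("Paris", "paris"))
    print(token_f1("The capital is Paris", "Paris"))
```

In the second call, the response contains the expected answer, so recall is 1.0, but only one of its four tokens matches, so precision is 0.25. The F1 score penalizes the extra words even though the answer is present.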
In addition to the question-answer task type, Amazon Bedrock also supports several other task types, including:
General text generation
Text summarization
Text classification
After completing this Cloud Lab, you will have the skills to set up and run model evaluations using Amazon Bedrock. You’ll also learn to configure Amazon S3 buckets to support your AI workflows.
A high-level architecture diagram for this lab is given below: