AI Features

Implementation and Integration I

Understand how to design and implement scalable, efficient Generative AI applications on AWS. Learn to use Amazon Bedrock for streaming responses, manage human-in-the-loop workflows with Step Functions, and deploy multi-agent systems with safeguards to optimize performance and reliability.

Question 21

A global e-commerce company is launching a “live assistant” chat feature in its mobile app, powered by Amazon Bedrock, to answer questions about orders, returns, and shipping.

The product team requires:

  • A typing-style experience (token-by-token response streaming).

  • First-token latency under 300 ms for most requests.

  • Automatic scaling during traffic spikes (for example, promotions).

  • Minimal infrastructure management.

  • End-to-end latency tracing when customers report slow responses.

Which solution best satisfies these requirements?

A. Use Amazon API Gateway (REST API) with AWS Lambda, invoking Amazon Bedrock, and return the full response only after inference completes.

B. Use Amazon API Gateway (WebSocket API) with AWS Lambda that invokes ...
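The token-by-token streaming the scenario asks about can be sketched as follows. This is a minimal illustration, not a full solution: it assumes a Lambda behind an API Gateway WebSocket API, uses Bedrock's `InvokeModelWithResponseStream` API, and assumes an Anthropic Messages-style request/chunk format. The model ID, endpoint URL, and connection ID are placeholders.

```python
import json


def build_request_body(prompt: str, max_tokens: int = 512) -> str:
    """Build an Anthropic Messages API request body for Bedrock
    (format assumed from the Anthropic-on-Bedrock convention)."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })


def stream_to_connection(prompt: str, connection_id: str, ws_endpoint: str,
                         model_id: str = "anthropic.claude-3-haiku-20240307-v1:0"):
    """Invoke Bedrock with response streaming and relay each text delta
    to the caller's WebSocket connection as soon as it arrives, giving
    the 'typing' experience and low first-token latency."""
    import boto3  # imported lazily so the helpers above work without the SDK

    bedrock = boto3.client("bedrock-runtime")
    apigw = boto3.client("apigatewaymanagementapi", endpoint_url=ws_endpoint)

    response = bedrock.invoke_model_with_response_stream(
        modelId=model_id,
        body=build_request_body(prompt),
    )
    # response["body"] is an event stream; each event carries a JSON chunk.
    for event in response["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            token = chunk["delta"].get("text", "")
            if token:
                apigw.post_to_connection(
                    ConnectionId=connection_id,
                    Data=token.encode("utf-8"),
                )
```

Relaying deltas over `post_to_connection` is what distinguishes the WebSocket approach from a REST API that buffers the full response; for end-to-end latency tracing, X-Ray would typically be enabled on the Lambda and API Gateway stage.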