To monitor and measure the performance of Amazon Bedrock requests, use Amazon CloudWatch, custom logging, and AWS X-Ray. These tools provide visibility into metrics like response times, token usage, and error rates. Here’s how to implement them effectively:
1. Use Amazon CloudWatch for Metrics and Alarms
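Amazon Bedrock publishes runtime metrics to CloudWatch automatically under the AWS/Bedrock namespace, including Invocations, InvocationLatency, InvocationClientErrors, InvocationServerErrors, and InvocationThrottles. No extra setup is needed to enable them; view them in the CloudWatch console or query them via the AWS SDK. For example, InvocationLatency measures the time, in milliseconds, from sending a request to receiving a response. Token usage is available through the InputTokenCount and OutputTokenCount metrics, and the InvokeModel response also reports token counts in its HTTP headers; some models (e.g., Anthropic Claude) additionally include them in the response body. If you need a breakdown the built-in metrics don’t provide (e.g., per feature or per tenant), log these values as custom CloudWatch metrics using the PutMetricData API. Set up CloudWatch Alarms to notify you when latency exceeds a threshold (e.g., 5 seconds, which is 5000 in the metric’s millisecond unit) or error rates spike.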
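As a rough sketch of the custom-metric and alarm setup described above, the helpers below build the request parameters for the CloudWatch put_metric_data and put_metric_alarm calls. The MyApp/Bedrock namespace, the Feature dimension, the alarm naming scheme, the period and evaluation settings, and the helper names themselves are all hypothetical and chosen only for illustration. The client is passed in as an argument so the payload logic can be checked without AWS credentials.

```python
from typing import Any


def build_token_metrics(model_id: str, feature: str,
                        input_tokens: int, output_tokens: int) -> list[dict[str, Any]]:
    """Build MetricData entries for CloudWatch PutMetricData."""
    dimensions = [
        {"Name": "ModelId", "Value": model_id},
        {"Name": "Feature", "Value": feature},  # hypothetical custom dimension
    ]
    return [
        {"MetricName": "InputTokens", "Dimensions": dimensions,
         "Value": float(input_tokens), "Unit": "Count"},
        {"MetricName": "OutputTokens", "Dimensions": dimensions,
         "Value": float(output_tokens), "Unit": "Count"},
    ]


def build_latency_alarm(model_id: str, threshold_ms: float = 5000.0) -> dict[str, Any]:
    """Build keyword arguments for put_metric_alarm on Bedrock latency."""
    return {
        "AlarmName": f"bedrock-latency-{model_id}",  # hypothetical naming scheme
        "Namespace": "AWS/Bedrock",
        "MetricName": "InvocationLatency",  # reported in milliseconds
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Average",
        "Period": 60,              # seconds per evaluation window (assumed)
        "EvaluationPeriods": 3,    # consecutive windows before alarming (assumed)
        "Threshold": threshold_ms,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def publish(cloudwatch: Any, model_id: str, feature: str,
            input_tokens: int, output_tokens: int) -> None:
    """Send token metrics and create or update the latency alarm."""
    cloudwatch.put_metric_data(
        Namespace="MyApp/Bedrock",  # hypothetical custom namespace
        MetricData=build_token_metrics(model_id, feature, input_tokens, output_tokens),
    )
    cloudwatch.put_metric_alarm(**build_latency_alarm(model_id))
```

In practice the cloudwatch argument would be boto3.client("cloudwatch"). To actually receive notifications, the alarm parameters would also need an AlarmActions entry, for example an SNS topic ARN.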
2. Implement Custom Logging and Tracing
Log detailed request/response data by wrapping Bedrock API calls in code that records timestamps, tokens used, and errors. For instance, in Python:
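import time
import boto3
# invoke_model is served by the "bedrock-runtime" client, not the "bedrock" control-plane client
client = boto3.client("bedrock-runtime")
start_time = time.perf_counter()
response = client.invoke_model(...)
latency = time.perf_counter() - start_time
# InvokeModel reports token counts in HTTP response headers
headers = response["ResponseMetadata"]["HTTPHeaders"]
input_tokens = headers.get("x-amzn-bedrock-input-token-count")
output_tokens = headers.get("x-amzn-bedrock-output-token-count")
# Log to CloudWatch Logs or a third-party tool
print(f"Latency: {latency:.3f}s, input tokens: {input_tokens}, output tokens: {output_tokens}")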
Use AWS X-Ray to trace Bedrock requests in distributed systems. Instrument your code with the X-Ray SDK to visualize latency breakdowns and identify bottlenecks (e.g., slow model inference or network delays).
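To also capture failures, the wrapper idea from this section can be sketched as a small reusable helper. This is a minimal illustration, not a definitive implementation: CallRecord and timed_invoke are hypothetical names, and the helper assumes the wrapped callable returns a boto3-style response dictionary whose HTTP headers carry the Bedrock token counts. The callable is injected, so the helper can be exercised with a stub instead of a real client.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class CallRecord:
    latency_s: float
    input_tokens: Optional[int]
    output_tokens: Optional[int]
    error: Optional[str]


def timed_invoke(invoke: Callable[..., dict],
                 **kwargs: Any) -> tuple[Optional[dict], CallRecord]:
    """Call invoke(**kwargs), measuring latency and capturing the error type on failure."""
    start = time.perf_counter()
    try:
        response = invoke(**kwargs)
    except Exception as exc:
        # botocore's ClientError carries the service error code in exc.response;
        # for any other exception, fall back to the exception class name.
        details = getattr(exc, "response", None)
        code = details.get("Error", {}).get("Code") if isinstance(details, dict) else None
        latency = time.perf_counter() - start
        return None, CallRecord(latency, None, None, code or type(exc).__name__)

    latency = time.perf_counter() - start
    headers = response.get("ResponseMetadata", {}).get("HTTPHeaders", {})
    in_tok = headers.get("x-amzn-bedrock-input-token-count")
    out_tok = headers.get("x-amzn-bedrock-output-token-count")
    record = CallRecord(
        latency_s=latency,
        input_tokens=int(in_tok) if in_tok is not None else None,
        output_tokens=int(out_tok) if out_tok is not None else None,
        error=None,
    )
    return response, record
```

A call might look like timed_invoke(client.invoke_model, modelId=..., body=...), with the returned record written to CloudWatch Logs. The helper swallows the exception so that the failure is always recorded; a caller that needs the original error to propagate would re-raise after logging.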
3. Analyze Error Rates and Costs
Monitor Bedrock’s InvocationClientErrors, InvocationServerErrors, and InvocationThrottles metrics in CloudWatch. These metrics do not break errors down by exception type, so log the exception name in your application to distinguish cases such as ThrottlingException or ModelTimeoutException. Use AWS CloudTrail to audit API-level errors (e.g., permission issues). For cost tracking, correlate token usage with Bedrock’s pricing model (e.g., per 1,000 tokens). If token counts aren’t available, estimate them by counting characters in the input/output text and dividing by the average number of characters per token (roughly 4 for English text is a common rule of thumb, but it varies by model). Use AWS Cost Explorer to track monthly expenses and validate against your logged metrics.
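The token and cost estimate described above can be sketched in a few lines. The function names are hypothetical, the default of 4 characters per token is only a rough heuristic, and the prices in the usage lines are placeholders rather than real Bedrock rates. The sketch also assumes input and output tokens are priced separately per 1,000 tokens, which should be checked against the pricing page for your model and Region.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: character count divided by average characters per token."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimated cost, in the same currency as the per-1,000-token prices."""
    return ((input_tokens / 1000) * input_price_per_1k
            + (output_tokens / 1000) * output_price_per_1k)


# Placeholder prices for illustration only; substitute the real rates for your model.
prompt_tokens = estimate_tokens("Summarize the quarterly report in three bullet points.")
cost = estimate_cost(prompt_tokens, 200, input_price_per_1k=0.003, output_price_per_1k=0.015)
print(f"~{prompt_tokens} input tokens, estimated cost {cost:.6f}")
```

Because the character heuristic is approximate, prefer the token counts Bedrock reports whenever they are available, and use the estimate only as a fallback or for pre-request budgeting.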
By combining CloudWatch, custom logging, and X-Ray, you gain actionable insights into Bedrock performance. Adjust alarms and logging based on your application’s specific thresholds (e.g., stricter latency requirements for real-time apps).