AWS CloudWatch for LLM Monitoring: Track Metrics and Set Alerts for Bedrock Models

Ready Tensor · Intermediate · 🧠 Large Language Models · 1mo ago
In this video, we walk through how to use AWS CloudWatch to monitor and alert on Large Language Models deployed on Amazon Bedrock. You'll learn how to observe default LLM metrics such as input tokens, output tokens, and invocation latency, and how to extend CloudWatch with custom metrics such as Time To First Token (TTFT). We also cover setting up alarms and email notifications so you can proactively catch issues in production.

You'll learn how to:
- Monitor Bedrock LLMs using CloudWatch default metrics
- Track input tokens, output tokens, and invocation latency
- Add a custom CloudWatch metric fo…
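The custom-metric idea above can be sketched with boto3: time the first streamed chunk from a Bedrock model and publish it as a TTFT metric via `put_metric_data`. This is a minimal illustration, not the video's exact code; the `Custom/BedrockLLM` namespace, the `ModelId` dimension name, and the model ID are assumptions you would adapt (custom metrics must live outside the reserved `AWS/` namespaces).

```python
import json
import time


def build_ttft_metric(model_id: str, ttft_ms: float) -> dict:
    """Build a PutMetricData payload for a custom TTFT metric.

    Namespace and dimension name are illustrative assumptions, not
    Bedrock defaults -- Bedrock's built-in metrics live under AWS/Bedrock.
    """
    return {
        "Namespace": "Custom/BedrockLLM",  # assumed custom namespace
        "MetricData": [
            {
                "MetricName": "TimeToFirstToken",
                "Dimensions": [{"Name": "ModelId", "Value": model_id}],
                "Value": ttft_ms,
                "Unit": "Milliseconds",
            }
        ],
    }


def measure_and_publish(model_id: str, prompt: str) -> float:
    """Stream a Bedrock response, time the first chunk, publish TTFT."""
    import boto3  # requires AWS credentials and Bedrock model access

    bedrock = boto3.client("bedrock-runtime")
    cloudwatch = boto3.client("cloudwatch")

    start = time.monotonic()
    response = bedrock.invoke_model_with_response_stream(
        modelId=model_id,
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    ttft_ms = 0.0
    for event in response["body"]:
        if "chunk" in event:  # first streamed chunk ~ first token(s)
            ttft_ms = (time.monotonic() - start) * 1000.0
            break
    cloudwatch.put_metric_data(**build_ttft_metric(model_id, ttft_ms))
    return ttft_ms
```

Once published, the metric appears in the CloudWatch console under the custom namespace and can be graphed alongside the default Bedrock metrics.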
Watch on YouTube ↗

Chapters (8)

Overview of CloudWatch for LLM monitoring
0:44 Default Bedrock metrics: tokens and latency
1:22 Tracking Time To First Token (TTFT) with custom metrics
2:27 Viewing metrics by model ID in CloudWatch
3:35 Understanding averages, sums, and P99 metrics
4:33 Adding custom TTFT metrics to the dashboard
5:28 Creating alarms on LLM metrics
7:29 Triggering and validating alerts with SNS
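The alarm-and-SNS workflow from the last two chapters can be sketched with boto3: create an SNS topic with an email subscription, then attach a CloudWatch alarm to Bedrock's built-in `InvocationLatency` metric (`AWS/Bedrock` namespace). The threshold, period, topic name, and email address below are illustrative assumptions, not values from the video.

```python
def build_latency_alarm(model_id: str, topic_arn: str,
                        threshold_ms: float = 5000.0) -> dict:
    """Build PutMetricAlarm kwargs for Bedrock's InvocationLatency metric.

    Threshold and period are illustrative; tune them to your workload.
    """
    return {
        "AlarmName": f"bedrock-latency-{model_id}",
        "Namespace": "AWS/Bedrock",          # Bedrock's default namespace
        "MetricName": "InvocationLatency",   # built-in latency metric
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Average",
        "Period": 300,                       # evaluate over 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold_ms,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],         # notify SNS when breached
        "TreatMissingData": "notBreaching",
    }


def main() -> None:
    import boto3  # requires AWS credentials

    # Create the notification topic and subscribe an email address
    # (the address must confirm the subscription before alerts arrive).
    sns = boto3.client("sns")
    topic_arn = sns.create_topic(Name="bedrock-llm-alerts")["TopicArn"]
    sns.subscribe(TopicArn=topic_arn, Protocol="email",
                  Endpoint="you@example.com")  # assumed address

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_latency_alarm(
        "anthropic.claude-3-haiku-20240307-v1:0", topic_arn))


if __name__ == "__main__":
    main()
```

To validate the alert path end to end, you can temporarily lower the threshold (or use `set_alarm_state` to force the `ALARM` state) and confirm the email arrives.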