Getting started with Amazon CloudWatch

1. Getting started with Amazon CloudWatch

Welcome to this course on monitoring and troubleshooting AWS. I'm John, Principal Consultant at Broch Solutions.

2. Course overview

This course builds across three chapters. Chapter 1 covers the key terminology, what CloudWatch is, and how metrics and built-in dashboards surface issues. Chapter 2 moves into logs for deeper insight, alarm conditions, and sending alerts through SNS and SQS. Chapter 3 adds distributed tracing with AWS X-Ray and ties everything together in application health dashboards.

3. Prerequisites

A few things will help you get the most from this course. Be comfortable navigating the AWS Management Console and familiar with the core services we'll monitor: EC2, the virtual servers you run; Lambda, which runs your code without managing servers; and DynamoDB, AWS's managed NoSQL database. A little command-line and JSON familiarity rounds it out. You don't need to be an expert, just comfortable with the fundamentals.

4. Definitions

Let's clarify two terms that are often confused. Think of the car you drive. Monitoring is the dashboard warning light: it tells you something is wrong, such as an app that has gone offline or a query that is running too long. Observability is opening the hood to find out why, tracing the fault back to its root cause. Monitoring tells you the system broke; observability tells you why.

5. Monitoring and observability characteristics

This table lines the two up side by side. Their purpose differs first: monitoring detects known problems, while observability helps you understand unknown ones. That drives the approach: predefined metrics and alerts for monitoring, open-ended exploratory analysis for observability. Monitoring asks "is it broken?"; observability asks "why?". Monitoring is scoped to specific components; observability looks at end-to-end behavior. The data reflects that too: monitoring leans on metrics and logs, and observability adds traces and events on top.

6. The troubleshooting workflow

Monitoring isn't just a set of features; it's an operational loop you run whenever something breaks. It starts when metrics and alarms detect a problem. You triage with dashboards to see scope and impact, then diagnose with logs and traces to find the cause. You ship a fix, then verify the system has recovered, often by watching the service map turn green. Keep this loop in mind, because it's what the whole course builds toward: Chapters 1 and 2 give you detection and diagnosis, and Chapter 3 adds tracing and the health dashboards that tie it together.

7. What is Amazon CloudWatch

At its core, Amazon CloudWatch is AWS’s native platform for monitoring and observability.

8. What is Amazon CloudWatch

It's a central place for both monitoring and observability, where you collect, view, and analyze operational data across your AWS environment, and it powers many of the monitoring experiences embedded in other AWS services.

9. What is Amazon CloudWatch

CloudWatch is made up of multiple capabilities you can use to meet your monitoring and observability needs as they grow and change.

10. What is Amazon CloudWatch

This includes event monitoring, tracked over time or as individual events.

11. What is Amazon CloudWatch

For applications spanning multiple AWS accounts, all of this can be centrally managed with cross-account monitoring.

12. What is Amazon CloudWatch

It also captures and analyzes logs, collects metrics across your services, and brings them together in dashboards.

13. What is Amazon CloudWatch

Finally, rules and alarms build on all of these to trigger activity in response to a situation.

14. The CloudWatch console

The CloudWatch console is where we create dashboards, view metric and log data, and configure CloudWatch capabilities.

15. CloudWatch metrics

From the AWS console, you can navigate to CloudWatch and select All Metrics in the left panel to browse everything collected. Let's look at DynamoDB table metrics, specifically consumed read and write capacity. Expand the time window, spot the spike, then zoom in to examine it.
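The same look at consumed capacity can be sketched outside the console. The sketch below assumes the boto3 SDK and its get_metric_statistics call; the table name Orders, the helper names, and the sample datapoints are hypothetical and only there to illustrate the "spot the spike" step.

```python
from datetime import datetime, timedelta, timezone


def consumed_capacity_request(table_name, metric_name, end, hours=24, period=300):
    """Build get_metric_statistics parameters for one DynamoDB table metric."""
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": metric_name,  # e.g. ConsumedReadCapacityUnits
        "Dimensions": [{"Name": "TableName", "Value": table_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds covered by each datapoint
        "Statistics": ["Sum"],
    }


def find_spike(datapoints):
    """Return the datapoint with the largest Sum, or None if there are none."""
    return max(datapoints, key=lambda d: d["Sum"], default=None)


def fetch(params):
    """Call CloudWatch; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the helpers above work without the SDK
    return boto3.client("cloudwatch").get_metric_statistics(**params)["Datapoints"]


end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
params = consumed_capacity_request("Orders", "ConsumedReadCapacityUnits", end)

# Made-up datapoints in the shape the API returns, to show the spike lookup.
sample = [
    {"Timestamp": end - timedelta(minutes=15), "Sum": 120.0},
    {"Timestamp": end - timedelta(minutes=10), "Sum": 950.0},
    {"Timestamp": end - timedelta(minutes=5), "Sum": 130.0},
]
spike = find_spike(sample)
```

With credentials in place, find_spike(fetch(params)) would run the same lookup against real data; shrinking hours and period is the code equivalent of zooming in on the graph.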

16. Custom metrics

Custom metrics let you capture application-specific performance and behavior data. Publish them via the CLI (the command-line interface), a REST API, or an SDK (a software development kit for your language). Both examples use the put-metric-data command. The first publishes an ActiveUsers count with two dimensions, InstanceId and InstanceType, so we can filter and group it later; you can attach up to thirty dimensions per metric. The second sends several data points in one go, each with its own timestamp, which is how you batch or backfill readings. The takeaway: anything that can call the CLI can publish a metric, and dimensions are what make it searchable.
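The two CLI examples from the slide are not reproduced in this transcript. As a rough SDK equivalent, the sketch below builds the same two kinds of request in Python, assuming boto3's put_metric_data call. The namespace MyApp/Demo, the instance ID and type, the counts, and the helper names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def active_users_datum(count, instance_id, instance_type, timestamp=None):
    """Build one MetricData entry for an ActiveUsers count with two dimensions."""
    datum = {
        "MetricName": "ActiveUsers",
        "Dimensions": [
            {"Name": "InstanceId", "Value": instance_id},
            {"Name": "InstanceType", "Value": instance_type},
        ],
        "Value": float(count),
        "Unit": "Count",
    }
    if timestamp is not None:
        datum["Timestamp"] = timestamp  # omitted means "time of receipt"
    return datum


def backfill_batch(counts, start, step_minutes, instance_id, instance_type):
    """Build several entries, each with its own timestamp, for one request."""
    return [
        active_users_datum(c, instance_id, instance_type,
                           start + timedelta(minutes=i * step_minutes))
        for i, c in enumerate(counts)
    ]


def publish(metric_data, namespace="MyApp/Demo"):
    """Send the entries to CloudWatch; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builders above work without the SDK
    client = boto3.client("cloudwatch")
    client.put_metric_data(Namespace=namespace, MetricData=metric_data)


# Hypothetical values for illustration only.
single = [active_users_datum(42, "i-0123456789abcdef0", "t3.micro")]
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
batch = backfill_batch([10, 12, 9], start, 5, "i-0123456789abcdef0", "t3.micro")
# publish(single); publish(batch)  # uncomment with credentials configured
```

The fixed 2024 timestamp is only illustrative: CloudWatch accepts timestamps within a limited window around the current time, so a real backfill would use recent values.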

17. Let's practice!

Enough talk! Let's dive into exploring the AWS console.
