
Lesson 4.1 Video

1. X-Ray concepts and architecture

We've covered metrics, logs, and alarms. Now we move into distributed tracing with AWS X-Ray: what it is, how it traces requests across services, and what makes up a trace.

2. The challenge of distributed systems

Debugging a monolith is straightforward: the flow is linear, contained in one process. Distributed systems are different. A single request might touch multiple Lambda functions, databases, and external APIs. When it fails, which service caused it? It's like untangling a knot where twenty threads are tied together. That's what X-Ray solves.

3. What is AWS X-Ray?

X-Ray follows a request through every service it touches, measuring time at each step and detecting errors wherever they occur. It visualizes the flow as a service map, a live diagram of your architecture. Without it, diagnosing a slow request means correlating logs across services by hand; with X-Ray, you see the complete picture.

4. Five core capabilities

X-Ray has five capabilities: request tracing for end-to-end visibility, performance analysis with latency percentiles, error detection that pinpoints failures, the service map with color-coded health, and Insights for automatic anomaly detection. Together they answer the hard questions: where's the bottleneck, where did the error start, what's the end-to-end latency?

5. How tracing works

Each service creates a segment tagged with the same trace ID, like a parcel tracking number that every warehouse scans, giving you the whole journey in one place. When one service calls another, it passes that ID in an HTTP header. Segments go to the local daemon, which forwards them, and X-Ray assembles them into one unified trace.

6. Trace ID and header propagation

The trace ID packs a version, a Unix timestamp in hexadecimal, and a 96-bit identifier, and propagates in the X-Amzn-Trace-Id header. Root is the trace ID, identical across the chain; Parent builds the call graph; Sampled tells downstream services whether to trace the request. So how does this reach X-Ray?
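As a minimal sketch of the format just described, the snippet below builds a trace ID and reads the three header fields. The helper names (`new_trace_id`, `parse_trace_header`, `child_header`) are hypothetical and not part of any SDK; in practice the X-Ray SDK does this for you.

```python
import os
import time


def new_trace_id(now=None):
    """Build a trace ID: version 1, epoch seconds as 8 hex digits, 96 random bits."""
    epoch = int(time.time() if now is None else now)
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def parse_trace_header(value):
    """Split an X-Amzn-Trace-Id header value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            fields[key] = val
    return fields


def child_header(incoming, segment_id):
    """Header for a downstream call: same Root, the caller's segment ID as Parent."""
    fields = parse_trace_header(incoming)
    sampled = fields.get("Sampled", "0")
    return f"Root={fields['Root']};Parent={segment_id};Sampled={sampled}"


incoming = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
print(parse_trace_header(incoming)["Root"])
print(child_header(incoming, "aaaabbbbccccdddd"))
```

Root stays the same on every hop, while Parent is rewritten by each caller, which is how the call graph can be rebuilt later.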

7. The X-Ray daemon

The X-Ray daemon listens on UDP port 2000. Your app sends segments to it via the SDK; the daemon batches them, forwards them to the X-Ray API, and handles retries, decoupling your app from the service. It runs automatically on Lambda, as a service on EC2, and as a sidecar on ECS.
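To make the hand-off concrete, here is a rough sketch of what the SDK does on your behalf: it serializes a segment as JSON, prefixes the daemon's format header, and sends one UDP datagram to port 2000. The function names are hypothetical, and the segment carries only a minimal set of fields.

```python
import json
import os
import socket

DAEMON_ADDRESS = ("127.0.0.1", 2000)  # the daemon's default UDP endpoint
DAEMON_HEADER = '{"format": "json", "version": 1}'


def build_segment(name, trace_id, start, end):
    """Minimal segment document; real segments usually carry more fields."""
    return {
        "name": name,
        "id": os.urandom(8).hex(),  # 64-bit segment ID as 16 hex digits
        "trace_id": trace_id,
        "start_time": start,
        "end_time": end,
    }


def encode_datagram(segment):
    """Daemon header on the first line, the segment JSON on the second."""
    return (DAEMON_HEADER + "\n" + json.dumps(segment)).encode("utf-8")


def send_to_daemon(segment, address=DAEMON_ADDRESS):
    """Fire-and-forget UDP send; the app never waits on the X-Ray API."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_datagram(segment), address)


segment = build_segment(
    "OrderService", "1-5759e988-bd862e3fe1be46a994272793", 1465510280.0, 1465510280.35
)
print(encode_datagram(segment).decode("utf-8"))
```

Because UDP is connectionless, `send_to_daemon` returns immediately whether or not the X-Ray API is reachable; batching and retries are the daemon's job.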

8. Sampling

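Sampling controls how much data you record, keeping what's actionable without overspending. It's like a quality inspector on a production line: they don't check every item, just a representative sample, and problems still show up. The default rule traces the first request each second plus 5% of the rest. Custom rules set three fields: fixed_target, the number of requests traced each second before the rate applies; rate, the fraction sampled beyond that; and priority, which decides the winner when several rules match. The sampling decision is made when the request arrives, before the response status is known, so rules match on request properties such as service name, HTTP method, and URL path. A typical strategy samples 100% of a critical route and 5% of normal traffic.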
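The default rule can be sketched as a per-second reservoir followed by a fixed rate. This is an illustration of the idea only, not the SDK's implementation; the `Sampler` class and its injectable `rng` are assumptions made so the example is easy to test.

```python
import random


class Sampler:
    """Trace the first `fixed_target` requests each second, then `rate` of the rest."""

    def __init__(self, fixed_target=1, rate=0.05, rng=None):
        self.fixed_target = fixed_target
        self.rate = rate
        self.rng = rng or random.Random()
        self._second = None  # the wall-clock second currently being counted
        self._taken = 0      # reservoir slots used in that second

    def should_sample(self, now):
        second = int(now)
        if second != self._second:
            # A new second started: refill the reservoir.
            self._second = second
            self._taken = 0
        if self._taken < self.fixed_target:
            self._taken += 1
            return True
        # Reservoir exhausted: fall back to the percentage rate.
        return self.rng.random() < self.rate


sampler = Sampler(fixed_target=1, rate=0.05)
decisions = [sampler.should_sample(100.0 + i / 1000) for i in range(1000)]
print(sum(decisions), "of 1000 requests in one second were traced")
```

The reservoir guarantees at least one trace per second even on quiet services, while the rate keeps cost roughly proportional to traffic on busy ones.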

9. Segments

A segment is the fundamental building block of a trace. It records service name, timestamps, trace ID, HTTP details, and AWS region. There are five states: in-progress, ok for success, error for 4xx, fault for 5xx, and throttle for 429. These are color-coded in the console, making problems easy to spot.
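The status-to-state mapping above can be sketched as two small functions. Both helper names are hypothetical, and setting both `throttle` and `error` for a 429 (since it is also a 4xx) is an assumption of this sketch.

```python
def segment_flags(status_code):
    """Translate an HTTP status code into the boolean flags stored on a segment."""
    flags = {}
    if status_code == 429:
        flags["throttle"] = True
    if 400 <= status_code < 500:
        flags["error"] = True
    elif 500 <= status_code < 600:
        flags["fault"] = True
    return flags


def segment_state(segment):
    """Collapse a segment's flags into one of the five states from the lesson."""
    if segment.get("in_progress"):
        return "in-progress"
    if segment.get("fault"):
        return "fault"
    if segment.get("throttle"):
        return "throttle"
    if segment.get("error"):
        return "error"
    return "ok"


for code in (200, 404, 429, 503):
    print(code, segment_state(segment_flags(code)))
```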

10. Subsegments

Subsegments give granular timing within a segment. Without them you only know the total duration; with them you can see which downstream call is the bottleneck. Three namespaces: aws for AWS service calls, remote for external HTTP, and local for custom code. The SDK creates aws and remote subsegments automatically; local ones you create manually.
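A local subsegment is essentially a named timer attached to its parent segment. The context manager below is a hand-rolled sketch of that idea using plain dictionaries; it is not the SDK's API, and the `subsegment` helper and its `clock` parameter are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def subsegment(parent, name, namespace="local", clock=time.time):
    """Time a block of code and record it under the parent segment."""
    record = {"name": name, "namespace": namespace, "start_time": clock()}
    try:
        yield record
    finally:
        # Always close the subsegment, even if the wrapped code raised.
        record["end_time"] = clock()
        parent.setdefault("subsegments", []).append(record)


segment = {"name": "OrderService"}
with subsegment(segment, "validate_order"):
    sum(range(100_000))  # stand-in for custom business logic

for sub in segment["subsegments"]:
    elapsed_ms = (sub["end_time"] - sub["start_time"]) * 1000
    print(sub["namespace"], sub["name"], f"{elapsed_ms:.2f} ms")
```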

11. Annotations

Annotations are indexed key-value pairs, searchable in the console, like labels on filing cabinet drawers that help you find the right folder fast. Filter by user_id, version, or environment. The limit is 50 indexed annotations per trace, and values must be simple types: strings, numbers, or Booleans.

12. Metadata

Unlike annotations, metadata is not indexed and can be any JSON structure; its size is bounded only by the 64 KB limit on a segment document. Use it for request bodies, error details, and business data. Annotations are the label on the drawer, how you find the folder; metadata is what's inside, the detail you read once you've found it. Annotations are your WHERE clause, metadata is your SELECT clause.
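The WHERE and SELECT analogy can be sketched with plain dictionaries: filter on annotations, then read the metadata of whatever matched. The helpers `annotate`, `add_metadata`, and `find` are hypothetical, and the 50-annotation cap is checked per segment here for simplicity, whereas the limit itself applies per trace.

```python
MAX_ANNOTATIONS = 50
ANNOTATION_TYPES = (str, int, float, bool)  # simple types only


def annotate(segment, key, value):
    """Add an indexed annotation, rejecting values that are not simple types."""
    if not isinstance(value, ANNOTATION_TYPES):
        raise TypeError(f"annotation {key!r} must be a string, number, or Boolean")
    annotations = segment.setdefault("annotations", {})
    if key not in annotations and len(annotations) >= MAX_ANNOTATIONS:
        raise ValueError("annotation limit reached")
    annotations[key] = value


def add_metadata(segment, key, value):
    """Attach arbitrary JSON-like detail; it is stored but never indexed."""
    segment.setdefault("metadata", {})[key] = value


def find(segments, **wanted):
    """The WHERE clause: match segments on annotation values only."""
    return [
        s for s in segments
        if all(s.get("annotations", {}).get(k) == v for k, v in wanted.items())
    ]


a = {"name": "OrderService"}
annotate(a, "user_id", "u-42")
add_metadata(a, "order", {"items": [{"sku": "A1", "qty": 2}], "total": 19.98})

b = {"name": "OrderService"}
annotate(b, "user_id", "u-7")

for match in find([a, b], user_id="u-42"):
    print(match["metadata"]["order"])  # the SELECT clause: read the detail
```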

13. Complete trace structure

Here the trace has four segments: API Gateway, OrderService, PaymentService, and InventoryService. The external payment API at 350 milliseconds is the slowest operation, so it's your optimization target. Annotations like user_id make the trace searchable; metadata holds the detailed business data.
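Finding the optimization target amounts to taking the maximum duration across all subsegments. In the sketch below, only the four service names and the 350 millisecond payment call come from the example; the other subsegment names and durations are invented for illustration.

```python
# Hypothetical trace: everything except the 350 ms payment call is made up.
TRACE = [
    {"name": "API Gateway", "subsegments": []},
    {"name": "OrderService", "subsegments": [
        {"name": "DynamoDB", "namespace": "aws", "duration_ms": 25},
    ]},
    {"name": "PaymentService", "subsegments": [
        {"name": "external payment API", "namespace": "remote", "duration_ms": 350},
    ]},
    {"name": "InventoryService", "subsegments": [
        {"name": "reserve_stock", "namespace": "local", "duration_ms": 40},
    ]},
]


def slowest_operation(trace):
    """Return (duration_ms, segment name, subsegment name) for the slowest call."""
    candidates = [
        (sub["duration_ms"], seg["name"], sub["name"])
        for seg in trace
        for sub in seg.get("subsegments", [])
    ]
    return max(candidates)


duration, service, operation = slowest_operation(TRACE)
print(f"{service} -> {operation}: {duration} ms")
```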

14. Lesson summary

To recap: X-Ray solves the challenge of debugging distributed systems through trace ID propagation, the lightweight daemon, and configurable sampling. Traces have four components: segments, subsegments, annotations, and metadata. Next, we implement X-Ray tracing in code.

15. Let's practice!

Let's take a closer look at X-Ray concepts.
