
Configuring AWS Lambda for production

1. Configuring AWS Lambda for production

Welcome back. In the last chapter we shaped the application and connected it with events. Now we make the compute production-ready, starting with AWS Lambda, which runs your code without managing servers. In this video, you'll learn how to configure Lambda memory, timeout, and concurrency, package shared code with layers and extensions, connect a function to a private VPC, and handle failures with destinations and dead-letter queues. Let's get started.

2. It worked in test

The function passed every test. In production it starts timing out, then throttling, and the team scrambles. Nothing in the code changed; the configuration was never tuned for real load. Production-readiness for Lambda is mostly configuration, and this video walks through the settings that decide whether a function scales.

3. Lambda configuration that matters in production

Four settings decide whether a Lambda function survives production load. Memory is the big one: raising it also raises CPU and network proportionally, so a function can get faster and cheaper at higher memory. Timeout sets how long an invocation may run before Lambda kills it. Ephemeral storage is temporary scratch space in /tmp. Concurrency controls how many copies run at once. Get these right and the function scales; get them wrong and it throttles, times out, or overspends.
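Here is a minimal sketch of how the first three settings could be checked before you apply them. The dictionary keys follow the parameters of boto3's `update_function_configuration` call; the function name `orders-handler`, the helper names, and the chosen values are hypothetical, and the limits reflect Lambda's documented ranges at the time of writing. Concurrency is set through a separate API, so it is left out here.

```python
# Allowed ranges, assumed from Lambda's documented limits at the time of writing.
MEMORY_MB = (128, 10240)
TIMEOUT_S = (1, 900)
EPHEMERAL_MB = (512, 10240)


def _check(name, value, bounds):
    """Return value unchanged, or raise if it falls outside bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside {low}..{high}")
    return value


def build_function_config(function_name, memory_mb, timeout_s, ephemeral_mb=512):
    """Build keyword arguments shaped for update_function_configuration."""
    return {
        "FunctionName": function_name,
        "MemorySize": _check("memory_mb", memory_mb, MEMORY_MB),
        "Timeout": _check("timeout_s", timeout_s, TIMEOUT_S),
        "EphemeralStorage": {
            "Size": _check("ephemeral_mb", ephemeral_mb, EPHEMERAL_MB)
        },
    }


# "orders-handler" is a hypothetical function name.
config = build_function_config("orders-handler", memory_mb=1024, timeout_s=30)
print(config)
```

With boto3 installed and credentials configured, you could pass the result as `client.update_function_configuration(**config)`. Validating locally first means a bad value fails before anything is deployed.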

4. Memory sizing and the CPU link

The single most misunderstood Lambda setting is memory. People treat it as just RAM, but Lambda allocates CPU and network bandwidth in proportion to memory. So a CPU-bound function set to 128 megabytes might run slowly, while the same function at 1 gigabyte finishes several times faster. Because you pay for memory multiplied by duration, faster can actually be cheaper, even at a higher memory price per millisecond. Trace the curve on the right: duration drops steeply as you add memory, then flattens, while total cost sinks to a low point before climbing again. That bottom of the U is the setting you want. The takeaway: you right-size memory by measuring duration at a few settings, not by guessing that the smallest number saves money.
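To make the U-shaped cost curve concrete, here is a small sketch that turns measured durations into cost per invocation and picks the cheapest memory setting. The durations are invented for illustration, and the price per GB-second is an assumption, so check current pricing for your region and architecture. The per-request charge is ignored because it does not change with memory.

```python
# Assumed price per GB-second; verify against current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667

# Hypothetical measurements: memory in MB -> average duration in milliseconds.
measurements = {
    128: 3000,
    256: 1400,
    512: 650,
    1024: 330,
    2048: 300,
}


def cost_per_invocation(memory_mb, duration_ms):
    """Compute cost = memory in GB x duration in seconds x price per GB-second."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND


def cheapest_memory(samples):
    """Return the memory setting with the lowest cost per invocation."""
    return min(samples, key=lambda mb: cost_per_invocation(mb, samples[mb]))


for memory_mb, duration_ms in measurements.items():
    cost = cost_per_invocation(memory_mb, duration_ms)
    print(f"{memory_mb:>5} MB  {duration_ms:>5} ms  ${cost:.10f}")

print("cheapest:", cheapest_memory(measurements))
```

With these made-up numbers the cheapest setting is 512 megabytes, not 128. Past that point duration barely improves, so each extra megabyte only adds cost.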

5. Concurrency: reserved vs provisioned

Concurrency comes in two flavors that solve opposite problems. Reserved concurrency caps how many instances of a function run at once, which protects a fragile downstream database and stops one function from starving the rest of the account. Provisioned concurrency does the reverse: it keeps a set number of instances initialized and warm, so requests skip the cold start, the delay while AWS spins up a fresh instance. Warm instances cost money even when idle, so you reserve provisioned concurrency for latency-sensitive, user-facing functions. Reserved caps the maximum; provisioned guarantees a warm minimum.
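One way to picture the two settings is a toy model of a single burst of simultaneous requests. This is a deliberate simplification, not how Lambda schedules work internally: it assumes every request arrives at the same moment and that provisioned concurrency never exceeds the reserved cap. The function name `split_load` and the numbers are hypothetical.

```python
def split_load(concurrent_requests, reserved, provisioned):
    """Split a burst into warm, on-demand, and throttled requests.

    warm:      served by pre-initialized (provisioned) instances
    on_demand: served by instances that may need a cold start
    throttled: rejected because the reserved cap is reached
    """
    if provisioned > reserved:
        raise ValueError("provisioned concurrency cannot exceed the reserved cap")
    served = min(concurrent_requests, reserved)
    warm = min(served, provisioned)
    return {
        "warm": warm,
        "on_demand": served - warm,
        "throttled": concurrent_requests - served,
    }


print(split_load(120, reserved=100, provisioned=20))
print(split_load(10, reserved=100, provisioned=20))
```

In the first call the reserved cap of 100 throttles 20 requests, and only the first 20 served requests land on warm instances. In the second call the burst fits entirely inside the warm pool.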

6. Packaging configuration and shared code

You do not bake configuration into your code. Environment variables hold per-environment values like a table name or a feature flag, so the same artifact runs in dev and prod. Lambda layers package shared libraries and code that many functions reuse, so you upload a dependency once instead of bundling it into every function. Extensions run alongside your function to add capabilities like monitoring agents or secrets caching. Together these keep your deployment package small, your configuration external, and your shared code in one place.
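As a quick illustration of external configuration, here is a sketch of a handler that reads its settings from environment variables. The variable names `TABLE_NAME` and `FEATURE_NEW_CHECKOUT`, and the shape of the return value, are hypothetical; the point is only that nothing environment-specific lives in the code.

```python
import os


def load_config(env=None):
    """Read per-environment settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    table_name = env.get("TABLE_NAME")
    if not table_name:
        # Fail loudly rather than fall back to a hard-coded table.
        raise RuntimeError("TABLE_NAME is not set")
    flag = env.get("FEATURE_NEW_CHECKOUT", "false").lower() == "true"
    return {"table_name": table_name, "new_checkout": flag}


def handler(event, context):
    """Lambda entry point; the same artifact runs in dev and prod."""
    config = load_config()
    return {"table": config["table_name"], "new_checkout": config["new_checkout"]}
```

Passing the mapping in as a parameter keeps `load_config` easy to test: in a unit test you can call `load_config({"TABLE_NAME": "orders-dev"})` without touching the real environment.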

7. Connecting Lambda to a VPC

To reach private resources, like a database in a private subnet, you connect the function to your VPC, your own private network inside AWS. Lambda creates elastic network interfaces, or ENIs, in the subnets you choose, and security groups control what the function can talk to, just as they do for an EC2 instance. One catch: in a VPC the function loses default internet access. For calls to the public internet you route through a NAT gateway, and for supported AWS services you can use a VPC endpoint instead. VPC access is powerful but adds networking you must plan for.
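In configuration terms, attaching a function to a VPC comes down to a list of subnets and a list of security groups. The sketch below builds that fragment in the shape boto3's `update_function_configuration` expects for its `VpcConfig` parameter. The subnet and security group IDs are placeholders, and the helper name is hypothetical.

```python
def build_vpc_config(subnet_ids, security_group_ids):
    """Build the VpcConfig fragment for a function configuration update."""
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    if not security_group_ids:
        raise ValueError("at least one security group is required")
    return {
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
    }


# Placeholder IDs; substitute the private subnets and security group you own.
vpc_config = build_vpc_config(
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    security_group_ids=["sg-cccc3333"],
)
print(vpc_config)
```

Note that this fragment says nothing about outbound routing. Whether the function can reach the internet depends on the route tables of those subnets, which is where the NAT gateway or VPC endpoint comes in.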

8. Handling failures with destinations and DLQs

Asynchronous invocations fail sometimes, and you design for it so nothing is lost. By default, Lambda retries a failed async invocation twice. For richer handling, Lambda Destinations route the result: successful events go one place and failed events another, such as an SQS queue or an EventBridge bus. A dead-letter queue is the simpler, older mechanism: it catches events that exhaust all retries so you can inspect them later. The rule is that a failed event should always land somewhere you can see it, never disappear silently.
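Here is a minimal sketch of what a destinations setup could look like as configuration. The keys follow the parameters of boto3's `put_function_event_invoke_config` call; the function name, the ARNs, and the helper name are hypothetical placeholders.

```python
def build_async_failure_config(function_name, on_failure_arn,
                               on_success_arn=None, max_retries=2):
    """Build keyword arguments shaped for put_function_event_invoke_config."""
    if max_retries not in (0, 1, 2):
        raise ValueError("async retries must be 0, 1, or 2")
    destinations = {"OnFailure": {"Destination": on_failure_arn}}
    if on_success_arn is not None:
        destinations["OnSuccess"] = {"Destination": on_success_arn}
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": max_retries,
        "DestinationConfig": destinations,
    }


# Placeholder ARNs: failures go to an SQS queue, successes to an event bus.
invoke_config = build_async_failure_config(
    "orders-handler",
    on_failure_arn="arn:aws:sqs:us-east-1:123456789012:orders-failed",
    on_success_arn="arn:aws:events:us-east-1:123456789012:event-bus/orders",
)
print(invoke_config)
```

The failure destination is a required argument on purpose: the helper cannot produce a configuration in which a failed event has nowhere to land.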

9. Let's practice!

Time to tune a function for production. Let's practice!
