Optimizing performance
1. Optimizing performance
Welcome back! We learned about Application Map in the last lesson. Let's now learn about optimizing performance for Azure solutions.
2. Performance is a feature
Performance isn’t just about speed; it’s about user trust. Slow applications lose users, increase costs, and amplify failures under load. In this lesson, we focus on how Azure developers optimize applications for low latency and high traffic at scale.
3. Where performance problems come from
Most performance issues fall into a few categories: cold starts, network latency, slow dependencies like databases or APIs, and resource saturation under load. Understanding these categories helps you choose the right optimization technique.
4. Latency vs throughput
Latency is how long a single request takes. Throughput is how many requests you can handle at once. Optimizing for one doesn’t always optimize the other.
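One useful rule of thumb that connects the two is Little's Law: average in-flight requests = throughput × latency. The numbers in this small Python sketch are made up purely for illustration.

```python
# Little's Law: average in-flight requests = throughput * latency.
latency_s = 0.2        # each request takes 200 ms on average (example value)
throughput_rps = 500   # requests completed per second (example value)

concurrency = throughput_rps * latency_s
print(f"Requests in flight: {concurrency:.0f}")   # 100

# At the same concurrency, halving latency doubles achievable throughput:
print(f"Throughput at 100 in flight, 100 ms latency: {100 / 0.1:.0f} rps")
```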
5. Optimize the critical path
Most user-facing latency comes from the critical path, which is the slowest sequence of calls required to serve a request. Performance optimization starts by identifying and shortening this path, not by tuning everything equally.
6. Reducing dependency latency
Dependencies are the most common source of latency. Techniques include query optimization, connection pooling, batching requests, and reducing chatty calls. Dependency optimization often delivers bigger gains than scaling compute.
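For illustration, here is a minimal Python sketch of the "reduce chatty calls" idea. fetch_price and fetch_prices are hypothetical stand-ins that simulate a 50 ms network round trip; they are not part of any Azure SDK.

```python
import time

# Simulated dependency: each call costs one 50 ms round trip.
def fetch_price(item_id: str) -> float:
    time.sleep(0.05)
    return 9.99

def fetch_prices(item_ids: list[str]) -> dict[str, float]:
    time.sleep(0.05)  # one round trip covers the whole batch
    return {item_id: 9.99 for item_id in item_ids}

item_ids = [f"sku-{n}" for n in range(20)]

start = time.perf_counter()
chatty = {i: fetch_price(i) for i in item_ids}   # 20 round trips (~1 s)
print(f"chatty:  {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
batched = fetch_prices(item_ids)                 # 1 round trip (~50 ms)
print(f"batched: {time.perf_counter() - start:.2f}s")
```

The latency saved scales with the number of items, which is why collapsing chatty call patterns often beats adding compute.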
7. Caching
Caching removes repeated work. Instead of recalculating data, the app serves responses directly from memory. This reduces latency and protects downstream services during traffic spikes.
8. Azure Cache for Redis
Azure Cache for Redis is a fully managed, in-memory data store optimized for rapid access. It’s commonly used for session state, frequently accessed data, and computed results. Because it’s managed, you get high availability, scaling, and security without operating Redis yourself. Azure Cache for Redis is scheduled to retire in 2028 in favor of Azure Managed Redis.
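As a rough sketch, connecting from Python typically uses the open-source redis-py client. The hostname and access key below are placeholders for values from your own cache resource.

```python
import redis

cache = redis.Redis(
    host="my-cache.redis.cache.windows.net",  # placeholder hostname
    port=6380,                                # Azure exposes TLS on 6380
    ssl=True,
    password="<access-key>",                  # placeholder access key
    decode_responses=True,
)

cache.set("greeting", "hello", ex=60)  # entry expires after 60 seconds
print(cache.get("greeting"))           # "hello"
```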
9. Azure Managed Redis
Azure Managed Redis is the new, improved version of Azure Cache for Redis, designed for higher performance and scalability. Each node runs multiple Redis servers in parallel, making more efficient use of CPU resources. With more instances distributed across nodes, more requests can run concurrently without overloading a single machine. A high-performance proxy manages connections, routes requests, and enables self-healing, making Azure Managed Redis ideal for high-traffic, low-latency applications.
10. Cache design considerations
Caching introduces design decisions. You must define expiration policies and decide how much staleness is acceptable in exchange for speed. Common patterns include cache-aside, where the application loads data into the cache only when needed, and cache invalidation, which ensures outdated or changed data is refreshed or removed.
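Here is a minimal cache-aside sketch in Python using redis-py, with the same placeholder hostname and key as before. load_product_from_db is a hypothetical stand-in for the real database query, and the 300-second TTL is an arbitrary choice for the example.

```python
import json
import redis

cache = redis.Redis(host="my-cache.redis.cache.windows.net",  # placeholder
                    port=6380, ssl=True, password="<access-key>",
                    decode_responses=True)

def load_product_from_db(product_id: str) -> dict:
    # Hypothetical stand-in for the real database query.
    return {"id": product_id, "name": "example", "price": 9.99}

def get_product(product_id: str) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                # cache hit: skip the database

    product = load_product_from_db(product_id)   # cache miss: load the source
    cache.set(key, json.dumps(product), ex=300)  # expire after 5 minutes
    return product

def update_product(product: dict) -> None:
    # ...write to the database first (omitted)...
    # Invalidation: remove the stale entry so the next read reloads it.
    cache.delete(f"product:{product['id']}")
```

Deleting the key on every write is the simplest invalidation strategy; the TTL acts as a safety net for anything that slips through.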
11. Throttling and backpressure
Under extreme load, accepting every request can cause cascading failures. Throttling and backpressure protect your system by slowing or rejecting excess traffic gracefully, preserving overall availability.
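As one simple illustration of load shedding, the sketch below caps in-flight work with a semaphore and rejects the excess with a 429-style response. The limit of 100 is an arbitrary example value.

```python
import threading

MAX_IN_FLIGHT = 100                     # example value, tune per service
slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)

def handle_request(process):
    """Run `process` if capacity allows; otherwise shed load immediately."""
    # Non-blocking acquire: rejecting fast is the backpressure signal that
    # keeps queues from growing until the whole system degrades.
    if not slots.acquire(blocking=False):
        return 429, "Too Many Requests - retry later"
    try:
        return 200, process()
    finally:
        slots.release()

status, body = handle_request(lambda: "order placed")
print(status, body)   # 200 order placed
```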
12. Asynchronous and queue-based patterns
Offloading work to queues decouples user requests from long-running tasks. Services like Azure Service Bus or Storage Queues absorb traffic spikes and allow your system to process work at a controlled rate.
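For example, enqueueing work with the azure-servicebus Python SDK looks roughly like this. The connection string and queue name are placeholders for your own Service Bus namespace.

```python
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<service-bus-connection-string>"  # placeholder

with ServiceBusClient.from_connection_string(CONN_STR) as client:
    with client.get_queue_sender(queue_name="checkout-tasks") as sender:
        # The web request only enqueues the work and returns quickly;
        # a background worker drains the queue at its own pace.
        sender.send_messages(ServiceBusMessage('{"order_id": "1234"}'))
```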
13. The business problem
A retail company launches a flash sale promoted across social media. Within minutes, traffic triples. The application doesn’t crash, but pages become slow, checkout latency increases, and users start abandoning their carts. The business impact is immediate: lost revenue during peak demand.
14. Finding the bottleneck
Investigation shows that every product page request hits the database for the same catalog data. Under high traffic, the database becomes the bottleneck, increasing response times across the app. Scaling compute alone won’t fix the problem because the dependency remains slow.
15. Optimizing for scale and speed
The team adds Azure Cache for Redis to serve frequently accessed data in memory, while autoscale and background processing handle increased load. Latency drops, checkout stabilizes, and peak-time revenue is captured.
16. Let's practice!
Let's jump in and get hands-on!