Updating Deployments

1. Updating Deployments

Let’s say you recently updated a container image with new and exciting features. You want your Pods to benefit from the improvements, so you need to update the Pod specification with the new version of the image. This is an example of a Deployment update. Deployment updates are common, and they do not affect the availability of other applications and services in the cluster.

There are a few different ways to update a Deployment. The first option is to use the “kubectl apply” command with an updated Deployment specification YAML file. This command lets you update Deployment specifications, such as the number of replicas, outside the Pod template. The next option is the “kubectl set” command, which lets you make changes to Pod template specifications, like the image, resources, or selector values. Next is the “kubectl edit” command, which opens the specification in a text editor, such as Vim, an open-source, screen-based editor, so that you can change it directly. After you save your changes and exit, kubectl automatically applies the updates. And finally, you can also update Deployments by using the Google Cloud console.

Now, imagine updating your app without anyone noticing. That’s the magic of a rolling update, also known as a “ramped strategy.” When a Deployment is updated, a new set of Pods is launched in a new ReplicaSet. Then, as the new Pods start running smoothly, the old Pods in the outdated ReplicaSet are gracefully retired. GKE replaces the Pods gradually rather than all at once, so some Pods are always available to serve traffic and there is no downtime or disruption.

So, how is a rolling update configured? There are two primary parameters used to control the speed of rolling updates: “maxSurge” and “maxUnavailable.” “maxSurge” specifies the maximum number of extra Pods that can be created above the desired number of Pods during the update.
“maxUnavailable,” in contrast, specifies the maximum number of Pods that can be unavailable at the same time during the update. Both values can be either absolute numbers or percentages of the desired number of Pods. For example, if you set maxSurge to 1 and maxUnavailable to 0, Kubernetes will update one Pod at a time, without any downtime.

To explain this further, let’s work through an example of a rolling update. A Deployment has a desired number of Pods set to 10, with the maxUnavailable parameter set to 10% and the maxSurge parameter set to 5. The old ReplicaSet has 10 Pods. The Deployment begins by creating 5 new Pods in a new ReplicaSet, based on the maxSurge parameter. When those new Pods are ready, the total number of Pods is 15. Since maxUnavailable is set to 10%, the minimum number of Pods that must keep running, regardless of whether they are from the old or new ReplicaSet, is 10 minus 10%, which is 9 Pods. Therefore, 6 of the 15 Pods can be removed from the old ReplicaSet, leaving the minimum of 9: 5 in the new ReplicaSet and 4 in the old ReplicaSet. Next, an additional 5 Pods are launched in the new ReplicaSet, for a total of 10 Pods in the new ReplicaSet and 14 across both ReplicaSets. Finally, the remaining 4 Pods in the old ReplicaSet are deleted. This leaves 10 Pods in the new ReplicaSet, and the old ReplicaSet is retained for rollback even though it is empty.

The amount of cluster resources in use, like CPU and memory, changes as rolling updates occur. To control these resources, you can use requests and limits. Requests and limits can be set manually, or you can have a Pod set them for you. Requests determine the minimum amount of CPU and memory that a container will be allocated on a node. The scheduler will only assign a Pod to a node if that node has enough available resources to meet the requests of the Pod's containers. If you specify a CPU or memory request that is larger than the available resources on any of your nodes, your Pod will never be scheduled.
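
Returning to the rolling update example, the two bounds it relies on (at most 15 Pods in total, at least 9 Pods running) can be checked with a few lines of Python. This is only a sketch of the arithmetic, not how the Deployment controller is implemented. The function names `resolve` and `rollout_bounds` are made up for illustration, and the sketch assumes the usual Kubernetes rounding rules, where a maxSurge percentage rounds up and a maxUnavailable percentage rounds down.

```python
def resolve(value, desired, round_up):
    """Turn an absolute count (5) or a percentage string ("10%") into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        product = desired * int(value[:-1])
        # Integer arithmetic avoids floating-point surprises.
        # Assumption: maxSurge rounds up, maxUnavailable rounds down.
        return -(-product // 100) if round_up else product // 100
    return int(value)


def rollout_bounds(desired, max_surge, max_unavailable):
    """Return (maximum total Pods, minimum available Pods) during a rollout."""
    surge = resolve(max_surge, desired, round_up=True)
    unavailable = resolve(max_unavailable, desired, round_up=False)
    return desired + surge, desired - unavailable


# The example from the transcript: 10 Pods, maxSurge 5, maxUnavailable 10%.
max_total, min_available = rollout_bounds(10, 5, "10%")
print(max_total, min_available)  # 15 9
```

With maxSurge set to 1 and maxUnavailable set to 0, the same function gives a maximum of 11 Pods and a minimum of 10, which matches the "one Pod at a time, without any downtime" case.
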
Limits set the upper bound on how much CPU and memory a container can use on a node. This prevents one container from consuming excessive resources and affecting other containers or critical node processes. Please note that the limit cannot be lower than the request.

Requests and limits are set for each individual container in the Pod. Pods, however, are scheduled as a group, so you still need to calculate the total resource requests and limits by adding up the values from each individual container within the Pod.

To manage the requests and limits as they come into a cluster, Kubernetes uses the kube-scheduler to decide which node to place the Pod on. If no suitable node can be found, the Pod goes back to the kube-scheduler to try again. Once a suitable node is found, the kubelet on that node enforces the resource limits and ensures that the running container is not allowed to use more of a resource than its limit. The kubelet is also responsible for reserving the requested amounts for containers.

CPU resources are measured in millicores. For example, a container that needs two full cores to run would be measured as 2000m (millicores). A container that needs one quarter of a core would be measured as 250m.
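
To make the millicore notation and the per-Pod totals concrete, here is a minimal Python sketch. It is not part of Kubernetes or any client library. The helper names `cpu_to_millicores` and `pod_cpu_totals`, the container names, and the CPU values are all hypothetical, and the parser only handles plain core counts ("2", "0.25") and the "m" suffix.

```python
def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "250m", "2", or "0.25" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # One full core is 1000 millicores; round to avoid float truncation.
    return int(round(float(quantity) * 1000))


def pod_cpu_totals(containers):
    """Add up the CPU requests and limits of every container in a Pod."""
    total_request = 0
    total_limit = 0
    for container in containers:
        request = cpu_to_millicores(container["request"])
        limit = cpu_to_millicores(container["limit"])
        # A limit may not be lower than the request.
        if limit < request:
            raise ValueError(f"{container['name']}: limit is below request")
        total_request += request
        total_limit += limit
    return total_request, total_limit


# Hypothetical Pod with two containers.
containers = [
    {"name": "app", "request": "250m", "limit": "500m"},
    {"name": "sidecar", "request": "0.1", "limit": "200m"},
]

print(cpu_to_millicores("2"))      # 2000
print(cpu_to_millicores("0.25"))   # 250
print(pod_cpu_totals(containers))  # (350, 700)
```

The totals are what matter for placement: in this sketch the Pod as a whole requests 350m of CPU, so it only fits on a node with at least that much CPU still available.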

2. Let's practice!
