Introduction

Fully leveraging the power of AI requires robust, scalable infrastructure, and this is where Google Kubernetes Engine (GKE) becomes a critical component. GKE provides a managed environment for deploying, managing, and scaling containerized applications, which makes it well suited to the demanding workloads of AI. This module explores how GKE facilitates the deployment and optimization of AI models, covering topics such as resource management, scaling strategies, and integration with Google Cloud's AI/ML services. In this section, titled GKE and AI, you'll learn to explain how GKE serves as a suitable platform for large language models and the increasing demand for hardware accelerators, describe the high-level architecture of a GKE-based training platform for AI models, outline the architecture for a GKE-based model serving platform, and outline different cost management strategies available when using GKE for AI/ML workloads.

This exercise is part of the course

Manage Scalable Workloads in GKE

Skill Level: Advanced

In this introduction, you'll explore the course goals and preview each section.

Exercise 1: Course introduction

In the first section of this course, titled “Introduction to GKE at Scale,” you’ll learn to recognize the challenges of designing and building multi-environment solutions, explain how GKE uses fleets to streamline operations, and describe the concepts of sameness and trust and use them to manage fleets. Finally, you'll be able to identify the features and components used to manage GKE fleets.

Exercise 1: Introduction
Exercise 2: Multi-cluster overview
Exercise 3: GKE Fleets
Exercise 4: Sameness and trust
Exercise 5: GKE Fleet Management
Exercise 6: Quiz Question 1
Exercise 7: Quiz Question 2
Exercise 8: Quiz Question 3
Exercise 9: Quiz Question 4
Exercise 10: Quiz Question 5

In this section of the course, titled “Multi-cluster GKE Architecture,” you’ll learn to recognize how GKE can be used to centralize cluster management, examine the architecture of multi-cluster GKE, and create, connect, and manage GKE fleets. You'll also learn how to securely access GKE fleet clusters.

Exercise 1: Introduction
Exercise 2: Centralized Cluster Management
Exercise 3: Multi-cluster GKE
Exercise 4: Connect and Manage Fleet Clusters
Exercise 5: Access GKE Fleet Clusters
Exercise 6: Quiz Question 1
Exercise 7: Quiz Question 2
Exercise 8: Quiz Question 3
Exercise 9: Quiz Question 4
Exercise 10: Quiz Question 5
Exercise 11: Quiz Question 6

This section is about fleets and teams in GKE. You’ll learn how to define GKE fleets, describe how GKE fleets can solve common cluster management problems, and manage fleets and teams in GKE. You'll also be able to detail the elements of fleet management.

Exercise 1: Introduction
Exercise 2: What Are GKE Fleets?
Exercise 3: Fleet Solutions
Exercise 4: Fleet Team Management
Exercise 5: Fleet Management
Exercise 6: Manage Workloads at Scale with GKE Fleets and Teams
Exercise 7: Quiz Question 1
Exercise 8: Quiz Question 2
Exercise 9: Quiz Question 3
Exercise 10: Quiz Question 4
Exercise 11: Quiz Question 5

In this section of the course, titled “Managing GKE configuration at Scale,” you’ll learn to recognize the challenges of scaling multi-cluster, multi-tenant configurations, configure centralized configuration management using a GitOps model, and describe the benefits and architecture of Config Sync. You'll also use Policy Controller to enforce security and compliance in GKE and extend GitOps principles to Google Cloud resources. Finally, you will learn how to create a standardized, reusable, and policy-driven foundation for Kubernetes deployments.

Exercise 1: Introduction
Exercise 2: Configuration management challenges
Exercise 3: Centralized configuration at scale
Exercise 4: Config Sync
Exercise 5: Policy Controller
Exercise 6: Config Connector
Exercise 7: Blueprints
Exercise 8: Automate GKE Configurations with Config Sync
Exercise 9: Quiz Question 1
Exercise 10: Quiz Question 2
Exercise 11: Quiz Question 3
Exercise 12: Quiz Question 4
Exercise 13: Quiz Question 5
Exercise 14: Quiz Question 6

In this section of the course, you’ll learn to explain how fleet networking works, describe how Pods in a Kubernetes cluster communicate with each other, and enable and configure multi-cluster Services. You'll also detail the elements of fleet management, outline the role of a multi-cluster gateway, and learn how to configure a multi-cluster gateway.

Exercise 1: Introduction
Exercise 2: Fleet networking communications
Exercise 3: Pods Discovery in GKE
Exercise 4: Multi-cluster Services
Exercise 5: Configuring multi-cluster Services
Exercise 6: Multi-cluster gateway
Exercise 7: Configuring multi-cluster gateways
Exercise 8: Deploying a Multi-Cluster Gateway Across GKE Clusters
Exercise 9: Quiz Question 1
Exercise 10: Quiz Question 2
Exercise 11: Quiz Question 3
Exercise 12: Quiz Question 4
Exercise 13: Quiz Question 5
Exercise 14: Quiz Question 6
Exercise 15: Quiz Question 7

In this section, titled “Cloud Service Mesh,” you'll learn to list and describe the benefits of using Cloud Service Mesh, and to install and configure Cloud Service Mesh on different clusters. You'll trace the path of a request through the mesh, correctly identifying and explaining the role of key components like Envoy proxies, Mesh CA, and extensions in handling the request. Finally, you'll learn to create Service Mesh dashboards from workload telemetry, including metrics, traces, and logs.

Exercise 1: Introduction
Exercise 2: Introduction to Cloud Service Mesh
Exercise 3: Provisioning Cloud Service Mesh
Exercise 4: Handling requests with Cloud Service Mesh
Exercise 5: Cloud Service Mesh Dashboards and Support
Exercise 6: Installing Cloud Service Mesh on Google Kubernetes Engine
Exercise 7: Quiz Question 1
Exercise 8: Quiz Question 2
Exercise 9: Quiz Question 3
Exercise 10: Quiz Question 4
Exercise 11: Quiz Question 5
Exercise 12: Quiz Question 6
Exercise 13: Quiz Question 7
Exercise 14: Quiz Question 8

In this section, titled “Cloud Service Mesh routing,” you'll learn to explain how Cloud Service Mesh learns the network from Kubernetes, and to deploy mesh API resources such as VirtualService, DestinationRule, Gateway, ServiceEntry, and Sidecar to configure the mesh. You'll describe how to harden the mesh network by introducing functionality such as request retries, request timeouts, and circuit breakers. You'll also learn how to explore Service Mesh resilience by injecting failures and delays on specific services.

Exercise 1: Introduction
Exercise 2: Configuring Cloud Service Mesh with Istio API
Exercise 3: Configuring a VirtualService and DestinationRule
Exercise 4: Configuring a ServiceEntry
Exercise 5: Configuring a Gateway
Exercise 6: Configuring a WorkloadEntry and WorkloadGroup
Exercise 7: Network resilience and testing
Exercise 8: Managing Traffic Flow with CSM
Exercise 9: Quiz Question 1
Exercise 10: Quiz Question 2
Exercise 11: Quiz Question 3
Exercise 12: Quiz Question 4
Exercise 13: Quiz Question 5

In this section, you'll learn how to encrypt traffic between microservices to prevent anyone in the network from gaining access to private information. You'll authorize services and requests, ensuring that each service accesses only the information it is permitted to access from other services, and authenticate services and requests to verify trust among services in the mesh and among end users. You will also see how to limit service access in the network so that granular controls over communication can be established.

Exercise 1: Introduction
Exercise 2: Authentication and encryption
Exercise 3: Service authentication in the mesh
Exercise 4: End-user authentication in Cloud Service Mesh
Exercise 5: Authorization in Cloud Service Mesh
Exercise 6: Secure Cloud Service Mesh with Policy Controller and mTLS
Exercise 7: Quiz Question 1
Exercise 8: Quiz Question 2
Exercise 9: Quiz Question 3
Exercise 10: Quiz Question 4
Exercise 11: Quiz Question 5

In this section, you learn to set up a multi-cluster mesh with a single subnet in a single VPC network and account for variations like multi-region clusters, multiple projects, shared VPC, and private clusters. You'll also learn to enable communication between GKE clusters on different networks using an east-west gateway and attached clusters.

Exercise 1: Introduction
Exercise 2: Single network east-west routing
Exercise 3: Multiple network east-west routing
Exercise 4: Manage and Secure Distributed Services with GKE Managed Service Mesh
Exercise 5: Quiz Question 1
Exercise 6: Quiz Question 2
Exercise 7: Quiz Question 3
Exercise 8: Quiz Question 4
Exercise 9: Quiz Question 5

In this section, you'll learn to summarize the differences between authentication methods for GKE clusters and explain when to use each. You'll summarize the key features of connect gateway, explain how it simplifies and secures connections to GKE Enterprise fleet member clusters, and configure connect gateway for authentication and authorization. You'll also securely access clusters and provide authentication using OpenID Connect (OIDC) and third-party identity providers (IdPs). Finally, given a GKE cluster and a third-party IdP, you'll learn to configure GKE Identity Service to enable authentication and authorization for users.

Exercise 1: Introduction
Exercise 2: Introduction to GKE Identity Service
Exercise 3: Connect gateway
Exercise 4: Configuring connect gateway for authentication and authorization
Exercise 5: Accessing clusters with GKE Identity Service
Exercise 6: Authenticating third-party identities with GKE Identity Service
Exercise 7: Fleet Workload Identity
Exercise 8: Manage authentication at scale with Connect Gateway
Exercise 9: Quiz Question 1
Exercise 10: Quiz Question 2
Exercise 11: Quiz Question 3
Exercise 12: Quiz Question 4
Exercise 13: Quiz Question 5
Exercise 14: Quiz Question 6

In this section, you'll learn to describe GKE security posture, navigate and interpret the GKE security posture dashboard to identify security issues, and implement node security measures to protect GKE worker nodes from potential threats. You'll also describe the process of vulnerability scanning in GKE. Finally, you will be able to explain the roles and capabilities of Google Cloud's Artifact Analysis and Security Command Center in enhancing GKE security.

Exercise 1: Introduction
Exercise 2: GKE security posture overview
Exercise 3: Security posture dashboard
Exercise 4: Implementing node security
Exercise 5: Vulnerability scanning
Exercise 6: Additional security services
Exercise 7: Quiz Question 1
Exercise 8: Quiz Question 2
Exercise 9: Quiz Question 3
Exercise 10: Quiz Question 4
Exercise 11: Quiz Question 5
Exercise 12: Quiz Question 6
Exercise 13: Quiz Question 7

In this section, you'll learn to describe the core components of Google Cloud's CI/CD pipeline and how they address common challenges in application modernization. You'll analyze how Cloud Deploy integrates with GKE to manage Kubernetes manifests and control deployments, compare and contrast the deployment strategies for Knative Serving within GKE Enterprise, and explain the steps required to establish a peered VPC connection for secure CI/CD in a private network. You will also learn how to evaluate the various security measures and tools available within Google Cloud for securing the software supply chain.

Exercise 1: Introduction
Exercise 2: CI/CD in Google Cloud
Exercise 3: Cloud Build and GKE
Exercise 4: Cloud Deploy and GKE
Exercise 5: Cloud Deploy: Policies, Deployments and Security
Exercise 6: Cloud Run and Knative Serving
Exercise 7: Cloud Deploy and Knative Serving
Exercise 8: CI/CD in a private network
Exercise 9: Securing the software supply chain
Exercise 10: Creating CI/CD pipelines for GKE clusters
Exercise 11: Quiz Question 1
Exercise 12: Quiz Question 2
Exercise 13: Quiz Question 3
Exercise 14: Quiz Question 4
Exercise 15: Quiz Question 5

In this section, you'll learn to explain how GKE serves as a suitable platform for large language models and the increasing demand for hardware accelerators, describe the high-level architecture of a GKE-based training platform for AI models, and outline the architecture for a GKE-based model serving platform. You will also learn to outline different cost management strategies available when using GKE for AI/ML workloads.

Exercise 1: Introduction
Exercise 2: AI and GKE overview
Exercise 3: AI model training on GKE
Exercise 4: AI model serving on GKE
Exercise 5: AI cost management on GKE
Exercise 6: Quiz Question 1
Exercise 7: Quiz Question 2
Exercise 8: Quiz Question 3
Exercise 9: Quiz Question 4
Exercise 10: Quiz Question 5

Student PDF links to all modules

Exercise 1: Module 0: Introduction
Exercise 2: Module 1: Introduction to GKE at Scale
Exercise 3: Module 2: Multi-cluster GKE Architecture
Exercise 4: Module 3: Fleets and Teams
Exercise 5: Module 4: Managing GKE configuration at Scale
Exercise 6: Module 5: Fleet Networking
Exercise 7: Module 6: Cloud Service Mesh
Exercise 8: Module 7: Cloud Service Mesh routing
Exercise 9: Module 8: Service mesh security
Exercise 10: Module 9: Multi-cluster Networking with Cloud Service Mesh
Exercise 11: Module 10: Manage Identity in GKE with Authentication
Exercise 12: Module 11: Security Posture, Compliance, and Preventative Controls
Exercise 13: Module 12: CI/CD at scale in GKE
Exercise 14: Module 13: GKE and AI
Exercise 15: Module 14: Course Summary