Deploy an ETL pipeline on Kubernetes

You will deploy an ETL pipeline on Kubernetes. The Extract, Transform, and Load steps each run in a Pod, and the Pods read from and write to Persistent Volumes requested through Persistent Volume Claims.
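As a rough illustration of how a Pod and a Persistent Volume Claim relate, the sketch below builds two minimal manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` also accepts. The names `extract-pvc`, `extract-pod`, `etl-extract:latest`, the mount path `/data`, and the 1Gi size are hypothetical placeholders, not the values used in the prepared "Manifests/" directory.

```python
import json

# Hypothetical claim: asks the cluster for 1Gi of storage for one node at a time.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "extract-pvc"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# Hypothetical Pod: mounts the claimed volume at /data inside its container.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "extract-pod"},
    "spec": {
        "restartPolicy": "Never",  # a one-shot ETL step should not be restarted
        "containers": [
            {
                "name": "extract",
                "image": "etl-extract:latest",  # assumed image name
                "volumeMounts": [{"name": "etl-data", "mountPath": "/data"}],
            }
        ],
        "volumes": [
            {
                "name": "etl-data",
                # The Pod refers to the claim by name, not to the volume itself.
                "persistentVolumeClaim": {"claimName": pvc["metadata"]["name"]},
            }
        ],
    },
}

manifests = json.dumps([pvc, pod], indent=2)
print(manifests)
```

The point of the sketch is the indirection: the Pod names a claim, and the claim is what gets bound to an actual Persistent Volume.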

Your task is to find the total number of passengers who took a NYC yellow cab in a group of two or more. Your "Extract Pod" will prepare the initial data as a CSV file and hand it over to the "Transform Pod". That Pod will load the yellow cab data into an SQLite database, select the data needed for the final computation, and hand it over to the "Load Pod". This final Pod will sum the passenger counts, present the result, and save it as a CSV file.
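The three steps described above can be sketched in a single Python script. This is only an illustration under stated assumptions: the sample rows, the column names `passenger_count` and `trip_distance`, and the output header `total_group_passengers` are made up for the example, and everything runs in one process with an in-memory database, whereas in the exercise each step runs in its own Pod and passes files through volumes.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real exercise data and its columns may differ.
SAMPLE_CSV = """passenger_count,trip_distance
1,2.5
2,1.1
3,4.0
1,0.7
4,3.2
"""


def extract(text):
    """Extract: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, conn):
    """Transform: load rows into SQLite and select rides with 2+ passengers."""
    conn.execute("CREATE TABLE trips (passenger_count INTEGER, trip_distance REAL)")
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?)",
        [(int(r["passenger_count"]), float(r["trip_distance"])) for r in rows],
    )
    cur = conn.execute("SELECT passenger_count FROM trips WHERE passenger_count >= 2")
    return [row[0] for row in cur]


def load(counts):
    """Load: sum the passenger counts and render the result as CSV text."""
    total = sum(counts)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["total_group_passengers"])
    writer.writerow([total])
    return total, out.getvalue()


conn = sqlite3.connect(":memory:")
total, report = load(transform(extract(SAMPLE_CSV), conn))
conn.close()
print(total)
```

With the sample rows, only the rides with 2, 3, and 4 passengers pass the filter, so the printed total is 9.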

All of these steps use the standard Kubernetes objects that you already know. Two directories have been prepared, "Docker/" and "Manifests/", which hold the files needed to build the Docker images and deploy them on Kubernetes.

This exercise is part of the course

Introduction to Kubernetes
