Spark, Hadoop and Hive

You encountered quite a few open-source projects in the previous video: Hadoop, Hive, and PySpark. It's easy to confuse them.

They have a few things in common: all are currently maintained by the Apache Software Foundation, and all have been used for massively parallel processing. Can you spot the differences?

This exercise is part of the course Introduction to Data Engineering.
