
Load in the data

Reading in data is the first step in using PySpark for data science. This exercise uses Parquet, a columnar storage format that has become a common standard for large datasets.

This exercise is part of the course

Feature Engineering with PySpark


Instructions

  • Use the parquet() file reader to read in 'Real_Estate.parq' as described in the video exercise.
  • Print the list of column names using the DataFrame's columns attribute.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Read the file into a dataframe
df = spark.read.____(____)
# Print columns in dataframe
____(df.____)