Loading census data

Let's start by creating your first PySpark DataFrame! The file adult_reduced.csv groups adults by a variety of demographic categories. The data have been adapted from the US Census. There are 32562 groupings of adults in total.

Load the CSV file and inspect the resulting schema.

Data dictionary:

Variable        Description
age             Individual age
education_num   Education by degree
marital_status  Marital status
occupation      Occupation
income          Categorical income

This exercise is part of the course

Introduction to PySpark

Instructions

  • Create a PySpark DataFrame from the "adult_reduced.csv" file using the spark.read.csv() method.
  • Show the resulting DataFrame.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Read in the CSV
census_adult = ____.____.____(____)

# Show the DataFrame
census_adult.____