
Infer and filter

Imagine you have a census dataset that you know has a header and a schema. Let's load that dataset and let PySpark infer the schema. What do you see if you filter on adults over 40?

Remember, there's already a SparkSession called spark in your workspace!

This exercise is part of the course

Introduction to PySpark


Exercise instructions

  • Load the JSON file adults.json.
  • Filter the data to include adults over the age of 40.
  • Show the results.

Interactive exercise

Complete the sample code to finish this exercise.

# Load the dataframe
census_df = spark.read.json("adults.json")

# Filter rows based on age condition
salary_filtered_census = census_df.____(census_df[____]____)

# Show the result
____