Infer and filter
Imagine you have a census dataset that you know has a header and a schema. Let's load that dataset and let PySpark infer the schema. What do you see if you filter on adults over 40?
Remember, there's already a SparkSession called spark in your workspace!
This exercise is part of the course
Introduction to PySpark
Exercise instructions
- Load the JSON file adults.json.
- Filter the data to include adults over the age of 40.
- Show the results.
Hands-on interactive exercise
Try to solve this exercise by completing the sample code.
# Load the dataframe
census_df = spark.read.json("adults.json")
# Filter rows based on age condition
salary_filtered_census = census_df.____(census_df[____]____)
# Show the result
____