when() example
The when() clause lets you conditionally modify a DataFrame based on its content. In this exercise you'll modify the voter_df DataFrame to add a random number to any voting member whose title is "Councilmember".
The voter_df DataFrame is defined and available to you. The pyspark.sql.functions library is available as F.
You can use F.rand() to generate the random value.
This exercise is part of the course
Cleaning Data with PySpark
Exercise instructions
- Add a column to voter_df named random_val with the results of the F.rand() method for any voter with the title Councilmember.
- Show some of the DataFrame rows, noting whether the .when() clause worked.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Add a column to voter_df for any voter with the title Councilmember
voter_df = voter_df.____('random_val',
____(____, ____))
# Show some of the DataFrame rows, noting whether the when clause worked
voter_df.____