Column operations - creating and renaming columns
The census dataset still doesn't show everything you want it to. Let's derive a new column from an existing one, and rename another column for clarity.
Remember, there's already a SparkSession called spark in your workspace!
This exercise is part of the course
Introduction to PySpark
Exercise instructions
- Create a new column, "weekly_salary", by dividing the "income" column by 52.
- Rename the "age" column to "years".
- Show the resulting DataFrame.
Interactive exercise
Complete the sample code to finish this exercise successfully.
# Create a new column 'weekly_salary'
census_df_weekly = census_df.____(____, ____)
# Rename the 'age' column to 'years'
census_df_weekly = ____.____(____, ____)
# Show the result
census_df_weekly.____