Exercise

SQL queries for filtering a table

In the previous exercise, you ran a simple SQL query on a DataFrame. You can construct more sophisticated queries to obtain the result you want and use it for downstream analysis such as data visualization and machine learning. In this exercise, you will use the temporary table people that you created previously, select the rows where "sex" is female and the rows where "sex" is male, and create two DataFrames, one for each.

Remember, you already have a SparkSession spark and a temporary table people available in your workspace.

Instructions

  • Filter the people table to select all rows where sex is female into the people_female_df DataFrame.
  • Filter the people table to select all rows where sex is male into the people_male_df DataFrame.
  • Count the number of rows in both the people_female_df and people_male_df DataFrames.