Exercise

Matrix sparsity

A common challenge with real-world ratings data is that most users will not have rated most items, and most items will have been rated by only a small number of users. This results in a mostly empty, or sparse, DataFrame.
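To make the idea concrete, here is a minimal sketch of how a user-by-movie table ends up mostly empty. The `ratings` data and the name `toy_ratings_df` are made up for illustration and are not part of the movie_lens data used in the exercise.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, movie) rating.
ratings = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "title": ["Toy Story", "Heat", "Heat", "Casino"],
    "rating": [4.0, 5.0, 3.0, 2.0],
})

# Reshape to one row per user and one column per movie.
# Every (user, movie) pair that has no rating becomes NaN.
toy_ratings_df = ratings.pivot(index="userId", columns="title", values="rating")
print(toy_ratings_df)
```

Even in this tiny table, more than half of the cells are NaN, because each user has rated only one or two of the three movies.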

In this exercise, you will calculate how sparse the movie_lens ratings data is by counting the number of occupied cells and comparing it to the size of the full DataFrame. The DataFrame user_ratings_df, which you have used in previous exercises and which contains a row per user and a column per movie, has been loaded for you.

Instructions
  • Count the number of non-empty cells in user_ratings_df and store the result as sparsity_count.
  • Count the total number of cells in the user_ratings_df DataFrame and store it as full_count.
  • Calculate the sparsity of the DataFrame by dividing the number of non-empty cells by the total number of cells and print the result.
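
The three steps above can be sketched as follows. This is one possible approach, not necessarily the course's reference solution. The small `user_ratings_df` built here is a hypothetical stand-in so the snippet runs on its own; in the exercise the real DataFrame is already loaded. It assumes that unrated cells are stored as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for user_ratings_df: rows are users, columns are movies.
user_ratings_df = pd.DataFrame(
    {
        "Movie A": [4.0, np.nan, np.nan, 5.0],
        "Movie B": [np.nan, 3.0, np.nan, np.nan],
        "Movie C": [np.nan, np.nan, np.nan, 1.0],
    },
    index=["User 1", "User 2", "User 3", "User 4"],
)

# Non-empty cells: notnull() marks each filled cell as True, and summing
# the underlying boolean array counts the True values across the whole table.
sparsity_count = user_ratings_df.notnull().values.sum()

# Total cells: .size is the number of rows times the number of columns.
full_count = user_ratings_df.size

# Fraction of cells that hold a rating.
sparsity = sparsity_count / full_count
print(sparsity)
```

With this definition, the result is the share of cells that are filled, so a value close to 0 means the DataFrame is almost entirely empty.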