Normalize your data
Before you can find the factors of the ratings matrix using singular value decomposition, you will need to "de-mean", or center it, by subtracting each row's mean from each value in that row.
In this exercise, you will begin preparing the movie rating DataFrame you have been working with so that you can perform singular value decomposition on it. The DataFrame user_ratings_df contains one row per user and one column per movie, and has been loaded for you.
This exercise is part of the course Building Recommendation Engines in Python.
Exercise instructions
- Find the average rating each user has given across all the movies they have seen and store these values as avg_ratings.
- Subtract the row averages from their respective rows and store the result as user_ratings_centered.
- Fill in all missing values in user_ratings_centered with zeros.
- Print the average of each row in user_ratings_centered to show that the rows have been de-meaned.
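The steps above can be sketched on a small made-up table. This is only an illustration under stated assumptions: toy_ratings, toy_avg, and toy_centered are hypothetical names and the ratings are invented, so it is not the course's user_ratings_df. Note that the subtraction passes axis=0 so that the per-user means line up with the row index.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: rows are users, columns are movies, NaN = not rated
toy_ratings = pd.DataFrame(
    {
        "Movie A": [5.0, 2.0, np.nan],
        "Movie B": [3.0, np.nan, 4.0],
        "Movie C": [np.nan, 4.0, 2.0],
    },
    index=["User 1", "User 2", "User 3"],
)

# Mean rating per user (one value per row); NaN entries are skipped by default
toy_avg = toy_ratings.mean(axis=1)

# Subtract each user's mean from that user's row;
# axis=0 matches the Series index (users) against the DataFrame's row index
toy_centered = toy_ratings.sub(toy_avg, axis=0)

# Unrated movies become 0, which after centering means "this user's average"
toy_centered = toy_centered.fillna(0)

# Each row should now average to (approximately) zero
print(toy_centered.mean(axis=1))
```

In this toy table every row mean comes out as zero, because the filled-in zeros do not change the sum of the centered ratings.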
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Get the average rating for each user
avg_ratings = user_ratings_df.____(axis=1)
# Center each user's ratings around 0
# (axis=0 aligns the per-user means with the row index)
user_ratings_centered = user_ratings_df.____(____, axis=0)
# Fill in all missing values with 0s
user_ratings_centered.____(0, inplace=True)
# Print the mean of each row (one value per user)
print(user_ratings_centered.____(axis=1))
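The order of the last two steps matters, and a short sketch may help show why. The example below uses a hypothetical one-user table (one_user and the variable names are invented for illustration) and compares centering before filling against filling before centering.

```python
import numpy as np
import pandas as pd

# Hypothetical single user who has rated two of three movies
one_user = pd.DataFrame(
    {"Movie A": [5.0], "Movie B": [3.0], "Movie C": [np.nan]},
    index=["User 1"],
)

# Center first, then fill: the unrated movie sits at the user's average (0)
centered_then_filled = one_user.sub(one_user.mean(axis=1), axis=0).fillna(0)

# Fill first, then center: the unrated movie is treated as a real rating of 0,
# which drags the user's mean down and distorts every centered value
filled = one_user.fillna(0)
filled_then_centered = filled.sub(filled.mean(axis=1), axis=0)

print(centered_then_filled)
print(filled_then_centered)
```

With centering first, the unrated movie is neutral. With filling first, it looks like a strongly disliked movie, which is why the exercise fills missing values only after subtracting the row means.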