Stepping through K-nearest neighbors
You have just seen how K-nearest neighbors can be used to infer how someone might rate an item based on the wisdom of a (similar) crowd. In this exercise, you will step through this process yourself to ensure a good understanding of how it works.
To get you started, the similarity step has been done for you, since you have generated similarity matrices many times before. The user similarity matrix is wrapped in a DataFrame and loaded as user_similarities. Each user appears as both a row and a column, and the cell where two users meet holds their similarity score.
In this exercise, you will take user_001's similarity scores, find their nearest neighbors, and, based on the ratings those neighbors gave a movie, infer what rating user_001 might give it if they saw it.
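To make the shape of such a matrix concrete, here is a minimal sketch that builds one from a tiny, made-up ratings table. The toy data and the names toy_ratings and toy_similarities are hypothetical, and cosine similarity is an assumption: the exercise does not say which measure was used to create user_similarities.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings (assumed already centered, missing values filled with 0)
toy_ratings = pd.DataFrame(
    [[1.0, -1.0, 0.0],
     [0.5, -0.5, 0.0],
     [-1.0, 1.0, 0.5]],
    index=["user_001", "user_002", "user_003"],
    columns=["Movie A", "Movie B", "Movie C"],
)

# Cosine similarity (an assumption): scale each row to unit length,
# then take the dot product of every pair of rows
values = toy_ratings.to_numpy()
unit_rows = values / np.linalg.norm(values, axis=1, keepdims=True)
toy_similarities = pd.DataFrame(
    unit_rows @ unit_rows.T,
    index=toy_ratings.index,
    columns=toy_ratings.index,
)

# Users label both the rows and the columns
print(toy_similarities.round(2))

# The cell where two users meet is their similarity score
print(toy_similarities.loc["user_001", "user_002"])
```

Each user is perfectly similar to themselves, so the diagonal is 1. This matters later when picking neighbors from a sorted row.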
This exercise is part of the course Building Recommendation Engines in Python.
Exercise instructions
- Find the IDs of user_001's 10 nearest neighbors by extracting the top 10 users in ordered_similarities and storing them as nearest_neighbors.
- Extract the ratings the users in nearest_neighbors gave from user_ratings_table as neighbor_ratings.
- Calculate the average rating these users gave to the movie Apollo 13 (1995) to infer what user_001 might give it if they had seen it.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Isolate the similarity scores for user_001 and sort
user_similarity_series = user_similarities.loc['user_001']
ordered_similarities = user_similarity_series.sort_values(ascending=False)
# Find the top 10 most similar users (the slice starts at 1 to skip user_001 itself)
nearest_neighbors = ordered_similarities[1:11].____
# Extract the ratings of the neighbors
neighbor_ratings = user_ratings_table.____(nearest_neighbors)
# Calculate the mean rating given by the user's nearest neighbors
print(neighbor_ratings['Apollo 13 (1995)'].____())
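If you want to check your reasoning away from the course data, the same three steps can be sketched on a small, made-up example. Everything here is hypothetical: the four users, their similarity scores, their ratings, and the choice of k = 2 instead of 10. The calls used (.index, .reindex, .mean) are one plausible way to carry out the steps, not necessarily the exercise's expected solution.

```python
import pandas as pd

users = ["user_001", "user_002", "user_003", "user_004"]

# Hypothetical similarity matrix: users on both rows and columns
toy_similarities = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.7],
     [0.9, 1.0, 0.1, 0.6],
     [0.2, 0.1, 1.0, 0.3],
     [0.7, 0.6, 0.3, 1.0]],
    index=users,
    columns=users,
)

# Hypothetical ratings table: user_001 has not rated the movie
toy_ratings_table = pd.DataFrame(
    {"Apollo 13 (1995)": [float("nan"), 4.0, 1.0, 5.0]},
    index=users,
)

k = 2  # number of neighbors (the exercise uses 10)

# Step 1: sort user_001's similarities, highest first
ordered = toy_similarities.loc["user_001"].sort_values(ascending=False)

# Position 0 is user_001 itself (similarity 1.0), so take positions 1..k
nearest = ordered.iloc[1:k + 1].index

# Step 2: keep only the neighbors' rows of the ratings table
neighbor_ratings = toy_ratings_table.reindex(nearest)

# Step 3: average the neighbors' ratings for the movie
predicted = neighbor_ratings["Apollo 13 (1995)"].mean()

print(list(nearest))  # ['user_002', 'user_004']
print(predicted)      # 4.5
```

Note that pandas' mean() skips missing values by default, so a neighbor who has not rated the movie simply does not contribute to the average.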