Non-personalized suggestions

1. Non-personalized suggestions

While suggesting the highest-ranked items will generally return items that most people do not object to, this approach lacks any understanding of user tastes, or of which items are liked by the same people.

2. Identifying pairs

In this video we will work through a third and final type of non-personalized recommendation: suggesting the items that are most commonly seen together. To do this we will use the same dataset as before, with users and the books they read. We will record every time two books were read by the same person, and then count how often each pairing occurs. We can then use this lookup table to suggest books that are often read by the same people, on the assumption that if you like one, you are likely to enjoy the other.

3. Permutations versus combinations

We will be looking for all permutations of pairs, or in other words, counting item_a paired with item_b and item_b paired with item_a as two separate pairs. This will allow us to independently look up items commonly seen with item_a, or items commonly seen with item_b.
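To make the difference concrete, here is a small sketch using the itertools package. The three book titles are made up for illustration and are not taken from the course dataset.

```python
from itertools import combinations, permutations

# Hypothetical reading list for a single user (illustrative titles only).
books = ["Dune", "Emma", "Ulysses"]

# Permutations keep both orderings: (Dune, Emma) and (Emma, Dune).
ordered_pairs = list(permutations(books, 2))

# Combinations keep each pair once, regardless of order.
unordered_pairs = list(combinations(books, 2))

print(len(ordered_pairs), len(unordered_pairs))
```

With permutations, every book appears in the first position of a pair, which is what lets us look up any book on its own later.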

4. Creating the pairing function

We first need to create a function that finds all permutations of pairs of items in the list it is given. We will then apply this function to the set of books each user has read. Let's go through this code in steps.

5. Creating the pairing function

The permutations function from the itertools package takes a list as its first argument, in this case the list of books a user has read. Its second argument is the length of each permutation, which is two here because we care about which pairs of books are read together.

6. Creating the pairing function

We wrap this in a list because the permutations function returns an iterator, while we want the actual list.

7. Creating the pairing function

This list is then wrapped in a DataFrame for ease of use.
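Put together, the three steps might look like the sketch below. The function name create_pairs and the column name book_b are assumptions made for illustration; book_a matches the column referred to later in this video.

```python
from itertools import permutations

import pandas as pd


def create_pairs(books):
    """Return every ordered pair of the given books as a DataFrame."""
    # permutations() returns an iterator, so list() materializes the pairs.
    pairs = list(permutations(books, 2))
    # Column names are assumed for illustration.
    return pd.DataFrame(pairs, columns=["book_a", "book_b"])


# Hypothetical reading list for one user.
example = create_pairs(["Dune", "Emma", "Ulysses"])
print(example)
```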

8. Applying the function to the data

We can now apply this new function to our original DataFrame. DataFrame groupby objects have some built-in aggregation methods, for example .mean(), which you have used previously. Custom functions, however, are applied using the apply method. Here you can see groupby being called on the book_df DataFrame, with our custom function applied using the apply method. This returns the correct data, but the nested index created by the groupby makes it a little difficult to read.

9. Cleaning up the results

We will get rid of this index by using the reset_index method. As we no longer need any of the information about the user_ids we can set the drop parameter to True. If we didn't do this, the index would just be converted to a column that we do not want.
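As a rough sketch of the apply and clean-up steps, assuming book_df has columns named userId and book_title (hypothetical names, as is the tiny dataset):

```python
from itertools import permutations

import pandas as pd


def create_pairs(books):
    # Same idea as the pairing function described above; names are assumed.
    pairs = list(permutations(books, 2))
    return pd.DataFrame(pairs, columns=["book_a", "book_b"])


# Tiny made-up dataset: user 1 read three books, user 2 read two.
book_df = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2],
    "book_title": ["Dune", "Emma", "Ulysses", "Dune", "Emma"],
})

# Apply the custom function to each user's books.
# The result is indexed by (userId, row number within that user).
nested_pairs = book_df.groupby("userId")["book_title"].apply(create_pairs)

# drop=True discards the old index instead of turning it into columns.
book_pairs = nested_pairs.reset_index(drop=True)
print(book_pairs)
```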

10. Counting the pairings

Having all these individual pairs is a great start, but we want to know how often each pair occurs. We use a DataFrame groupby again, this time grouping by both columns, and call the size method on the resulting groups to find how often each combination occurs. We can then use the to_frame method to convert the result into a DataFrame for ease of use, resetting the index once more to clean up the groupby index.
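The counting step could be sketched as follows. The pairs below are made up, and the count column name pair_count is an assumption chosen for this example.

```python
import pandas as pd

# Made-up pairs, as if two users' reading lists had already been paired up.
book_pairs = pd.DataFrame({
    "book_a": ["Dune", "Dune", "Emma", "Emma", "Ulysses", "Ulysses", "Dune", "Emma"],
    "book_b": ["Emma", "Ulysses", "Dune", "Ulysses", "Dune", "Emma", "Emma", "Dune"],
})

# Group by both columns and count the rows in each group.
pair_counts = book_pairs.groupby(["book_a", "book_b"]).size()

# Convert the resulting Series to a DataFrame and flatten the index.
pair_counts_df = pair_counts.to_frame(name="pair_count").reset_index()
print(pair_counts_df)
```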

11. Looking up recommendations

Finally, we can sort the DataFrame by the pair counts so that the most frequently seen pairings come first. With our newly sorted, cleaned-up DataFrame, we can find the books most commonly paired with a given book by filtering the book_a column on the name of the book we are looking for!
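A minimal sketch of the sort and lookup, again with made-up counts and an assumed pair_count column name:

```python
import pandas as pd

# Made-up pair counts for illustration.
pair_counts_df = pd.DataFrame({
    "book_a": ["Dune", "Dune", "Emma", "Emma", "Ulysses", "Ulysses"],
    "book_b": ["Emma", "Ulysses", "Dune", "Ulysses", "Dune", "Emma"],
    "pair_count": [2, 1, 2, 1, 1, 1],
})

# Most frequent pairings first.
pair_counts_sorted = pair_counts_df.sort_values("pair_count", ascending=False)

# Books most often read by the same people who read "Dune".
dune_matches = pair_counts_sorted[pair_counts_sorted["book_a"] == "Dune"]
print(dune_matches)
```

Because the table is sorted before filtering, the first row of the filtered result is the strongest suggestion for that book.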

12. Let's practice!

Now it's your turn to try this out on the movie dataset you have been working with.