Find similar users

You're now going to build upon what you've learned so far to write a function called most_similar_users() that finds the users most similar to another given user.

The beginnings of this function have been written for you. Inside the function, user_nodes is converted to a set and the given user is removed from it, so it contains all of the users except the one passed into the function. Your task is to complete the function so that it finds the users most similar to this given user. You'll use your user_similarity() function from the previous exercise to do this.

A dictionary called similarities has been set up, in which the keys are the scores and the values are lists of nodes. If you've never seen a defaultdict before, don't worry - you'll learn more about it in Chapter 3! It behaves like a regular Python dictionary, except that looking up a missing key creates a default value (here, an empty list) instead of raising a KeyError.
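As a minimal sketch of how a defaultdict(list) groups nodes under their scores, the snippet below uses made-up scores and user names (the pairs list is an assumption for illustration, not course data):

```python
from collections import defaultdict

# Hypothetical (score, node) pairs, for illustration only
pairs = [(0.5, 'u1'), (0.25, 'u2'), (0.5, 'u3')]

similarities = defaultdict(list)
for score, node in pairs:
    # A missing key is created with an empty list on first access,
    # so append() works without checking whether the key exists
    similarities[score].append(node)

# Convert to a plain dict to inspect the grouping
grouped = dict(similarities)
```

Nodes that share a score end up in the same list, which is what makes it possible to return several equally similar users at once.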

This exercise is part of the course

Intermediate Network Analysis in Python

Exercise instructions

Iterate over user_nodes and compute the similarity between user and each user_node (n) using your user_similarity() function. Store the result as similarity.
Append the node to the list stored under its score in the similarities dictionary. The key is the score (similarity) and the value appended is the node (n).
Compute the maximum similarity score. To do this, first access the keys (which contain the scores) of similarities using the .keys() method and then use the max() function. Store the result as max_similarity.
Return the list of users that share maximal similarity. This list of users is the value of the max_similarity key of similarities.
Use your most_similar_users() function to print the list of users most similar to the user 'u4560'.
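The pattern described in these steps (score every other user, group users by score, then return the group under the highest score) can be sketched on toy data without a graph. Everything here is hypothetical: toy_projects, all_projects, toy_similarity() and most_similar() are stand-ins invented for illustration, and toy_similarity() is only an assumed scoring rule, not necessarily how your user_similarity() is defined.

```python
from collections import defaultdict

# Hypothetical data: each user mapped to the set of projects they work on
toy_projects = {
    'a': {'p1', 'p2'},
    'b': {'p1', 'p2', 'p3'},
    'c': {'p3'},
    'd': {'p1', 'p2'},
}
all_projects = {'p1', 'p2', 'p3', 'p4'}

def toy_similarity(user1, user2):
    # Assumed score: shared projects as a fraction of all projects
    shared = toy_projects[user1] & toy_projects[user2]
    return len(shared) / len(all_projects)

def most_similar(user):
    # All users except the given one
    others = set(toy_projects) - {user}

    # Group the other users by their similarity score
    scores = defaultdict(list)
    for n in sorted(others):  # sorted only to make the output order stable
        scores[toy_similarity(user, n)].append(n)

    # The highest key is the best score; its value is the list of tied users
    best = max(scores.keys())
    return scores[best]

result = most_similar('a')
```

Because the dictionary is keyed by score, ties need no special handling: every user with the top score is already collected in one list.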

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

from collections import defaultdict

def most_similar_users(G, user, user_nodes, proj_nodes):
    # Data checks
    assert G.nodes[user]['bipartite'] == 'users'

    # Get other nodes from user partition
    user_nodes = set(user_nodes)
    user_nodes.remove(user)

    # Create the dictionary: similarities
    similarities = defaultdict(list)
    for n in ____:
        similarity = ____(____, ____, ____, ____)
        ____[____].____

    # Compute maximum similarity score: max_similarity
    max_similarity = ____

    # Return list of users that share maximal similarity
    return ____[____]

user_nodes = get_nodes_from_partition(G, 'users')
project_nodes = get_nodes_from_partition(G, 'projects')

print(____)
