Find shared membership: Transposition

As you may have observed, you lose the node metadata from a graph when you convert it to a sparse matrix representation. You're now going to learn how to map matrix indices back to that metadata so that you can learn more about shared membership.

The user_matrix you computed in the previous exercise has been preloaded into your workspace.

Here, the np.where() function will prove useful. Given a NumPy array, say a = np.array([1, 5, 9, 5]), you can get the indices where the value equals 5 with idxs = np.where(a == 5). (a must be a NumPy array: with a plain Python list, a == 5 evaluates to a single False rather than an element-wise comparison.) The result is a tuple containing one array, (array([1, 3]),). To access the indices themselves, index into the tuple with idxs[0].
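The np.where() behaviour described above can be checked with a minimal sketch. The variable name positions is introduced here for illustration only and is not part of the exercise.

```python
import numpy as np

# The input must be a NumPy array so that `a == 5` is an element-wise comparison.
a = np.array([1, 5, 9, 5])

# np.where() with a single condition returns a tuple holding one index array
# per dimension; for a 1-D array that is a 1-tuple.
idxs = np.where(a == 5)

# Index into the tuple to get the actual positions.
positions = idxs[0]
print(positions)
```

Because the result is always a tuple, forgetting the trailing [0] is a common source of confusing errors when the indices are later used in a loop.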

This exercise is part of the course

Intermediate Network Analysis in Python

Exercise instructions

  • Find out the names of the people who were members of the most clubs.
    • To do this, first compute diag by using the .diagonal() method on user_matrix.
    • Then, using np.where(), select the indices where diag equals diag.max(). This returns a tuple, so make sure you access the relevant indices by indexing into the tuple with [0].
    • Iterate over indices and, for each index i, print the i-th element of people_nodes using the provided print() function.
  • Set the diagonal of user_matrix to zero and convert the matrix to "coordinate matrix format". This code has been provided for you in the answer.
  • Find pairs of users who shared membership in the most clubs.
    • Using np.where(), access the indices where users_coo.data equals users_coo.data.max().
    • Iterate over indices2 and, for each index idx, print the two people_nodes entries found at positions users_coo.row[idx] and users_coo.col[idx].
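The steps above can be sketched on a small, made-up dataset. Everything here is an assumption for illustration: the names toy_people, membership and toy_matrix are hypothetical, and the person-by-person matrix is built as the membership matrix times its transpose, which may differ from how user_matrix was built in the previous exercise.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy data: 3 people (rows) x 3 clubs (columns); 1 means "is a member".
toy_people = ['alice', 'bob', 'carol']
membership = csr_matrix(np.array([
    [1, 1, 1],   # alice belongs to all three clubs
    [1, 1, 0],   # bob belongs to two clubs
    [0, 0, 1],   # carol belongs to one club
]))

# Person-by-person matrix: entry (i, j) counts the clubs shared by persons i and j,
# so the diagonal holds each person's own number of memberships.
toy_matrix = membership @ membership.T

# People with the most memberships: largest diagonal entries.
diag = toy_matrix.diagonal()
most_clubs = [toy_people[i] for i in np.where(diag == diag.max())[0]]
print(most_clubs)

# Zero the diagonal so self-pairs are ignored, then switch to COO format,
# which exposes parallel .row, .col and .data arrays.
toy_matrix.setdiag(0)
toy_coo = toy_matrix.tocoo()

# Pairs with the most shared memberships: largest stored values.
top = np.where(toy_coo.data == toy_coo.data.max())[0]
pairs = [(toy_people[toy_coo.row[k]], toy_people[toy_coo.col[k]]) for k in top]
print(pairs)
```

Because the person-by-person matrix is symmetric, each pair shows up twice, once in each order.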

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import numpy as np

# Find out the names of people who were members of the most number of clubs
diag = ____ 
indices = np.where(____ == ____)[0]  
print('Number of clubs: {0}'.format(diag.max()))
print('People with the most number of memberships:')
for i in indices:
    print('- {0}'.format(____))

# Set the diagonal to zero and convert it to a coordinate matrix format
user_matrix.setdiag(0)
users_coo = user_matrix.tocoo()

# Find pairs of users who shared membership in the most number of clubs
indices2 = np.where(____ == ____)[0]
print('People with most number of shared memberships:')
for idx in indices2:
    print('- {0}, {1}'.format(people_nodes[____.____[____]], people_nodes[____.____[____]]))  