Find shared membership: Transposition
As you may have observed, you lose the metadata from a graph when you go to a sparse matrix representation. You're now going to learn how to impute the metadata back so that you can learn more about shared membership.
The user_matrix you computed in the previous exercise has been preloaded into your workspace.
Here, the np.where() function will prove useful. This is what it does: given a NumPy array, say, a = np.array([1, 5, 9, 5]), if you want to get the indices where the value is equal to 5, you can use idxs = np.where(a == 5). This gives you back an array in a tuple, (array([1, 3]),). To access those indices, you would want to index into the tuple as idxs[0]. Note that a must be a NumPy array: on a plain Python list, a == 5 evaluates to a single False rather than an element-wise comparison.
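As a quick check of the np.where() behavior described above, here is a minimal sketch. The array values are only illustrative, and it assumes a is a NumPy array so that the comparison is element-wise.

```python
import numpy as np

# Illustrative data: a NumPy array, so `a == 5` compares element by element
a = np.array([1, 5, 9, 5])

# With a single condition argument, np.where returns a tuple with one
# index array per dimension; `a` is 1-D, so the tuple has one element
idxs = np.where(a == 5)
print(idxs)      # (array([1, 3]),)

# Index into the tuple to get the positions themselves
print(idxs[0])   # [1 3]
```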
This exercise is part of the course Intermediate Network Analysis in Python.
Exercise instructions
- Find out the names of the people who were members of the most clubs.
  - To do this, first compute diag by using the .diagonal() method on user_matrix.
  - Then, using np.where(), select those indices where diag equals diag.max(). This returns a tuple: make sure you access the relevant indices by indexing into the tuple with [0].
  - Iterate over indices and, for each index i, print people_nodes[i] using the provided print() function.
- Set the diagonal of user_matrix to zero and convert the matrix to "coordinate matrix format". This code has been provided for you.
- Find pairs of users who shared membership in the most clubs.
  - Using np.where(), access the indices where users_coo.data equals users_coo.data.max().
  - Iterate over indices2 and, for each index idx, print the entries of people_nodes at users_coo.row[idx] and users_coo.col[idx].
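The steps above can be tried on a toy example. The sketch below is not the course dataset: the names in people_nodes and the membership matrix are hypothetical, and it assumes user_matrix is the product of a people-by-clubs membership matrix with its transpose, so that the diagonal counts each person's clubs and the off-diagonal entries count shared clubs.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical people and memberships (rows: people, columns: clubs)
people_nodes = ['alice', 'bob', 'carol', 'dave']
membership = csr_matrix(np.array([
    [1, 1, 1],  # alice belongs to three clubs
    [1, 1, 0],  # bob belongs to two
    [0, 1, 1],  # carol belongs to two
    [0, 0, 1],  # dave belongs to one
]))

# Assumed construction: entry (i, j) counts clubs shared by persons i and j
user_matrix = membership @ membership.T

# People with the most memberships: the largest diagonal entries
diag = user_matrix.diagonal()
indices = np.where(diag == diag.max())[0]
top_members = [people_nodes[i] for i in indices]
print(top_members)

# Zero the diagonal so self-pairs are ignored, then switch to COO format,
# which exposes parallel row, col and data arrays
user_matrix.setdiag(0)
users_coo = user_matrix.tocoo()

# Pairs with the most shared memberships: the largest remaining entries
indices2 = np.where(users_coo.data == users_coo.data.max())[0]
pairs = {
    tuple(sorted((people_nodes[users_coo.row[idx]],
                  people_nodes[users_coo.col[idx]])))
    for idx in indices2
}
print(sorted(pairs))
```

Because the matrix is symmetric, each pair shows up twice in the COO arrays, once as (row, col) and once as (col, row); the sketch sorts each pair and collects them in a set to remove the repeats.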
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import numpy as np
# Find out the names of people who were members of the most number of clubs
diag = ____
indices = np.where(____ == ____)[0]
print('Number of clubs: {0}'.format(diag.max()))
print('People with the most number of memberships:')
for i in indices:
    print('- {0}'.format(____))
# Set the diagonal to zero and convert it to a coordinate matrix format
user_matrix.setdiag(0)
users_coo = user_matrix.tocoo()
# Find pairs of users who shared membership in the most number of clubs
indices2 = np.where(____ == ____)[0]
print('People with most number of shared memberships:')
for idx in indices2:
    print('- {0}, {1}'.format(people_nodes[____.____[____]], people_nodes[____.____[____]]))