Exercise

Find shared membership: Transposition

As you may have observed, converting a graph to a sparse matrix representation discards the node metadata. You're now going to learn how to recover that metadata so that you can find out more about shared membership.

The user_matrix you computed in the previous exercise has been preloaded into your workspace.

Here, the np.where() function will prove useful. This is what it does: given a NumPy array, say, a = np.array([1, 5, 9, 5]), if you want to get the indices where the value is equal to 5, you can use idxs = np.where(a == 5). This gives you back an array in a tuple, (array([1, 3]),). To access those indices, index into the tuple as idxs[0]. Note that a must be a NumPy array rather than a plain Python list; otherwise a == 5 evaluates to a single False instead of an element-wise comparison.
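As a minimal sketch of the behaviour described above (the array values are just the illustrative ones from the text):

```python
import numpy as np

# The comparison is element-wise only because `a` is a NumPy array.
a = np.array([1, 5, 9, 5])

# np.where() with a single condition returns a tuple holding one index
# array per dimension; `a` is 1-D, so the tuple has one element.
idxs = np.where(a == 5)

# Index into the tuple to get the positions themselves.
positions = idxs[0]
print(positions)  # [1 3]
```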

Instructions
100 XP
  • Find the names of the people who were members of the largest number of clubs.
    • To do this, first compute diag by using the .diagonal() method on user_matrix.
    • Then, using np.where(), select the indices where diag equals diag.max(). This returns a tuple, so make sure you access the relevant indices by indexing into the tuple with [0].
    • Iterate over indices and, for each index i, print people_nodes[i] using the provided print() function.
  • Set the diagonal of user_matrix to zero and convert the matrix to coordinate (COO) format. This code has been provided for you in the answer.
  • Find the pairs of users who shared membership in the largest number of clubs.
    • Using np.where(), access the indices where users_coo.data equals users_coo.data.max().
    • Iterate over indices2 and, for each index idx, print people_nodes[users_coo.row[idx]] and people_nodes[users_coo.col[idx]].