Exercise

Computing Pearson similarity

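Recall that a correlation matrix measures pairwise similarity: each entry is the correlation coefficient between two variables. The coefficient runs from -1 (perfect negative correlation, or maximum dissimilarity) to 1 (perfect positive correlation, or maximum similarity), and values close to 0 indicate no linear correlation.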

You can also use correlation matrices to find similarities between the nodes in the network.

The general idea is to associate each node with its column in the adjacency matrix. The similarity of two nodes is then measured as the correlation coefficient between the node columns.
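
As a rough illustration of this idea, the sketch below computes the similarity of two nodes in a small made-up graph, once from the definition of the Pearson coefficient and once with cor(). The matrix A_toy and the names x, y, r_manual, r_builtin and S_toy are hypothetical and are not part of the exercise.

```r
# Hypothetical 4-node undirected graph (not the exercise's matrix A).
# Edges: 1-2, 1-3, 2-3, 3-4
A_toy <- matrix(c(0, 1, 1, 0,
                  1, 0, 1, 0,
                  1, 1, 0, 1,
                  0, 0, 1, 0),
                nrow = 4, byrow = TRUE)

# Each node is represented by its column in the adjacency matrix
x <- A_toy[, 1]  # node 1
y <- A_toy[, 4]  # node 4

# Pearson correlation written out from its definition:
# covariance of x and y divided by the product of their spreads
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# The same quantity from the built-in function
r_builtin <- cor(x, y, method = "pearson")

# cor() applied to the whole matrix correlates every pair of columns at once
S_toy <- cor(A_toy)
```

Here S_toy[1, 4] holds the same value as r_manual, so the full similarity matrix is simply every pairwise column correlation collected in one place.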

Here we will use the Pearson correlation coefficient, which is the most common method of calculation.

For convenience, the adjacency matrix, A, has been created as a dense (non-sparse) matrix.

Instructions
  • Use cor() to compute the Pearson correlation between the columns of the adjacency matrix A and save it to S.
  • Remove self-similarity from S by setting the diagonal of S to 0 with diag().
  • Flatten S to be a vector using as.vector(), assigning to flat_S.
  • Plot a histogram of the similarities in matrix S with hist().
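
The steps above might look roughly like the following sketch, which runs on a small made-up ring graph rather than on the exercise's A. The names A_demo, S_demo, flat_S_demo and h, along with the plot labels, are assumptions chosen for illustration.

```r
# Hypothetical 8-node ring graph: node i is linked to its two neighbours
idx <- 0:7
A_demo <- outer(idx, idx, function(i, j) as.numeric((i - j) %% 8 %in% c(1, 7)))

# Add one chord between nodes 1 and 5 so the columns are not all alike
A_demo[1, 5] <- 1
A_demo[5, 1] <- 1

# Pearson correlation between every pair of columns
S_demo <- cor(A_demo, method = "pearson")

# Every node correlates perfectly with itself; zero the diagonal to drop that
diag(S_demo) <- 0

# Flatten the matrix (column by column) into a plain numeric vector
flat_S_demo <- as.vector(S_demo)

# Histogram of the pairwise similarities
h <- hist(flat_S_demo, xlab = "Similarity", main = "Similarity between nodes")
```

Note that a node whose column is constant (for example, an isolated node) has zero variance, so cor() would return NA for it with a warning; the toy graph above avoids that case.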