Finding communities

1. Finding communities

One of the most common things you'll want to do with this type of social data is find "communities". Communities are a natural way to think about a mention graph: you're trying to algorithmically find groups of people who talk to each other more than they talk to people in any other group. Recall from the first lesson that there are a number of methods for doing this. Here we're going to focus on how you can understand the differences between these methods.

2. Three different communities

Let's find communities in our mention graph using three different methods: edge betweenness, leading eigenvector, and label propagation.

3. Sizing the communities

First we're going to look at the number and sizes of the communities that were found. The length of each community object tells us the number of communities found. In this case, label propagation found the largest number of communities (212).

4. Sizing the communities (2)

We can use the sizes() function to get the size of each community, and the table() function to count how many communities there are of each size. In this case each algorithm found 103 communities of size 2, but then things start to diverge. The label propagation algorithm found 19 communities of size 5, but the other two found only 7. So how can we compare these?
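To make the counting concrete, here is a minimal sketch in plain Python rather than the lesson's own sizes() and table() calls. It assumes a community result can be reduced to a membership vector, one community id per vertex; the vector and the helper names `community_sizes` and `size_table` are made up for illustration.

```python
from collections import Counter

# Hypothetical membership vector: membership[i] is the community id of vertex i.
membership = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]

def community_sizes(membership):
    """Map each community id to its number of members."""
    return Counter(membership)

def size_table(membership):
    """Map each community size to the number of communities of that size."""
    return Counter(community_sizes(membership).values())

sizes = community_sizes(membership)
n_communities = len(sizes)          # number of communities found
table = size_table(membership)      # how many communities of each size

print(n_communities)
print(sorted(table.items()))
```

Running the same two helpers on the membership vector from each algorithm gives the per-size counts that you would line up side by side.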

5. Comparing communities

The compare() function allows us to measure the similarity between two community structures. Here we're using the variation of information metric. Essentially, it measures how much the community membership of each vertex differs between the two results. The closer the number is to 0, the more similar the two community structures are, with 0 meaning that both algorithms group the vertices identically. In this case, the leading eigenvector method and label propagation are the least similar.
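If you want to see what the variation of information is actually measuring, the sketch below computes it by hand in plain Python. It is not the compare() implementation; it assumes each community structure is given as a membership vector over the same vertices, and the use of the natural logarithm is an assumption of this sketch. The function name `variation_of_information` is made up for illustration.

```python
from collections import Counter
from math import log

def variation_of_information(a, b):
    """Variation of information between two membership vectors.

    VI = H(A|B) + H(B|A), computed from the joint counts of
    (community in a, community in b) over all vertices.
    """
    if len(a) != len(b):
        raise ValueError("membership vectors must cover the same vertices")
    n = len(a)
    count_a = Counter(a)            # community sizes in the first structure
    count_b = Counter(b)            # community sizes in the second structure
    count_ab = Counter(zip(a, b))   # sizes of the pairwise overlaps
    vi = 0.0
    for (x, y), n_xy in count_ab.items():
        p_xy = n_xy / n
        # log(p_xy / p_x) + log(p_xy / p_y), written with raw counts
        vi -= p_xy * (log(n_xy / count_a[x]) + log(n_xy / count_b[y]))
    return vi

# Same grouping with different labels: the structures are identical.
same = variation_of_information([0, 0, 1, 1], [5, 5, 9, 9])

# Groupings that cut across each other: the structures disagree.
different = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])

print(same, different)
```

Notice that only the grouping matters, not the label values, which is why relabeled but otherwise identical communities score 0.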

6. Plotting community structure

Now we'll just look at the leading eigenvector communities. First we'll get the names of the communities with 45 or more members. Next we'll create a subgraph containing only those communities, and finally we'll plot our subgraph, coloring each vertex by community membership.
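The filtering steps just described can be sketched without any graph library. This is a rough illustration in plain Python, assuming the graph is an edge list and the membership is a dictionary from vertex name to community id; the toy data, the helper name `large_community_subgraph`, and the threshold of 3 (standing in for the lesson's 45) are all made up.

```python
from collections import Counter

def large_community_subgraph(edges, membership, min_size):
    """Keep only vertices in communities with at least min_size members,
    plus the edges whose two endpoints are both kept."""
    sizes = Counter(membership.values())
    big = {c for c, size in sizes.items() if size >= min_size}
    keep = {v for v, c in membership.items() if c in big}
    sub_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, sub_edges

# Hypothetical toy mention graph and community membership.
membership = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3, "g": 3, "h": 3}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"),
         ("f", "g"), ("g", "h"), ("c", "f")]

vertices, sub_edges = large_community_subgraph(edges, membership, min_size=3)

# One color key per remaining vertex: its community id.
vertex_color = {v: membership[v] for v in sorted(vertices)}

print(sorted(vertices))
print(sub_edges)
print(vertex_color)
```

The `vertex_color` mapping is what you would hand to a plotting routine so that each community gets its own color.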

7. Plotting community structure (2)

Our plot shows four distinct communities, more or less colored how we might expect.

8. Let's practice!

Great, now that you've seen how to compare communities and visualize subgraphs, let's do some more detailed comparisons and build some more complex subgraphs.