1. Degree centrality
Amazing work so far! You're progressing well, and I hope you've also been having fun with the coding challenges! We're now in Chapter 2, in which we'll explore ways of identifying nodes that are important in a network.
2. Important nodes
How might we tell what an important node is? In this chapter, there will be two concepts that I will introduce you to. The first is called "degree centrality", and the second is called "betweenness centrality". Let's take a look at
3. Important nodes
the following two "star graphs". If you looked at the center nodes, which center node of the two graphs might be considered "more important"?
4. Important nodes
If you chose the left graph, then you'd be very justified in doing so! In the majority of situations, we would consider the left graph's center node to be more important than the right graph's center node, because it is connected to more nodes. Being connected to other nodes means other nodes are considered a neighbor of that node. From the concept of neighbors, we can now introduce the concept of
5. Degree centrality
"degree centrality". The degree centrality metric is one of many metrics we can use to evaluate the importance of a node, and is simply defined as the number of neighbors that a node has divided by the total number of neighbors that the node could possibly have. There are two scenarios possible here: If self-loops are allowed, such as in a network mapping of all bike trips in a bike sharing system, then the number of neighbors that I could possibly have is every single node in the graph, including myself. On the other hand, if self-loops are not allowed, such as in the Twitter social network where, by definition, my account cannot follow itself, then the number of neighbors I could possibly have is every other node in the graph, excluding myself. In real life, examples of nodes in a graph that have high degree centrality might be: Twitter broadcasters, that is, users that are followed by many other users; or airport transportation hubs such as the New York, London or Tokyo airports; and finally, disease super-spreaders, who are the individuals that epidemiologists would want to track down to help stop the spread of a disease.
6. Number of neighbors
Suppose we had a star graph, in which node 1 was connected to every other node. NetworkX graphs expose a G-dot-neighbors method, which we can pass into a list constructor to get back a list of neighbors for a node. For example, if I do G-dot-neighbors(1), I will get back the neighbors of node 1. Likewise for node 8. If I pass in a node that doesn't exist in the graph, I'll get back a long error trace with a final line that reads, "NetworkX Error: The node [n] is not in the graph". NetworkX also provides
7. Degree centrality
the degree_centrality function, which takes in a graph object as an argument, and returns a dictionary in which the key is the node and the value is the degree centrality score for that node. With the degree centrality function, self-loops are not considered.
8. Let's practice!
Alrighty! Now that we've gone through the concepts and some toy code, let's move onto the exercises!