1. Introduction to Networks
Hi! My name is Eric, and I am a Data Scientist working at the intersection of biological network science and infectious disease. I'm thrilled to share with you my knowledge on how to do network analytics. I hope we'll have a fun time together!
2. Networks!
Let me first ask you a question: what are some examples of networks? Well, one example might be a social network! In a social network, we are modeling the relationships between people. Here's another one - transportation networks. In a transportation network, we are modeling the connectivity between locations, as determined by the roads or flight paths connecting them. At their core, networks are a useful tool for modeling relationships between entities.
3. Networks!
By modeling your data as a network, you can gain insight into which entities (or nodes) are important, such as broadcasters or influencers in a social network. Additionally, you can start to think about optimizing transportation between cities. Finally, you can leverage the network structure to find communities in the network. Let's get a bit more technical.
4. Network Structure
Networks are described by two sets of items: nodes
5. Network Structure
and edges. Together, these form a "network",
6. Network Structure
otherwise known in mathematical terms as a "graph". Nodes and edges can have metadata associated with them. For example, let's say there are two friends, Hugo and myself, who met on the 21st of May, 2016. In this case,
7. Network Structure
the nodes may be "Hugo" and myself, with metadata stored as key-value pairs under keys such as "id" and "age". The friendship is represented as a line between the two nodes, and may have metadata such as "date", which represents the date on which we first met. In the Python world,
8. NetworkX API Basics
there is a library called NetworkX that allows us to manipulate, analyze and model graph data. Let's see how we can use the NetworkX API to analyze graph data in memory. NetworkX is typically imported as nx. Using nx-dot-Graph, we can initialize an empty graph to which we can add nodes and add edges. I can add in the integers 1, 2, and 3 as nodes, using the add_nodes_from method, passing in the list [1, 2, 3] as an argument. The Graph object G has a dot-nodes method that allows us to see what nodes are present inside the graph, and returns a view of the nodes. If we add an edge between the nodes 1 and 2, we can then use the G-dot-edges method to return a view of the edges as tuples, in which each tuple shows the nodes that are present on that edge.
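The steps just described can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the slides; it assumes NetworkX 2.x or later, and the variable names node_list and edge_list are only for illustration.

```python
import networkx as nx

# Start from an empty undirected graph.
G = nx.Graph()

# Add the integers 1, 2 and 3 as nodes in a single call.
G.add_nodes_from([1, 2, 3])

# G.nodes() gives a view of the nodes; list() turns it into a plain list.
node_list = list(G.nodes())
print(node_list)  # [1, 2, 3]

# Connect nodes 1 and 2 with an edge.
G.add_edge(1, 2)

# G.edges() gives a view of the edges; each edge is a tuple of its two nodes.
edge_list = list(G.edges())
print(edge_list)  # [(1, 2)]
```

Wrapping the views in list() is optional; it just makes the printed output easier to read.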
9. NetworkX API Basics
Metadata can be stored on the graph as well. For example, I can add to the node 1 a 'label' key with the value 'blue', just as I would assign a value to the key of a dictionary. I can then retrieve the nodes with their metadata attached using G-dot-nodes, passing in the data equals True argument. What this returns is a view of 2-tuples, in which the first element of each tuple is the node, and the second element is a dictionary in which the key-value pairs correspond to my metadata. NetworkX also provides basic drawing functionality,
10. NetworkX API Basics
using the nx-dot-draw function. nx-dot-draw takes in a graph G as an argument. In the IPython shell, you will also have to call the plt-dot-show function in order to display the graph on screen. The nx-dot-draw function renders the graph as what we call a node-link diagram: nodes are drawn as points, and edges as lines connecting them.
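Here is a small sketch of the metadata and drawing steps, again assuming NetworkX 2.x or later. The friends graph reuses the Hugo example from earlier; the helper name show_graph is hypothetical, and matplotlib is imported inside it so that the rest of the sketch runs even where matplotlib is not installed.

```python
from datetime import date

import networkx as nx

G = nx.Graph()
G.add_nodes_from([1, 2, 3])
G.add_edge(1, 2)

# Node metadata is assigned like a dictionary entry, keyed by the node.
G.nodes[1]["label"] = "blue"

# data=True pairs each node with its metadata dictionary.
nodes_with_data = list(G.nodes(data=True))
print(nodes_with_data)  # [(1, {'label': 'blue'}), (2, {}), (3, {})]

# Edge metadata works the same way: here, the date two friends first met.
friends = nx.Graph()
friends.add_edge("Hugo", "Eric", date=date(2016, 5, 21))
print(friends.edges["Hugo", "Eric"]["date"])  # 2016-05-21


def show_graph(graph):
    """Draw a node-link diagram of the graph (hypothetical helper)."""
    import matplotlib.pyplot as plt  # only needed for drawing

    nx.draw(graph)
    plt.show()
```

Calling show_graph(G) in the IPython shell should open a window with three points, two of them joined by a line. Node positions are chosen by the default layout, so the picture may look different from one run to the next.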
11. Let's practice!
The first set of exercises we'll be doing here is essentially exploratory data analysis on graphs. Alright, let's go on and take a look at the exercises!