1. Importing and visualizing Twitter networks
Now that you have a good idea of the different kinds of networks in Twitter data, we can begin importing data into Python and constructing and visualizing networks.
2. Edge Lists
One of the most common ways to construct a network is from what's called an *edge list*, which is a list of all the edges between nodes. Luckily, the way we've structured our tweet datasets makes the edge lists easy to retrieve.
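To make the idea concrete, here is a minimal sketch of an edge list built in plain Python. The records and the field names `user` and `retweeted_user` are hypothetical stand-ins for flattened tweets, not the actual dataset fields.

```python
# Hypothetical, hand-made records standing in for flattened tweets.
tweets = [
    {"user": "alice", "retweeted_user": "bob"},
    {"user": "carol", "retweeted_user": "bob"},
    {"user": "bob", "retweeted_user": None},  # an original tweet: no edge
    {"user": "alice", "retweeted_user": "carol"},
]

# Each edge is a (source, target) pair; tweets that are not retweets are skipped.
edge_list = [
    (t["user"], t["retweeted_user"])
    for t in tweets
    if t["retweeted_user"] is not None
]

print(edge_list)
```

Every pair in `edge_list` describes one connection, which is all a network library needs to rebuild the graph.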
3. Importing a retweet network
We first import the `networkx` package, a full-featured package for network analysis in Python. We'll then flatten the tweets and load them into a pandas data frame. Lastly, we use the networkx function `from_pandas_edgelist` to construct a network. This function takes four arguments: the data frame; the source, which in this case is the user's screen name; the target, which is the retweeted user's screen name; and lastly `create_using`, to specify that this is a directed graph. We specify this by passing in a `DiGraph` object.
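The steps above might look roughly like the sketch below. The data frame is built by hand rather than from real flattened tweets, and the column names `user-screen_name` and `retweeted_status-user-screen_name` are assumptions about how the flattened fields are named.

```python
import networkx as nx
import pandas as pd

# Hypothetical flattened tweets; the column names are assumptions.
tweets_df = pd.DataFrame({
    "user-screen_name": ["alice", "carol", "dave"],
    "retweeted_status-user-screen_name": ["bob", "bob", "alice"],
})

# Build a directed graph: each edge points from the retweeter to the retweeted user.
G_rt = nx.from_pandas_edgelist(
    tweets_df,
    source="user-screen_name",
    target="retweeted_status-user-screen_name",
    create_using=nx.DiGraph(),
)

print(G_rt.number_of_nodes(), G_rt.number_of_edges())
```

Because the graph is directed, an edge from `alice` to `bob` does not imply an edge from `bob` to `alice`.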
4. Importing a quoted network
Importing a quoted status network is very similar. We flatten the tweets and convert the JSON to a data frame like before. The main change is that we replace the target argument with the quoted status's user screen name.
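A sketch of the quoted network under the same assumptions follows. The column name `quoted_status-user-screen_name` is assumed, and dropping rows without a quoted status is an extra step added here so that missing values do not end up as a node.

```python
import networkx as nx
import pandas as pd

# Hypothetical flattened tweets; bob's tweet does not quote anyone.
tweets_df = pd.DataFrame({
    "user-screen_name": ["alice", "bob", "carol"],
    "quoted_status-user-screen_name": ["bob", None, "alice"],
})

# Keep only the rows that actually quote another tweet (an added precaution).
quotes_df = tweets_df.dropna(subset=["quoted_status-user-screen_name"])

# Same call as for retweets; only the target column changes.
G_quote = nx.from_pandas_edgelist(
    quotes_df,
    source="user-screen_name",
    target="quoted_status-user-screen_name",
    create_using=nx.DiGraph(),
)

print(sorted(G_quote.edges()))
```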
5. Importing a reply network
Importing a reply network follows the same pattern, with one change. We use a new field as the target argument: `in_reply_to_screen_name`.
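For completeness, a reply network could be sketched the same way. The data is again hand-made, and `user-screen_name` is an assumed column name; only `in_reply_to_screen_name` comes from the text above.

```python
import networkx as nx
import pandas as pd

# Hypothetical flattened tweets that are all replies.
replies_df = pd.DataFrame({
    "user-screen_name": ["alice", "bob", "carol"],
    "in_reply_to_screen_name": ["bob", "carol", "bob"],
})

# Each edge points from the replying user to the user being replied to.
G_reply = nx.from_pandas_edgelist(
    replies_df,
    source="user-screen_name",
    target="in_reply_to_screen_name",
    create_using=nx.DiGraph(),
)

print(sorted(G_reply.edges()))
```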
6. Visualization
Visualization is a standard part of exploratory data analysis, and networks are no exception. With this randomly generated network stored in the variable `T`, we use the function `draw_networkx` to visualize the network. We also turn off the axes in `matplotlib` so we don't have axis lines around it. The default visualization has red nodes (recent networkx releases draw them in blue) and displays labels for each of the nodes. Nodes are also all the same size. We can change each of those properties, however.
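A minimal sketch of the default drawing is shown below. The random graph generator, its parameters, and the output file name are assumptions chosen for illustration; the text does not say how `T` was generated.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Assumed stand-in for the randomly generated network T.
T = nx.gnp_random_graph(20, 0.15, seed=42)

fig, ax = plt.subplots()

# Default drawing: labeled nodes, all the same size.
nx.draw_networkx(T, ax=ax)

# Turn off the axes so no axis lines surround the network.
ax.axis("off")

fig.savefig("network_default.png")  # hypothetical output file name
```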
7. Visualization options
In this version of the visualization, we have set a few more arguments in the `draw_networkx` function. We have changed the size of the nodes by using a list comprehension to get the degree of each node, that is, the number of edges it has, and using those values to create a list of sizes. We also removed the labels by setting the `with_labels` argument to False. We changed the transparency of the nodes with the `alpha` argument. Lastly, we changed the width of the edges with the `width` argument.
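The options described above could be combined as in this sketch. The graph, the scaling factor of 50, and the specific `alpha` and `width` values are illustrative assumptions, not the values used on the slide.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Assumed stand-in for the randomly generated network T.
T = nx.gnp_random_graph(20, 0.15, seed=42)

# One size per node, proportional to its degree (the factor 50 is arbitrary).
sizes = [T.degree(node) * 50 for node in T]

fig, ax = plt.subplots()
nx.draw_networkx(
    T,
    ax=ax,
    node_size=sizes,    # scale nodes by degree
    with_labels=False,  # hide node labels
    alpha=0.7,          # make nodes semi-transparent
    width=0.5,          # draw thinner edges
)
ax.axis("off")

fig.savefig("network_options.png")  # hypothetical output file name
```

Note that a node with no edges gets a size of zero here and will not be visible; adding a small constant to each size is one way around that.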
8. Circular layout
Oftentimes, plotting a large network with the default layout can take a long time to calculate. While the default layout can show which nodes are more important, we can use a circular layout to illustrate the density of edges. First, we calculate the positions with the `circular_layout` function and store them in their own variable, `circle_pos`. Then, we set the `pos` argument to that variable.
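The two steps can be sketched as follows, again on an assumed random graph. Only `circular_layout`, `circle_pos`, and the `pos` argument come from the text; the other drawing options and the file name are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Assumed stand-in for the randomly generated network T.
T = nx.gnp_random_graph(20, 0.15, seed=42)

# Step 1: compute a position for every node, evenly spaced on a circle.
circle_pos = nx.circular_layout(T)

# Step 2: pass those positions to the drawing function.
fig, ax = plt.subplots()
nx.draw_networkx(T, pos=circle_pos, ax=ax, with_labels=False, node_size=50)
ax.axis("off")

fig.savefig("network_circular.png")  # hypothetical output file name
```

`circle_pos` is a dictionary mapping each node to its coordinates, so the same positions can be reused across several drawings.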
9. Let's practice!
Let's practice importing and visualizing networks in the following exercises.