Twitter network analysis

1. Twitter network analysis

Network analysis of Twitter data is a useful way to uncover interdependencies between Twitter users.

2. Lesson overview

In this lesson, we will learn the core concepts of networks and how they apply to social media. We will also create a retweet network for a popular hashtag.

3. Network and network analysis

A network is a system of interconnected objects. Two classic examples are a local area network of computers and a social media platform.

4. Network and network analysis

Network analysis is the process of mapping the objects in a network to understand the interdependencies and information flow between them.

5. Components of a network

The two main components of a network are nodes and edges. The objects that are interconnected in a network are called nodes or vertices. In a Twitter network, every user is a node, or vertex.

6. Components of a network

The connections between these objects are called edges. How the users are connected determines the edges.

7. Directed vs undirected network

There are two broad types of networks based on information flow: directed and undirected. When each edge of a network points in one direction, from one node to another, the network is called a directed network. Here, the relationship between two nodes only works in one direction.

8. Directed vs undirected network

When the edges of a network have no direction, the network is an undirected network. The relationship between nodes works in both directions in such networks.
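The difference can be seen on a tiny example. The following is a minimal sketch, assuming the igraph package is installed; the user names are hypothetical and only serve as illustration.

```r
library(igraph)

# Hypothetical users; each row is one edge from column 1 to column 2
el <- matrix(c("ana", "ben",
               "ben", "cy"),
             ncol = 2, byrow = TRUE)

g_dir   <- graph_from_edgelist(el, directed = TRUE)
g_undir <- graph_from_edgelist(el, directed = FALSE)

is_directed(g_dir)    # TRUE
is_directed(g_undir)  # FALSE

# In the directed graph, the edge ana -> ben does not imply ben -> ana
are_adjacent(g_dir, "ana", "ben")    # TRUE
are_adjacent(g_dir, "ben", "ana")    # FALSE

# In the undirected graph, the same edge works in both directions
are_adjacent(g_undir, "ben", "ana")  # TRUE
```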

9. Applications in social media

Twitter users tweet, like, follow, and retweet, creating complex network structures. Analyzing the structure and size of these networks helps identify the key players and influencers who are pivotal to transmitting information to a wide audience.

10. Retweet network

A retweet network is a network of Twitter users who retweet tweets posted by other users. It is a directed network in which the source vertex is the user who retweets and the target vertex is the user who posted the original tweet. Understanding where potential customers sit in a retweet network allows a brand to identify key players who are likely to retweet its posts and spread brand messaging.

11. Retweet network of #OOTD

Let's create a retweet network of users who retweet on the hashtag #OOTD, which stands for "Outfit Of The Day". This hashtag is popular among young users for showing off their outfits, and fashion brands can use it to grab the attention of potential customers.

12. Create the tweet data frame

First, we extract 18,000 tweets containing the hashtag #OOTD using search_tweets() from the rtweet package, with retweets included.
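The extraction step might look like the sketch below. It assumes the rtweet package is installed and that Twitter API access has already been authenticated; the wrapper function `fetch_ootd_tweets()` and the variable name `twts_OOTD` are hypothetical names chosen for this illustration.

```r
# Sketch only: running the search requires the rtweet package and
# authenticated Twitter API access, neither of which is set up here.
fetch_ootd_tweets <- function(n = 18000) {
  if (!requireNamespace("rtweet", quietly = TRUE)) {
    stop("The rtweet package is required for this step.")
  }
  # include_rts = TRUE keeps retweets, which the retweet network depends on
  rtweet::search_tweets("#OOTD", n = n, include_rts = TRUE)
}

# twts_OOTD <- fetch_ootd_tweets()  # uncomment once API access is configured
```

The column names used later in this lesson (screen_name, retweet_screen_name) depend on the rtweet version, so it is worth checking names() on the returned data frame.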

13. Create data frame for the network

Next, we create a data frame containing only the screen_name and retweet_screen_name columns of the extracted tweets. For the retweet network, the source vertex is the screen_name and the target vertex is the retweet_screen_name. Rows for tweets that are not retweets have NA under retweet_screen_name, so we will exclude these rows before proceeding.
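As a minimal sketch of this subsetting step, the example below uses a small made-up data frame in place of the extracted tweets. The users, the tweet text, and the names `twts_OOTD` and `rtwt_df` are hypothetical.

```r
# Toy stand-in for the extracted tweets (hypothetical users and text)
twts_OOTD <- data.frame(
  screen_name         = c("ana", "ben", "cy", "dee"),
  retweet_screen_name = c("brand_x", NA, "brand_x", "ana"),
  text                = c("RT look one", "my outfit", "RT look one", "RT look two"),
  stringsAsFactors    = FALSE
)

# Keep only the source (retweeter) and target (original poster) columns
rtwt_df <- twts_OOTD[, c("screen_name", "retweet_screen_name")]
head(rtwt_df)
```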

14. Include only retweets in the data frame

To remove rows with NA values, we use the complete.cases() function. This function takes the data frame as input and returns a logical vector marking the rows that have no NA values; subsetting the data frame with this vector retains only those rows.
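A minimal sketch of the filtering step on made-up data follows; the data frame contents and the names `keep` and `rtwt_df_new` are hypothetical.

```r
# Toy data frame: the second row is an original tweet, so its target is NA
rtwt_df <- data.frame(
  screen_name         = c("ana", "ben", "cy", "dee"),
  retweet_screen_name = c("brand_x", NA, "brand_x", "ana"),
  stringsAsFactors    = FALSE
)

# Logical vector: TRUE for rows with no NA in any column
keep <- complete.cases(rtwt_df)
keep  # TRUE FALSE TRUE TRUE

# Subset the rows, keeping all columns
rtwt_df_new <- rtwt_df[keep, ]
```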

15. Convert data frame to a matrix

To create the network, we need the edge list as a two-column matrix, with one row per edge. The as.matrix() function converts the data frame to a matrix.
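The conversion can be sketched as follows, again on made-up data; the name `matrx` is a hypothetical choice for the resulting matrix.

```r
# Toy data frame of retweets only (hypothetical users)
rtwt_df_new <- data.frame(
  screen_name         = c("ana", "cy", "dee"),
  retweet_screen_name = c("brand_x", "brand_x", "ana"),
  stringsAsFactors    = FALSE
)

# Each row of the matrix is one edge: source in column 1, target in column 2
matrx <- as.matrix(rtwt_df_new)

dim(matrx)           # 3 2
is.character(matrx)  # TRUE
```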

16. Create the retweet network

We are now ready to create the retweet network using graph_from_edgelist() from the igraph package. This function takes two arguments: the edge list, el, which we set to the matrix, and directed, which we set to TRUE to create a directed network.
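A minimal sketch of this call on a three-edge toy matrix, assuming igraph is installed; the users and the name `nw_rtweet` are hypothetical.

```r
library(igraph)

# Toy edge list: retweeter in column 1, original poster in column 2
matrx <- matrix(c("ana", "brand_x",
                  "cy",  "brand_x",
                  "dee", "ana"),
                ncol = 2, byrow = TRUE)

# directed = TRUE makes each edge point from retweeter to original poster
nw_rtweet <- graph_from_edgelist(el = matrx, directed = TRUE)

is_directed(nw_rtweet)  # TRUE
```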

17. View the retweet network

We use the print.igraph() function to view the retweet network.

18. View the retweet network

Here, DN in the header indicates a directed (D) network with named (N) vertices. The number of edges and vertices are 4100 and 4616 respectively. Each edge is listed as a source and target vertex name separated by an arrow. For example, "MaikielYungin" is the source vertex and "ZingletC" is the target vertex in the first row. We have now successfully created a retweet network. In the next lesson, we will identify key players in this network using network centrality measures.
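To inspect a network without reading the printed header by eye, the counts can be queried directly. The sketch below assumes igraph is installed and uses a hypothetical toy network rather than the #OOTD data. In igraph's printed header, the flags (such as DN) are followed by the vertex count and then the edge count.

```r
library(igraph)

# Toy retweet network (hypothetical users)
matrx <- matrix(c("ana", "brand_x",
                  "cy",  "brand_x",
                  "dee", "ana"),
                ncol = 2, byrow = TRUE)
nw_rtweet <- graph_from_edgelist(matrx, directed = TRUE)

# print() dispatches to print.igraph() for igraph objects
print(nw_rtweet)

vcount(nw_rtweet)  # number of vertices: 4
ecount(nw_rtweet)  # number of edges: 3

# Source and target names of every edge, one row per edge
as_edgelist(nw_rtweet)
```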

19. Let's practice!

Let's practice by creating a retweet network for the hashtag #travel!