
Components of Twitter data

1. Components of Twitter data

In this lesson, we will look at the components of Twitter data and understand how they can be used to derive insights.

2. Lesson Overview

We start with an introduction to Twitter JSON and how the components of Twitter metadata are extracted from the JSON. We will then use some of these components to derive insights.

3. Twitter JSON

A tweet can have over 150 metadata components. Twitter APIs return tweets and their components as "JavaScript Object Notation" or JSON objects.

4. JSON attributes and values

JSON uses named attributes and values to describe tweets and their components. Attributes describe metadata associated with a tweet. For example, the attribute screen_name stores a user's Twitter handle as its value.

5. Converting JSON to a data frame

The Twitter JSON is converted into a data frame by the rtweet package. The JSON attributes become column names, and their values become the corresponding values in the data frame. The tweet and its components in the data frame can then be used for analysis.
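As a rough illustration of this mapping, the sketch below uses a hand-written nested list to stand in for a parsed tweet JSON object and flattens a few of its attributes into a one-row data frame. The tweet contents and the helper tweet_to_row() are hypothetical; rtweet performs this conversion internally, and its actual implementation may well differ.

```r
# A hypothetical tweet, written as the nested list a JSON parser might produce.
tweet <- list(
  text = "Matchday! #Arsenal",
  retweet_count = 12L,
  user = list(screen_name = "example_fan", followers_count = 340L)
)

# Hypothetical helper: map selected JSON attributes to data frame columns.
tweet_to_row <- function(tw) {
  data.frame(
    screen_name = tw$user$screen_name,
    followers_count = tw$user$followers_count,
    retweet_count = tw$retweet_count,
    text = tw$text,
    stringsAsFactors = FALSE
  )
}

tweet_df <- tweet_to_row(tweet)

# The attribute names are now column names.
names(tweet_df)
```

Each attribute name ends up as a column name, and each attribute value as the cell in that column.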

6. Viewing components of tweets

To view the tweet components that we had in the JSON, let's extract tweets on "#brexit" using search_tweets() and view the column names using names().

7. Viewing components of tweets

We can see 90 column names which include the tweet and its components. We will use various components throughout the course to draw insights.
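The sketch below shows the shape of this step. The search_tweets() call is left as a comment because it needs rtweet and configured API access; the value n = 100 is an assumption. The small data frame is a made-up stand-in with only four of the columns, so names() returns far fewer entries than the real result would.

```r
# With rtweet installed and API access configured, the call would look like:
#   library(rtweet)
#   tweets_df <- search_tweets("#brexit", n = 100)  # n = 100 is an assumed value

# Hypothetical stand-in with a handful of the columns discussed in this lesson.
tweets_df <- data.frame(
  screen_name = c("user_a", "user_b"),
  text = c("First example tweet #brexit", "Second example tweet #brexit"),
  retweet_count = c(3L, 0L),
  followers_count = c(150L, 42L),
  stringsAsFactors = FALSE
)

# List the available components (column names).
names(tweets_df)
```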

8. Exploring components

In this lesson, we will focus on four components: screen_name to understand user interest, followers_count to compare social media influence, and retweet_count and text to identify popular tweets.

9. User interest and tweet counts

screen_name refers to the Twitter handle of a user. Based on the number of tweets posted, we can identify people interested in a topic and promote products to them. Let's find die-hard fans of the soccer club "Arsenal" based on their tweet counts.

10. User interest and tweet counts

First, we extract tweets on "#Arsenal" using search_tweets(). We then use the table() function to build a frequency table that counts the number of tweets posted by each screen_name. The printed table shows the screen names in the first row and the tweet counts in the second row.

11. User interest and tweet counts

Next, we sort the table in descending order of tweet counts and view the top 6 users who have tweeted the most. As you can see, the user "_whatthesport" has tweeted the most on "Arsenal".
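The counting and sorting steps can be sketched in base R as follows. The screen names are made up and stand in for the screen_name column of the extracted tweets.

```r
# Made-up screen names standing in for the screen_name column.
screen_names <- c("fan_one", "fan_two", "fan_one", "fan_three", "fan_one", "fan_two")

# Count the number of tweets posted by each screen name.
tweet_counts <- table(screen_names)

# Sort in descending order of tweet counts and keep at most the top 6 users.
top_users <- head(sort(tweet_counts, decreasing = TRUE), n = 6)
top_users
```

With real data, the same three calls would be applied to the screen_name column of the data frame returned by search_tweets().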

12. Follower count

followers_count stores the number of followers subscribed to a Twitter account. It indicates the popularity of the account and is a measure of social media influence. Digital marketers position ads on popular Twitter accounts for increased visibility.

13. Compare follower count

We can compare the follower counts of the series "Game of Thrones", "Fleabag", and "Breaking Bad" by using the screen names of these shows. The lookup_users() function takes screen names as input and extracts user data for the corresponding Twitter accounts. From the user data, we create a data frame containing the columns screen_name and followers_count.

14. Compare follower count

We see that "Game of Thrones" is the most popular series with 8.6 million followers.
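A minimal sketch of this comparison is shown below. The lookup_users() call is left as a comment because it needs rtweet and API access, and the handles in it are placeholders. The user data, including every handle and follower count, is made up for illustration and does not reflect the real accounts.

```r
# With rtweet and API access, user data would come from something like:
#   users <- lookup_users(c("handle_one", "handle_two", "handle_three"))

# Hypothetical stand-in for the returned user data (all values are made up).
users <- data.frame(
  screen_name = c("show_one", "show_two", "show_three"),
  followers_count = c(1200L, 86000L, 5400L),
  location = c("", "", ""),
  stringsAsFactors = FALSE
)

# Keep only the columns needed for the comparison.
followers_df <- users[, c("screen_name", "followers_count")]

# Order by follower count, largest first, so the most followed account is on top.
followers_df <- followers_df[order(followers_df$followers_count, decreasing = TRUE), ]
followers_df
```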

15. Retweet counts and popular tweets

A retweet is a tweet re-shared by another user. The retweet_count column stores the number of retweets. It is useful in identifying trends and promoting brands using popular retweets.

16. Retweet counts and popular tweets

We create a data frame with the columns retweet_count and text and sort in descending order of the retweet counts using arrange().

17. Retweet counts and popular tweets

Next, we use the unique() function to exclude rows with duplicate tweets. This function takes two arguments: the data frame and the text column, which is used to identify duplicate tweets.
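The sorting and de-duplication steps can be sketched with base R stand-ins: order() in place of arrange(), and duplicated() on the text column in place of unique(). The latter is a deliberate substitution. To our understanding, base R's unique() on a plain data frame compares whole rows, so filtering on the text column makes the de-duplication key explicit. The tweets below are made up.

```r
# Made-up tweets; "Tweet B" appears twice to simulate a duplicate.
tweets_df <- data.frame(
  retweet_count = c(5L, 120L, 120L, 48L, 5L),
  text = c("Tweet A", "Tweet B", "Tweet B", "Tweet C", "Tweet D"),
  stringsAsFactors = FALSE
)

# Keep the two columns of interest.
rtwt <- tweets_df[, c("retweet_count", "text")]

# Sort in descending order of retweet counts (base R stand-in for arrange()).
rtwt_sorted <- rtwt[order(rtwt$retweet_count, decreasing = TRUE), ]

# Drop rows whose text has already appeared, keeping the first occurrence.
rtwt_unique <- rtwt_sorted[!duplicated(rtwt_sorted$text), ]
head(rtwt_unique)
```

Because the data frame is sorted first, the copy that survives for each text is the one with the highest retweet count.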

18. Retweet counts and popular tweets

The most retweeted texts have popular quotes such as "Once a Gunner, Always a Gunner" and "Never give up Gunners", indicating the loyalty of Arsenal fans. These tweets can be used for promoting Arsenal merchandise and brand loyalty.

19. Let's practice!

Let's practice exploring components of Twitter data.