Analyzing Twitter data

1. Analyzing Twitter data

Welcome to the course on analyzing social media data with R. I am Sowmya Vivek, a data science coach and consultant in analytics and NLP.

2. Course Overview

In this course, you will learn to extract and visualize Twitter data, analyze tweet text, perform network analysis, and view tweets on a map. We will explore tweets on celebrities, brands, hot topics, and sports. Let's start by understanding why Twitter data is worth analyzing and the pros and cons of using it.

3. Introduction to social media analysis

Social media analysis is the process of collecting data from social media websites and analyzing this data to derive insights for better business decisions.

4. About Twitter

Twitter is a popular social media platform where people communicate via short messages called tweets. It is popular for micro-blogging and has a huge amount of information available in the form of tweets and the metadata around the tweets.

5. Power of Twitter data

How powerful is Twitter data in terms of available information? Let's look at some facts and figures. With 500 million tweets sent each day

6. Power of Twitter data

and 330 million people tweeting every month, the information available for analysis is enormous.

7. Power of Twitter data

According to Twitter, 80% of its users are affluent millennials.

8. Power of Twitter data

Ads on Twitter during live events are 11% more effective at engaging audiences than TV ads.

9. Power of Twitter data

40% of users say that they have made a purchase because of influencers' tweets. All these facts provide strong motivation for using Twitter data for analysis.

10. Volume of tweets

To demonstrate the volume and velocity of tweets, we will look at a simple example. Many functions are available in R to extract tweets for analysis, and some of these will be covered in the forthcoming lessons. One such function, stream_tweets(), samples 1% of all publicly available live tweets for a 30-second window by default.

11. Volume of tweets

In this example, the live tweets extracted using stream_tweets() are saved in a data frame. Upon viewing its dimensions, we see that 1047 live tweets were extracted. This is just a 1% random sample for a 30-second window, which indicates the magnitude and velocity of tweets posted. The data frame also has 90 columns, providing rich information about each tweet.
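
The streaming call described above could be sketched as follows. This is a sketch under assumptions, not the course's exact code: it assumes the rtweet package, whose 0.7-era interface provides stream_tweets() with q and timeout arguments (later releases changed and then deprecated it), and a Twitter API token already configured. The wrapper name sample_live_tweets is hypothetical.

```r
# Hypothetical wrapper around rtweet::stream_tweets(); assumes the rtweet
# 0.7-style interface and a Twitter API token already set up on this machine.
sample_live_tweets <- function(seconds = 30) {
  if (!requireNamespace("rtweet", quietly = TRUE)) {
    stop("The rtweet package is required for streaming tweets.")
  }
  # q = "" asks for the random sample of public tweets rather than a
  # keyword-filtered stream; timeout is the window length in seconds.
  rtweet::stream_tweets(q = "", timeout = seconds)
}

# With credentials in place, the check from the example would look like:
# live_tweets <- sample_live_tweets()
# dim(live_tweets)   # number of tweets (rows) and attributes (columns)
```

The call is left commented out because it needs network access and valid credentials; defining the function alone has no side effects.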

12. Volume of tweets

We can extract live tweets using the same function and specify a time window of 60 seconds with the timeout argument. You can see that the number of live tweets extracted has more than tripled, to 3464.
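
To put the two counts in perspective, the arithmetic below converts them to tweets per second and scales the 1% sample up to a rough estimate of the full stream. It uses only figures quoted in this lesson; the variable names are illustrative, and the call shown in a comment assumes the rtweet 0.7-style timeout argument.

```r
# Counts quoted in the lesson for the two streaming windows.
tweets_30s <- 1047   # default 30-second window
tweets_60s <- 3464   # e.g. rtweet::stream_tweets(q = "", timeout = 60)

# Sampled tweets per second in each window.
rate_30s <- tweets_30s / 30
rate_60s <- tweets_60s / 60

# How much larger the 60-second extract is than the 30-second one.
growth <- tweets_60s / tweets_30s

# If the stream is a 1% sample, multiply by 100 for a rough full-stream rate.
est_full_rate_60s <- rate_60s * 100

# For comparison: 500 million tweets per day, expressed per second.
daily_rate <- 500e6 / (24 * 60 * 60)

round(c(rate_30s = rate_30s, rate_60s = rate_60s, growth = growth,
        est_full_rate_60s = est_full_rate_60s, daily_rate = daily_rate), 1)
```

The 60-second window works out to roughly 58 sampled tweets per second, or about 5,800 per second once scaled up, which is close to the rate implied by 500 million tweets a day. The 30-second window gives a noticeably lower rate, a reminder that the volume captured varies from one run to the next.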

13. Applications of Twitter data

Twitter data can be used for a wide range of applications, such as understanding current topics trending across the world,

14. Applications of Twitter data

evaluating customer opinion about a brand,

15. Applications of Twitter data

analyzing public sentiment toward a political party, leader, or event,

16. Applications of Twitter data

visualizing the reach of a movie, brand, or personality, and

17. Applications of Twitter data

detecting events like an epidemic or a protest.

18. Advantages of Twitter data

The biggest advantage of using Twitter for social media analysis is that the Twitter API is more open and accessible than those of other social media platforms. It is easier to find and follow conversations on Twitter because of the hashtag norms. Since tweets are limited in length, the text passed to any algorithm is short and bounded, which keeps processing simple and predictable.

19. Limitations of Twitter data

Let's look at the limitations of Twitter data. Twitter limits historical search for a free account. There are also limits on the number of tweets that can be extracted with a free account. The tweets extracted are a 1% sample of all tweets and so may not accurately represent the full stream. Besides, only a very small percentage of tweets are accurately tagged with a geographic location.

20. Let's practice!

Let's practice what we have learned before we dive deeper into extracting and analyzing Twitter data.