Loading tweets into a DataFrame
Now it's time to import data into a pandas DataFrame so we can analyze tweets at scale.
We will work with a dataset of tweets which contain the hashtag '#rstats' or '#python'. This dataset is stored as a list of tweet JSON objects in data_science_json.
This course touches on a lot of concepts you may have forgotten, so if you ever need a quick refresher, download the pandas basics Cheat Sheet and keep it handy!
Be aware that this is real data from Twitter, so it may contain profanity or other offensive content (in this exercise and in any later exercises that also use real Twitter data).
This exercise is part of the course
<cours>Analyzing Social Media Data in Python</cours>
Exercise instructions
- Import pandas (remember, by convention we'll alias it as pd).
- Flatten the data_science_json tweets with flatten_tweets() and store them in tweets.
- Create a DataFrame from tweets using pd.DataFrame().
- Print out the text from the first 5 tweets.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Import pandas
import ____ as ____
# Flatten the tweets and store in `tweets`
tweets = ____(____)
# Create a DataFrame from `tweets`
ds_tweets = ____(____)
# Print out the first 5 tweets from this dataset
print(ds_tweets[____].values[0:5])
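To see how the steps fit together outside the course environment, here is a minimal sketch on made-up data. It assumes each element of the list is a JSON string and that the tweet text lives in a "text" field. Both sample_json and flatten_tweets_sketch() are hypothetical stand-ins for illustration; the course's own flatten_tweets() may handle more fields and behave differently.

```python
import json

import pandas as pd

# Made-up stand-in for data_science_json: a list of tweet JSON strings (assumption).
sample_json = [
    json.dumps({"text": "Learning #python today", "user": {"screen_name": "alice"}}),
    json.dumps({"text": "Plotting in #rstats", "user": {"screen_name": "bob"}}),
]


def flatten_tweets_sketch(tweets_json):
    """Hypothetical stand-in for the course's flatten_tweets().

    Parses each JSON string and copies one nested field to the top level.
    """
    flattened = []
    for raw in tweets_json:
        tweet = json.loads(raw)
        # Lift the nested screen name so it becomes its own DataFrame column.
        tweet["user-screen_name"] = tweet["user"]["screen_name"]
        flattened.append(tweet)
    return flattened


# Flatten the tweets, then build a DataFrame from the list of dicts.
tweets = flatten_tweets_sketch(sample_json)
ds_tweets = pd.DataFrame(tweets)

# Select the text column and print up to the first 5 values.
print(ds_tweets["text"].values[0:5])
```

Each dictionary in the list becomes one row and each top-level key becomes one column, which is why nested fields are worth lifting to the top level before building the DataFrame.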