A little bit of Twitter text analysis
Now that you have your DataFrame of tweets set up, you're going to do a bit of text analysis to count how many tweets contain the words 'clinton', 'trump', 'sanders' and 'cruz'. In the pre-exercise code, we have defined the following function word_in_text(), which tells you whether the first argument (a word) occurs within the second argument (a tweet).
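import re

def word_in_text(word, tweet):
    # Lowercase both arguments so the match is case-insensitive
    word = word.lower()
    text = tweet.lower()
    # Search the lowercased text, not the original tweet
    match = re.search(word, text)
    if match:
        return True
    return False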
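As a quick check of how the function behaves, here is a minimal sketch that calls word_in_text() on three made-up tweets (the strings are illustrative and not taken from the course data). It repeats the function definition, searching the lowercased text, so that it runs on its own. Because re.search() looks for the pattern anywhere in the string, a keyword also matches when it appears inside a longer word.

```python
import re

def word_in_text(word, tweet):
    """Return True if word occurs anywhere in tweet, ignoring case."""
    word = word.lower()
    text = tweet.lower()
    match = re.search(word, text)
    if match:
        return True
    return False

# Made-up example tweets, for illustration only
print(word_in_text('clinton', 'Watching the CLINTON rally tonight'))  # True
print(word_in_text('cruz', 'Taking a cruise this weekend'))           # False
print(word_in_text('trump', 'That card trumps the others'))           # True (substring match)
```

If whole-word matching is what you need, one option is to wrap the keyword in word boundaries, for example r'\b' + word + r'\b', but that is a variation and not what the exercise uses.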
You're going to iterate over the rows of the DataFrame and calculate how many tweets contain each of our keywords!
This exercise is part of the course
Importing Data in Python
Exercise instructions
- Initialize the list [clinton, trump, sanders, cruz] so that all values are 0.
- Within the for loop for index, row in df.iterrows():, the code currently increases the value of clinton by 1 each time a tweet mentioning 'Clinton' is encountered; complete the code so that the same happens for trump, sanders and cruz.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Initialize list to store tweet counts
[clinton, trump, sanders, cruz] = ____
# Iterate through df, counting the number of tweets in which
# each candidate is mentioned
for index, row in df.iterrows():
    clinton += word_in_text('clinton', row['text'])
    trump += word_in_text(____, ____)
    sanders += word_in_text(____, ____)
    cruz += word_in_text(____, ____)
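The += lines work because word_in_text() returns a Boolean, and Python treats True as 1 and False as 0 in arithmetic, so a counter grows by one whenever its keyword is found. The sketch below illustrates that counting pattern on a small made-up DataFrame. The tweets and the keywords 'coffee' and 'tea' are hypothetical stand-ins chosen for illustration; they are not the course data.

```python
import re
import pandas as pd

def word_in_text(word, tweet):
    """Return True if word occurs anywhere in tweet, ignoring case."""
    word = word.lower()
    text = tweet.lower()
    return bool(re.search(word, text))

# Hypothetical stand-in for the tweets DataFrame
df = pd.DataFrame({'text': [
    'Coffee first, then code',
    'Tea or coffee?',
    'Green tea today',
    'Iced tea at noon',
]})

# Start both counters at 0 in a single statement
[coffee, tea] = [0, 0]

# Adding a Boolean to an int adds 1 for True and 0 for False
for index, row in df.iterrows():
    coffee += word_in_text('coffee', row['text'])
    tea += word_in_text('tea', row['text'])

print(coffee, tea)  # 2 3
```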