Finding keywords
Counting known keywords is one of the first ways you can analyze text data in a Twitter dataset. In this exercise, you're going to count the number of times specific hashtags occur in a collection of tweets about data science, using the string methods of the pandas Series object.
pandas and numpy have been imported as pd and np, respectively. A more fully featured flatten_tweets() function and data_science_json have also been loaded for you.
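As a minimal sketch of the Series string methods mentioned above, the example below checks a handful of texts for a hashtag while ignoring case. The sample texts are invented for illustration and are not taken from the course data.

```python
import pandas as pd

# Hypothetical sample texts, invented for illustration
texts = pd.Series([
    "Learning #Python for data science",
    "I love #rstats",
    "#python and #pandas make a good pair",
    "No hashtags here",
])

# case=False makes the match case-insensitive, so '#Python' also counts
mentions = texts.str.contains("#python", case=False)

print(mentions.tolist())
```

The result is a boolean Series with one entry per text, which is why it can later be summed to count matches.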
This exercise is part of the course Analyzing Social Media Data in Python.
Exercise instructions
- Flatten the tweets with flatten_tweets() and store them in flat_tweets.
- Convert the flattened tweets to a DataFrame using the pandas DataFrame constructor.
- Find mentions of #python in 'text', ignoring case.
- Print the proportion of tweets mentioning #python by summing python with np.sum() and dividing it by the total number of tweets.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Flatten the tweets and store them
____ = ____(____)
# Convert to DataFrame
ds_tweets = ____.____(____)
# Find mentions of #python in 'text'
python = ____[____].____.____(____, ____)
# Print proportion of tweets mentioning #python
print("Proportion of #python tweets:", ____ / ____)
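The steps in the template can be sketched end to end on toy data. This is only an illustration under stated assumptions: flatten_tweets_sketch is a hypothetical, much simplified stand-in for the course's flatten_tweets() (whose real input format and output fields are not shown here), and sample_tweets is invented data standing in for data_science_json.

```python
import numpy as np
import pandas as pd


def flatten_tweets_sketch(tweets):
    """Hypothetical stand-in for the course's flatten_tweets().

    Assumes `tweets` is a list of already-parsed tweet dicts and copies
    one nested field up to the top level.
    """
    flat = []
    for tweet in tweets:
        flat.append({
            "text": tweet["text"],
            "user-screen_name": tweet["user"]["screen_name"],
        })
    return flat


# Invented sample data, not the course's data_science_json
sample_tweets = [
    {"text": "Intro to #Python for analysts", "user": {"screen_name": "a"}},
    {"text": "Plotting with #rstats", "user": {"screen_name": "b"}},
    {"text": "#python tips and tricks", "user": {"screen_name": "c"}},
    {"text": "Data science reading list", "user": {"screen_name": "d"}},
]

# Flatten the tweets and store them
flat_tweets = flatten_tweets_sketch(sample_tweets)

# Convert to DataFrame
ds_tweets = pd.DataFrame(flat_tweets)

# Find mentions of #python in 'text', ignoring case
python = ds_tweets["text"].str.contains("#python", case=False)

# Proportion = number of True values divided by the number of rows
proportion = np.sum(python) / ds_tweets.shape[0]
print("Proportion of #python tweets:", proportion)
```

Summing a boolean Series counts its True values, so dividing by the row count gives the share of tweets that mention the hashtag.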