Creating a time series DataFrame
Time series data is used when we want to analyze or explore variation over time. This is useful when exploring Twitter text data if we want to track the prevalence of a word or set of words.
The first step is to convert the DataFrame into a format that pandas time series methods can handle. That can be done by converting the created_at column to a datetime type and then setting that column as the index.
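As a rough illustration of that conversion, here is a minimal sketch on a tiny hand-made DataFrame. The three rows, the text column, and the timestamp layout are made up for illustration and are not the course dataset; the explicit format string is an assumption about how the created_at strings look and can be dropped or adjusted if the real data differs.

```python
import pandas as pd

# Hypothetical stand-in for the course's ds_tweets DataFrame (made-up rows).
ds_tweets = pd.DataFrame({
    "created_at": [
        "Wed Oct 10 20:19:24 +0000 2018",
        "Wed Oct 10 20:21:03 +0000 2018",
        "Thu Oct 11 07:45:10 +0000 2018",
    ],
    "text": ["first tweet", "second tweet", "third tweet"],
})

# Before conversion the column holds plain strings.
print(ds_tweets["created_at"].head())

# Parse the strings into timezone-aware datetimes.
# The format string is an assumption about the timestamp layout.
ds_tweets["created_at"] = pd.to_datetime(
    ds_tweets["created_at"],
    format="%a %b %d %H:%M:%S %z %Y",
    utc=True,
)
print(ds_tweets["created_at"].head())

# Use the parsed column as the index, which gives a DatetimeIndex.
ds_tweets = ds_tweets.set_index("created_at")
print(type(ds_tweets.index).__name__)
```

Passing utc=True keeps every timestamp in a single timezone, which avoids mixed-offset surprises later on.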
This exercise is part of the course
Analyzing Social Media Data in Python
Exercise instructions
- Print the first five rows of created_at in ds_tweets with the .head() method.
- Convert that column to a datetime type with the pandas pd.to_datetime() function.
- Print the first five rows once again.
- Set the index to created_at with .set_index().
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Print created_at to see the original format of datetime in Twitter data
print(ds_tweets[____].____())
# Convert the created_at column to a pandas datetime type
ds_tweets[____] = pd.____(____)
# Print created_at to see new format
print(ds_tweets[____].____())
# Set the index of ds_tweets to created_at
ds_tweets = ds_tweets.____(____)
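Once the index is a datetime type, tracking how often a word appears over time becomes a matter of resampling. The sketch below uses a small made-up DataFrame (the name tweets, its rows, and the mentions_python column are all hypothetical) and daily resampling as one possible way to compute the share of tweets that mention a word.

```python
import pandas as pd

# Hypothetical tweets that already have a datetime index (made-up data).
tweets = pd.DataFrame(
    {"text": ["python is fun", "learning rstats", "more python", "python again"]},
    index=pd.to_datetime(
        ["2018-10-10 08:00", "2018-10-10 21:30", "2018-10-11 09:15", "2018-10-11 17:45"],
        utc=True,
    ),
)

# Flag tweets that contain the word, as 0/1 so the mean is a proportion.
tweets["mentions_python"] = (
    tweets["text"].str.contains("python", case=False).astype(int)
)

# Average the flag per calendar day to get the daily share of matching tweets.
daily_share = tweets["mentions_python"].resample("D").mean()
print(daily_share)
```

Resampling only works on a datetime-like index, which is why the conversion and set_index() steps come first.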