List comprehensions for time-stamped data
You will now use what you've learned in this chapter to solve a simple data extraction problem. This exercise also introduces a data structure, the pandas Series. We won't elaborate on it much here, but you will work with it often when analyzing data from pandas DataFrames. You can think of each DataFrame column as a one-dimensional array called a Series.
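As a rough illustration of the point above, selecting one column from a DataFrame gives back a Series. The DataFrame toy and its values are made up for this sketch and are not the contents of 'tweets.csv'.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
toy = pd.DataFrame({
    "created_at": [
        "Tue Mar 29 23:40:17 +0000 2016",
        "Wed Mar 30 01:02:03 +0000 2016",
    ],
    "lang": ["en", "en"],
})

# Selecting a single column with square brackets returns a Series
column = toy["created_at"]

print(type(column).__name__)
print(column.name)
```

Iterating over a Series visits its values one at a time, which is why it can be used as the source of a list comprehension.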
In this exercise, you will use a list comprehension to extract the time from time-stamped Twitter data. The pandas package has been imported as pd, and the file 'tweets.csv' has been imported as the df DataFrame for your use.
This exercise is part of the course
Python Toolbox
Exercise instructions
- Extract the column 'created_at' from df and assign the result to tweet_time. Fun fact: the extracted column in tweet_time here is a Series data structure!
- Create a list comprehension that extracts the time from each row in tweet_time. Each row is a string that represents a timestamp, and you will access the 12th to 19th characters in the string to extract the time. Use entry as the iterator variable and assign the result to tweet_clock_time. Remember that Python uses 0-based indexing, so the 12th character is at index 11!
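The slicing idea in the instructions can be sketched on a plain Python list. The timestamps below are hypothetical and assume a layout like "Tue Mar 29 23:40:17 +0000 2016"; the actual strings in 'tweets.csv' may differ, so check one entry before relying on these positions.

```python
# Hypothetical timestamps, assumed to follow the layout
# "Day Mon DD HH:MM:SS +0000 YYYY"
sample_stamps = [
    "Tue Mar 29 23:40:17 +0000 2016",
    "Wed Mar 30 01:02:03 +0000 2016",
]

# Characters 12 through 19 (counting from 1) sit at indices 11 through 18.
# A slice excludes its stop index, so the stop value is 19.
sample_clock_times = [entry[11:19] for entry in sample_stamps]

print(sample_clock_times)  # ['23:40:17', '01:02:03']
```

The same comprehension pattern works when the iterable is a Series of strings instead of a list.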
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Extract the created_at column from df: tweet_time
tweet_time = ____
# Extract the clock time: tweet_clock_time
tweet_clock_time = [____]
# Print the extracted times
print(tweet_clock_time)