Slicing time series
Slicing is particularly useful for time series, since filtering data to a date range is a common task. Set the date column as the index, then use .loc[] to perform the subsetting. The important thing to remember is to keep your dates in ISO 8601 format: "yyyy-mm-dd" for year-month-day, "yyyy-mm" for year-month, and "yyyy" for year.
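As a rough illustration of date slicing with .loc[], here is a minimal sketch on a small made-up DataFrame. The name temps, the avg_temp_c column, and all values are hypothetical and are not the course's temperatures data. It also assumes the date column holds real datetime values, so that partial strings such as "2011" cover the whole year.

```python
import pandas as pd

# Hypothetical data for illustration only (not the course dataset).
temps = pd.DataFrame({
    "date": pd.to_datetime([
        "2009-12-01", "2010-03-01", "2010-09-01",
        "2011-01-01", "2011-06-01", "2012-01-01",
    ]),
    "avg_temp_c": [5.1, 9.4, 18.2, 3.3, 21.0, 2.7],
})

# Move the dates into the index and sort so that slicing is well defined.
temps_ind = temps.set_index("date").sort_index()

# Year-level slice: every row from the start of 2010 to the end of 2011.
by_year = temps_ind.loc["2010":"2011"]

# Month-level slice: August 2010 through the end of February 2011.
by_month = temps_ind.loc["2010-08":"2011-02"]

# Day-level slice: both endpoints are included.
by_day = temps_ind.loc["2010-03-01":"2010-09-01"]

print(by_year)
print(by_month)
print(by_day)
```

Unlike ordinary Python slices, .loc[] slices include the end label, which is why the row dated 2010-09-01 appears in the day-level result.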
Recall from Chapter 1 that you can combine multiple Boolean conditions using logical operators, such as &. To do so in one line of code, you'll need to add parentheses () around each condition.
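The parenthesized form can be sketched as follows, again on a made-up DataFrame (temps, avg_temp_c, and the values are hypothetical). The parentheses matter because & binds more tightly than comparison operators such as >= and <=, so each comparison has to be evaluated on its own before the results are combined.

```python
import pandas as pd

# Hypothetical data for illustration only (not the course dataset).
temps = pd.DataFrame({
    "date": pd.to_datetime([
        "2009-12-01", "2010-03-01", "2010-09-01",
        "2011-01-01", "2011-06-01", "2012-01-01",
    ]),
    "avg_temp_c": [5.1, 9.4, 18.2, 3.3, 21.0, 2.7],
})

# Each condition is wrapped in parentheses, then combined with &.
# Full "yyyy-mm-dd" strings are compared against the datetime column.
in_range = temps[(temps["date"] >= "2010-01-01") & (temps["date"] <= "2011-12-31")]

print(in_range)
```

This approach works on a column without touching the index, at the cost of spelling out both endpoints as full dates.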
pandas is loaded as pd, and the temperatures DataFrame, which has only the default index, is available.
This exercise is part of the course
Data Manipulation with pandas
Exercise instructions
- Use Boolean conditions, not .isin() or .loc[], and the full date "yyyy-mm-dd", to subset temperatures for rows where the date column is in 2010 and 2011 and print the results.
- Set the index of temperatures to the date column and sort it.
- Use .loc[] to subset temperatures_ind for rows in 2010 and 2011.
- Use .loc[] to subset temperatures_ind for rows from August 2010 to February 2011.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use Boolean conditions to subset temperatures for rows in 2010 and 2011
temperatures_bool = ____[(____ >= ____) & (____ <= ____)]
print(temperatures_bool)
# Set date as the index and sort the index
temperatures_ind = temperatures.____.____
# Use .loc[] to subset temperatures_ind for rows in 2010 and 2011
print(____)
# Use .loc[] to subset temperatures_ind for rows from Aug 2010 to Feb 2011
print(____)