Filtering and aggregating by time period
1. Filtering and aggregating by time period
Now that we can parse datetimes, let's learn how to filter and aggregate our data.2. Finding specific prices
When analyzing data, we often need to answer questions like3. Finding specific prices
What was the price at noon? Or4. Finding specific prices
What were the prices in a 3-hour window?5. Filtering with pl.datetime
Let's find the prices at noon on July 5th. We start with a filter method.6. Filtering with pl.datetime
We compare the time column to a datetime value using the equality operator.7. Filtering with pl.datetime
And we use pl.datetime to specify the target datetime value.8. Filtering with pl.datetime
We specify year, month, day, and hour as named arguments. This returns the single row where time matches noon on July 5th. The price was zero pounds - free electricity during peak solar!9. Filtering by date with pl.date
Next, let's filter all rows from a specific day. We extract the date from our datetime column10. Filtering by date with pl.date
And compare it to a pl.date literal.11. Filtering by date with pl.date
This returns all 24 hours from July 5th.12. Filtering a datetime range
To filter a range, we use the is_between expression.13. Filtering a datetime range
Then we pass the start and end of our range.14. Filtering a datetime range
Leaving only rows between 8 AM and 6 PM.15. Aggregating by time period
Now let's turn to aggregation. We often need to aggregate data by time periods - for example, what's the average price over 3-hour windows? Polars has a powerful method for this called group_by_dynamic.16. Aggregating by time period
Before using group_by_dynamic, we must sort the data by the datetime column.17. Aggregating by time period
We chain group_by_dynamic and pass the name of the datetime column we want to group by.18. Aggregating by time period
The every parameter specifies the window size. Here we use 6h for 6-hour windows.19. Aggregating by 6-hour windows
Then we chain the agg method to specify what aggregations to compute.20. Aggregating by 6-hour windows
Here we calculate the mean price for each 6-hour window, aliasing the result as avg_price.21. Aggregating by 6-hour windows
Each row now represents a 6-hour window, with the average price during that window.22. Aggregating by day
For daily aggregation, we change every to 1d. We can compute multiple aggregations - here we get the average price and solar power.23. Aggregating by day
Now each row represents one day. We see the average price and solar power for each day, giving us a daily summary of key market factors.24. Controlling window size
The group_by_dynamic method lets you control how often windows start and how long they last. The every parameter sets how often a new window starts. The period parameter controls how long each window lasts. If we don't set a period, it equals every.25. Controlling window size
If we just set every to 2h26. Controlling window size
Then each row is in a single window.27. Controlling window size
But if we want to know every hour what the average price was over the next 2 hours, we set every to 1 hour and period to 2 hours.28. Controlling window size
Then the first window starts at 10 and lasts 2 hours.29. Controlling window size
And the second overlapping window starts at 11 and lasts 2 hours.30. Let's practice!
Now you can filter and aggregate time series data. Let's practice these powerful techniques!Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.