Get startedGet started for free

Filtering and aggregating by time period

1. Filtering and aggregating by time period

Now that we can parse datetimes, let's learn how to filter and aggregate our data.

2. Finding specific prices

When analyzing data, we often need to answer questions like

3. Finding specific prices

What was the price at noon? Or

4. Finding specific prices

What were the prices in a 3-hour window?

5. Filtering with pl.datetime

Let's find the prices at noon on July 5th. We start with a filter method.

6. Filtering with pl.datetime

We compare the time column to a datetime value using the equality operator.

7. Filtering with pl.datetime

And we use pl.datetime to specify the target datetime value.

8. Filtering with pl.datetime

We specify year, month, day, and hour as named arguments. This returns the single row where time matches noon on July 5th. The price was zero pounds - free electricity during peak solar!

9. Filtering by date with pl.date

Next, let's filter all rows from a specific day. We extract the date from our datetime column

10. Filtering by date with pl.date

And compare it to a pl.date literal.

11. Filtering by date with pl.date

This returns all 24 hours from July 5th.

12. Filtering a datetime range

To filter a range, we use the is_between expression.

13. Filtering a datetime range

Then we pass the start and end of our range.

14. Filtering a datetime range

Leaving only rows between 8 AM and 6 PM.

15. Aggregating by time period

Now let's turn to aggregation. We often need to aggregate data by time periods - for example, what's the average price over 3-hour windows? Polars has a powerful method for this called group_by_dynamic.

16. Aggregating by time period

Before using group_by_dynamic, we must sort the data by the datetime column.

17. Aggregating by time period

We chain group_by_dynamic and pass the name of the datetime column we want to group by.

18. Aggregating by time period

The every parameter specifies the window size. Here we use 6h for 6-hour windows.

19. Aggregating by 6-hour windows

Then we chain the agg method to specify what aggregations to compute.

20. Aggregating by 6-hour windows

Here we calculate the mean price for each 6-hour window, aliasing the result as avg_price.

21. Aggregating by 6-hour windows

Each row now represents a 6-hour window, with the average price during that window.

22. Aggregating by day

For daily aggregation, we change every to 1d. We can compute multiple aggregations - here we get the average price and solar power.

23. Aggregating by day

Now each row represents one day. We see the average price and solar power for each day, giving us a daily summary of key market factors.

24. Controlling window size

The group_by_dynamic method lets you control how often windows start and how long they last. The every parameter sets how often a new window starts. The period parameter controls how long each window lasts. If we don't set a period, it equals every.

25. Controlling window size

If we just set every to 2h

26. Controlling window size

Then each row is in a single window.

27. Controlling window size

But if we want to know every hour what the average price was over the next 2 hours, we set every to 1 hour and period to 2 hours.

28. Controlling window size

Then the first window starts at 10 and lasts 2 hours.

29. Controlling window size

And the second overlapping window starts at 11 and lasts 2 hours.

30. Let's practice!

Now you can filter and aggregate time series data. Let's practice these powerful techniques!

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.