Working with the streaming and GPU engines

1. Working with the streaming and GPU engines

Welcome back. The Chicago team now has efficient lazy queries and Parquet inputs. But what happens when the data gets too large for their laptops to handle?

2. Query execution engines

The team has learned that after they write a lazy query, Polars passes it through its query optimizer. What they haven't explored yet is that Polars offers different options, called engines, for how the optimized plan is executed.

3. Query execution engines

By default, Polars processes the query with its in-memory engine.

4. Query execution engines

But they can also choose to use the streaming engine that allows processing larger-than-memory datasets.

5. Query execution engines

The team also has access to some Graphical Processing Units, or GPUs, made by NVIDIA, so they could run their queries on those too. Let's see each engine in action.

6. A lazy request summary

The team starts with a lazy query over their 311 Parquet file. This query filters to completed requests, counts them by department, and sorts the result. So far, nothing has run. We have only described the work.

7. Using the default engine

Calling collect with no engine argument uses the default engine. With the default engine, all of the data must fit in memory. This works for development on their laptops, but the team is starting to struggle as the dataset keeps growing.

8. Targeting the streaming engine

To target the streaming engine, they pass engine equals streaming to collect. For this query, the result is the same. The difference is how Polars executes the work internally. Streaming is most useful when the raw data is much larger than the final result.

9. Targeting the streaming engine

The streaming engine works by breaking the dataset into batches that are processed independently before being recombined. This allows Polars to process larger-than-memory datasets because not all batches need to be in memory at once. Not every Polars method supports streaming yet. But if part of a pipeline can't stream, Polars automatically falls back to the default engine, so nothing breaks.

10. Targeting the streaming engine

If the team wants to run on the streaming engine, they don't have to specify it every time they call collect. They can just set the engine affinity config parameter at the start of their code.

11. Targeting the GPU engine

The GPU engine works the same way. Pass engine equals gpu. It does require NVIDIA GPU hardware.

12. Choosing an engine

Our advice: start with the default engine. If a query runs out of memory, switch to streaming. The Polars developers actually plan to make streaming the default in the future, so the team could adopt it now. The GPU engine can be faster, but the difference isn't always dramatic, so there's no need to rearrange pipelines just for a GPU. But what if the team doesn't want one final DataFrame and instead needs to process results in smaller pieces?

13. Working in batches

Sometimes a pipeline needs to send data somewhere else, like an API or a message queue, and doing it all in one shot isn't practical. The team begins with a lazy query for completed service requests.

14. Processing each batch

They can iterate over batches of the query by calling collect_batches.

15. Processing each batch

They set the batch size with chunk_size, here 50 thousand rows per chunk.

16. Processing each batch

Each batch arrives as a regular DataFrame. Here, they print its shape and convert it to JSON for sending over the network. The last batch is smaller because that's just the remainder. This way, the team stays within whatever memory or network limits they have.

17. Let's practice!

Now let's practice targeting different Polars execution engines.

Create Your Free Account

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.