Fast-path filter on sorted data
The historical analysis team needs every checkout record from before 2021. The CSV is already sorted by date, so if you tell Polars about that, it can stop scanning as soon as the first 2021 row appears.
This exercise is part of the course
<Kurs>Scaling and Optimizing Data Pipelines with Polars</Kurs>
Exercise instructions
- Mark the "date" column as sorted so Polars can use a fast-path scan.
- Filter to rows where "date" is before January 1, 2021.
- Execute the lazy query.
Hands-on interactive exercise
Try this exercise by completing the sample code below.
result = (
    library
    # Mark date as sorted to enable the fast-path scan
    .____("date")
    # Filter to rows before 2021-01-01
    .filter(pl.col("date") < pl.____(2021, 1, 1))
    # Execute the lazy query
    .____()
)
print(result.head())