Downcasting numeric columns
Now that the ranges look safe, cast the numeric columns to smaller dtypes. Use Int32 for the integer columns and Float32 for the floats where lower precision is still good enough for summary stats.
The movies DataFrame is preloaded for you.
This exercise is part of the course
<Kurs>Scaling and Optimizing Data Pipelines with Polars</Kurs>
Exercise instructions
- Cast vote_count and budget to pl.Int32.
- Cast runtime and vote_average to pl.Float32.
Interactive hands-on exercise
Try this exercise by completing this sample code.
movies_optimized = movies.with_columns(
    # Integer columns to Int32
    pl.col("vote_count").cast(pl.____),
    pl.col("budget").cast(pl.____),
    # Float columns to Float32
    pl.col("runtime").cast(pl.____),
    pl.col("vote_average").cast(pl.____),
)

result = movies_optimized.select(
    "movie_title", "budget", "runtime", "vote_average", "vote_count"
).head(8)
print(result)