Inspecting numeric range
Back to the movie dataset from the streaming startup. Before you downcast vote_count and budget to smaller dtypes, check their maximum values and compare them against the upper bounds of the smaller integer types. If a column's maximum is at or below a type's upper bound, the values fit at the top of that type's range, which tells you whether 32-bit (or 16-bit) is safe.
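The comparison itself is plain arithmetic, so it can be sketched without any dataframe library. The snippet below is a minimal illustration, assuming two's-complement signed integers; the helper names signed_bounds and fits, and the example extremes, are hypothetical and not taken from the course or the dataset.

```python
def signed_bounds(bits: int) -> tuple[int, int]:
    """Return (lower, upper) for a two's-complement signed integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(observed_min: int, observed_max: int, bits: int) -> bool:
    """Check whether every value between the observed extremes fits in the signed type."""
    lower, upper = signed_bounds(bits)
    return lower <= observed_min and observed_max <= upper


# Hypothetical column extremes, for illustration only (not real dataset values).
example_min, example_max = 0, 380_000_000

for bits in (16, 32):
    lower, upper = signed_bounds(bits)
    print(f"Int{bits}: upper bound {upper}, fits: {fits(example_min, example_max, bits)}")
```

With these made-up extremes, the 16-bit type is too small while the 32-bit type has room. The same check applies to the lower bound when a column can hold negative values.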
This exercise is part of the course
Scaling and Optimizing Data Pipelines with Polars
Exercise instructions
- Compute the max of vote_count and budget.
- Show the upper bound of an Int32 and an Int16 version of budget.
Hands-on interactive exercise
Try this exercise by completing this sample code.
result = movies.select(
    # Largest vote_count
    pl.col("vote_count").____().alias("vote_count_max"),
    # Largest budget
    pl.col("budget").____().alias("budget_max"),
    # Upper bounds of smaller dtypes
    pl.col("budget").cast(pl.Int32).____().alias("budget_int32_upper"),
    pl.col("budget").cast(pl.Int16).____().alias("budget_int16_upper"),
)
print(result)