Inspecting numeric range
Back to the movie dataset from the streaming startup. Before you downcast vote_count and budget to smaller dtypes, check their maximum values and compare them against the upper bounds of smaller integer types. If a column's maximum is at or below the upper bound of a type, downcasting to that width (32-bit or 16-bit) will not overflow.
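The comparison itself is simple arithmetic, and a minimal pure-Python sketch may make it concrete. It assumes signed two's-complement integers, where an n-bit type tops out at 2^(n-1) - 1, and it assumes the column holds no negative values, so only the upper bound matters. The helper names and the example maxima are hypothetical and are not taken from the movies data.

```python
def signed_int_upper_bound(bits: int) -> int:
    """Largest value a signed two's-complement integer of this width can hold."""
    return 2 ** (bits - 1) - 1


def smallest_safe_width(max_value: int, widths=(8, 16, 32, 64)) -> int:
    """Return the narrowest signed width whose upper bound covers max_value.

    Assumes the column has no negative values, so the lower bound is ignored.
    """
    for bits in widths:
        if max_value <= signed_int_upper_bound(bits):
            return bits
    raise ValueError(f"{max_value} does not fit in any of the widths {widths}")


# Hypothetical maxima, made up for illustration only.
example_maxima = {"vote_count": 30_000, "budget": 400_000_000}

for column, max_value in example_maxima.items():
    bits = smallest_safe_width(max_value)
    bound = signed_int_upper_bound(bits)
    print(f"{column}: max={max_value:,} fits in Int{bits} (upper bound {bound:,})")
```

With these made-up numbers, the first column would fit in 16 bits while the second would need 32, which is the kind of decision the exercise asks you to read off the real maxima.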
This exercise is part of the course
Scaling and Optimizing Data Pipelines with Polars
Exercise instructions
- Compute the max of vote_count and budget.
- Show the upper bound of an Int32 and an Int16 version of budget.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
result = movies.select(
    # Largest vote_count
    pl.col("vote_count").____().alias("vote_count_max"),
    # Largest budget
    pl.col("budget").____().alias("budget_max"),
    # Upper bounds of smaller dtypes
    pl.col("budget").cast(pl.Int32).____().alias("budget_int32_upper"),
    pl.col("budget").cast(pl.Int16).____().alias("budget_int16_upper"),
)
print(result)