CommencerCommencez gratuitement

Downcasting numeric columns

Now that the ranges look safe, cast the numeric columns to smaller dtypes. Use Int32 for the integer columns and Float32 for the floats where lower precision is still good enough for summary stats.

The movies DataFrame is preloaded for you.

Cet exercice fait partie du cours

<cours>Scaling and Optimizing Data Pipelines with Polars</cours>
Voir le cours

Instructions de l’exercice

  • Cast vote_count and budget to pl.Int32.
  • Cast runtime and vote_average to pl.Float32.

Exercice interactif pratique

Essayez cet exercice en complétant ce code d’exemple.

movies_optimized = movies.with_columns(
    # Integer columns to Int32
    pl.col("vote_count").cast(pl.____),
    pl.col("budget").cast(pl.____),
    # Float columns to Float32
    pl.col("runtime").cast(pl.____),
    pl.col("vote_average").cast(pl.____),
)

result = movies_optimized.select(
    "movie_title", "budget", "runtime", "vote_average", "vote_count"
).head(8)
print(result)
Modifier et exécuter le code