Creating a Categorical column
Back to the movie dataset from the streaming startup. Most rows in original_language use the same handful of codes (en, fr, es, …). Casting to Categorical encodes each unique label as an integer behind the scenes, saving memory on repeated strings.
Bu egzersiz, kursun bir parçasıdır
Scaling and Optimizing Data Pipelines with Polars
Egzersiz talimatları
- Cast the
original_languagecolumn topl.Categorical.
Uygulamalı etkileşimli egzersiz
Bu egzersizi bu örnek kodu tamamlayarak deneyin.
movies_cat = movies.with_columns(
# Cast to Categorical to save memory
pl.col("original_language").____(pl.____)
)
result = movies_cat.select("movie_title", "original_language").head(8)
print(result)