Extracting a primary genre
You've joined the data team at a movie-streaming startup. They store film metadata in a Parquet file where each movie's genres value is a list of strings, since most films span several genres. Pull the first listed genre out of each row into a clean top-level primary_genre column.
polars is loaded as pl, and the DataFrame movies is pre-loaded from the Parquet file.
This exercise is part of the course
Scaling and Optimizing Data Pipelines with Polars
Exercise instructions
- Extract the first element from each row's genres list using a list expression.
- Alias the new column as primary_genre.
Hands-on interactive exercise
Try this exercise by completing this sample code.
result = movies.select(
    "movie_title",
    "genres",
    # Get the first genre
    pl.col("genres").list.____(0).alias("____"),
).head(8)
print(result)