Creating an Enum column
Movies have a fixed status vocabulary like Released, Rumored, Post Production, and so on. Since the allowed values are known upfront, an Enum is a better fit than Categorical: it adds validation that catches any unknown status before it pollutes the pipeline.
movies is still available, along with a pre-defined status_enum listing every allowed value.
This exercise is part of the course
<Kurs>Scaling and Optimizing Data Pipelines with Polars</Kurs>
Exercise instructions
- Cast the status column using the pre-defined status_enum.
Hands-on interactive exercise
Try this exercise by completing this sample code.
result = movies.with_columns(
    # Cast to the enum
    pl.col("____").____(____)
).select("movie_title", "status").head(8)
print(result)