Get startedGet started for free

Creating an Object dtype

For an ad-hoc text exploration, the team wants each movie's title as a Python set of lowercase words. A set isn't a native Polars dtype, so store it in a column of dtype pl.Object.

The DataFrame movies is available, and the helper title_to_word_set(title) is already defined and returns a Python set of lowercase words from a title.

This exercise is part of the course

Scaling and Optimizing Data Pipelines with Polars

View Course

Exercise instructions

  • Run title_to_word_set on each movie_title and store the returned Python sets in a new column.
  • Alias the new column as title_word_set.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

result = movies.with_columns(
    pl.col("movie_title")
    # Convert each title to a set of words
    .____(title_to_word_set, return_dtype=pl.____)
    .alias("____")
).select("movie_title", "title_word_set").head(8)
print(result)
Edit and Run Code