Exercise

Extracting an advanced regular expression

In this exercise, you will build on the previous exercises by creating a more advanced regular expression that captures the movie title, the distributor's company name, and the number of screens from each line of the screens_per_movie data frame.

Every line of screens_per_movie contains these three pieces of information. Using extract, you will pull them out into three new columns, so that exactly the information you want ends up in a structured, tabular form. This step is key to making sense of unstructured data: it puts the data into a shape that you can later analyze and visualize.
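The idea can be sketched as follows, assuming that extract refers to tidyr's extract() function. The sample rows, their "(Distributor: ..., Number of Screens: ...)" layout, and the result name movies are made up for illustration; the real screens_per_movie lines may be formatted differently, so the pattern would need to be adapted.

```r
library(tidyr)

# Hypothetical sample rows. The real screens_per_movie data may use a different layout.
screens_per_movie <- data.frame(
  line = c(
    "Example Movie One (Distributor: Studio A, Number of Screens: 1200)",
    "Example Movie Two (Distributor: Studio B Pictures, Number of Screens: 850)"
  ),
  stringsAsFactors = FALSE
)

# Each pair of unescaped parentheses is a capture group, and each capture
# group fills one of the columns named in `into`, in order.
# The escaped parentheses \\( and \\) match literal brackets in the text.
movies <- tidyr::extract(
  screens_per_movie,
  col = line,
  into = c("title", "distributor", "screens"),
  regex = "(.*) \\(Distributor: (.*), Number of Screens: (\\d+)\\)",
  convert = TRUE  # convert the captured digits to an integer column
)

print(movies)
```

By default, extract() drops the original line column and keeps only the new ones. Setting convert = TRUE is a convenience here so that the screen counts can be used as numbers right away.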

Instructions 1/3

  • First, inspect the first three rows of screens_per_movie and familiarize yourself with the structure of the data.