Extracting numbers from strings

The length_of_time field in the UFO dataset is a text field that contains the number of minutes as part of a longer string. Here, you'll extract that number from the text using regular expressions.
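As a rough illustration of the idea, the sketch below pulls the first run of digits out of a few strings using only the standard-library re module. The example strings and the names examples and results are made up for illustration and are not taken from the UFO dataset.

```python
import re

# Hypothetical example strings, not taken from the UFO dataset.
examples = ["about 5 minutes", "10 minutes", "a few minutes"]

results = []
for text in examples:
    # \d+ matches one or more consecutive digits. re.search returns the
    # first match found anywhere in the string, or None if nothing matches.
    match = re.search(r"\d+", text)
    number = int(match.group(0)) if match is not None else None
    results.append(number)
    print(repr(text), "->", number)
```

The first two strings yield 5 and 10, while the last contains no digits, so re.search returns None and no number is extracted. This is why the result should be checked for None before calling .group().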

This exercise is part of the course

Preprocessing for Machine Learning in Python

Exercise instructions

  • Search time_string for numbers using an appropriate RegEx pattern.
  • Use the .apply() method to call return_minutes() on every row of the length_of_time column.
  • Print out the .head() of both the length_of_time and minutes columns to compare.
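The steps above can be sketched on a small stand-in DataFrame. This is only an illustration under stated assumptions: the toy data, the sample variable, and the first_number helper are hypothetical, and the real ufo DataFrame is assumed to be loaded elsewhere.

```python
import re

import pandas as pd


def first_number(text):
    """Return the first run of digits in text as an int, or None if there is none."""
    match = re.search(r"\d+", text)
    if match is not None:
        return int(match.group(0))
    return None


# Hypothetical stand-in for the UFO data; only the column name is borrowed.
sample = pd.DataFrame(
    {"length_of_time": ["about 5 minutes", "10 minutes", "a few minutes"]}
)

# .apply() calls first_number once per value in the column.
sample["minutes"] = sample["length_of_time"].apply(first_number)

# Selecting with a list of column names shows both columns side by side.
print(sample[["length_of_time", "minutes"]].head())
```

Rows whose text contains no digits end up with a missing value in the new column, which is easy to spot when the two columns are printed together.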

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def return_minutes(time_string):

    # Search for numbers in time_string
    num = re.____(____, ____)
    if num is not None:
        return int(num.group(0))
        
# Apply the extraction to the length_of_time column
ufo["minutes"] = ufo["length_of_time"].____

# Take a look at the head of both of the columns
print(ufo[[____]].head())