Extracting numbers from strings

The length_of_time field in the UFO dataset is a text field that contains the number of minutes somewhere within the string. In this exercise, you'll extract that number from the text using regular expressions.
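As a rough illustration of the idea, the sketch below pulls the first run of digits out of a few strings with re.search(). The sample strings and the helper name first_number() are made up for this example and are not taken from the UFO dataset or the exercise.

```python
import re

def first_number(text):
    """Return the first run of digits in text as an int, or None if there is none.

    Hypothetical helper for illustration only.
    """
    # \d+ matches one or more consecutive digits; re.search scans the whole
    # string, unlike re.match, which only looks at the start.
    match = re.search(r"\d+", text)
    if match is not None:
        return int(match.group(0))  # group(0) is the full matched text
    return None

# Made-up sample values; the real length_of_time entries may look different.
for sample in ["about 5 minutes", "2 minutes", "10-15 min", "unknown"]:
    print(repr(sample), "->", first_number(sample))
```

Note that a range such as "10-15 min" yields only the first number, and a string with no digits yields None, so both cases may need separate handling depending on the data.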

This exercise is part of the course

Preprocessing for Machine Learning in Python

Exercise instructions

  • Search time_string for numbers using an appropriate RegEx pattern.
  • Use the .apply() method to call return_minutes() on every row of the length_of_time column.
  • Print out the .head() of both the length_of_time and minutes columns to compare.
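The steps above can be sketched on a toy DataFrame. This is one possible approach under stated assumptions, not the exercise's official solution: the DataFrame toy, its values, and the function name extract_minutes() are invented here as stand-ins for ufo and return_minutes().

```python
import re
import pandas as pd

def extract_minutes(time_string):
    # Hypothetical stand-in for the exercise's return_minutes()
    num = re.search(r"\d+", time_string)
    if num is not None:
        return int(num.group(0))
    return None  # no digits found

# Toy data, not the real UFO dataset
toy = pd.DataFrame({"length_of_time": ["about 5 minutes", "2 minutes", "unknown"]})

# .apply() calls the function once per value in the column
toy["minutes"] = toy["length_of_time"].apply(extract_minutes)

# Select both columns with a list of names to compare them side by side
print(toy[["length_of_time", "minutes"]].head())
```

Rows where no number is found come back as missing values in the new column, which is worth checking before using it in a model.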

Hands-on interactive exercise

Try this exercise by completing this sample code.

def return_minutes(time_string):

    # Search for numbers in time_string
    num = re.____(____, ____)
    if num is not None:
        return int(num.group(0))
        
# Apply the extraction to the length_of_time column
ufo["minutes"] = ufo["length_of_time"].____

# Take a look at the head of both of the columns
print(ufo[[____]].head())