Extracting string patterns

The Length column in the hiking dataset holds strings, but each string contains the mileage for the hike. We're going to extract this mileage with a regular expression, and then use a lambda with pandas' apply() to run the extraction over the column.
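To make the extraction step concrete, here is a minimal sketch of searching a string for a decimal number with re.search(). The sample strings and the pattern are assumptions for illustration only; the actual values in hiking["Length"] may look different.

```python
import re

# Hypothetical sample strings; the real values in hiking["Length"] may differ.
samples = ["0.8 miles", "1.25 miles", "unknown"]

# Assumed pattern: one or more digits, a literal dot, then one or more digits.
pattern = r"\d+\.\d+"

results = []
for text in samples:
    match = re.search(pattern, text)  # a Match object, or None if nothing matches
    # group(0) returns the full matched text, which float() can then convert
    value = float(match.group(0)) if match is not None else None
    results.append(value)

print(results)  # [0.8, 1.25, None]
```

Note that this pattern only matches numbers that contain a decimal point, so a whole number such as "3 miles" would not match under this assumption.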

This exercise is part of the course

Preprocessing for Machine Learning in Python

Exercise instructions

  • Search the text in the length argument for numbers and decimals using an appropriate pattern.
  • Extract the matched pattern and convert it to a float.
  • Apply the return_mileage() function to each row in the hiking["Length"] column.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

import re

# Write a pattern to extract numbers and decimals
def return_mileage(length):
    
    # Search the text for matches
    mile = re.____(____, ____)
    
    # If a value is returned, use group(0) to return the found value
    if mile is not None:
        return float(____)
        
# Apply the function to the Length column and take a look at both columns
hiking["Length_num"] = ____.apply(____)
print(hiking[["Length", "Length_num"]].head())
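For reference, one way the completed scaffold might look is sketched below. The toy DataFrame stands in for the real hiking dataset and its values are made up; the pattern is likewise an assumption rather than the only valid choice.

```python
import re
import pandas as pd

def return_mileage(length):
    # Search the text for digits, a literal dot, and more digits (assumed pattern)
    mile = re.search(r"\d+\.\d+", length)

    # If a match was found, group(0) holds the matched text
    if mile is not None:
        return float(mile.group(0))
    return None

# Toy stand-in for the hiking dataset; these values are illustrative only
hiking = pd.DataFrame({"Length": ["0.8 miles", "1.0 mile", "2.5 miles"]})

# Apply the function to each value in the Length column via a lambda
hiking["Length_num"] = hiking["Length"].apply(lambda row: return_mileage(row))
print(hiking[["Length", "Length_num"]].head())
```

If the real column contains missing values stored as NaN, re.search() would raise a TypeError on them, so a type check before the search may be needed in practice.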