Applying advanced transformations to DataFrames
pandas has a plethora of built-in transformation tools, but sometimes a transformation needs more advanced logic. The apply function lets you apply a user-defined function to each row or column of a DataFrame, opening the door to advanced transformation and feature generation.
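As a rough illustration of the row-versus-column behavior described above, the sketch below calls apply on a small, invented DataFrame. The column names and values are made up for this example and are not part of the exercise data.

```python
import pandas as pd

# Toy data, invented for illustration only.
scores = pd.DataFrame({"math": [70, 85, 90], "reading": [80, 75, 95]})

# axis=0 (the default): the function receives each column as a Series.
column_ranges = scores.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row as a Series.
row_averages = scores.apply(
    lambda row: (row["math"] + row["reading"]) / 2, axis=1
)

print(column_ranges)
print(row_averages)
```

With axis=1, the function can read several columns of the same row at once, which is what makes it useful for deriving a new feature from existing fields.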
The find_street_name() function parses the street name from the "street_address" column, dropping the street number from the string. This function has been loaded into memory and is ready to be applied to the raw_testing_scores DataFrame.
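The body of find_street_name() is not shown in this exercise. The following is only a guess at what such a function might look like, assuming it receives one row at a time and that addresses begin with a numeric street number. The regular expression and the sample row are assumptions made for illustration.

```python
import re

import pandas as pd


def find_street_name(row):
    """Hypothetical stand-in: drop a leading street number from "street_address"."""
    address = row["street_address"]
    # Remove leading digits and the whitespace that follows them, if present.
    return re.sub(r"^\d+\s+", "", address).strip()


# A single invented row, shaped like what apply(..., axis=1) passes to the function.
sample_row = pd.Series({"street_address": "123 Main Street", "math_score": 88})
street_name = find_street_name(sample_row)
print(street_name)
```

Under these assumptions, an address with no leading number is returned unchanged rather than raising an error.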
This exercise is part of the course ETL and ELT in Python.
Exercise instructions
- In the definition of the transform() function, use the find_street_name() function to create a new column with the name "street_name".
- Use the transform() function to clean the raw_testing_scores DataFrame.
- Print the head of the cleaned_testing_scores DataFrame, observing the new "street_name" column.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def transform(raw_data):
    # Use the apply function to extract the street_name from the street_address
    raw_data["street_name"] = raw_data.____(
        # Pass the correct function to the apply method
        ____,
        axis=1
    )
    return raw_data

# Transform the raw_testing_scores DataFrame
cleaned_testing_scores = ____(raw_testing_scores)

# Print the head of the cleaned_testing_scores DataFrame
print(cleaned_testing_scores.____())
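For reference, here is one way the blanks might be filled in, run end to end on invented data. The find_street_name() defined here is a simplified stand-in for the preloaded course function, and the sample rows and the "math_score" column are assumptions, since the real raw_testing_scores data is not shown.

```python
import pandas as pd


def find_street_name(row):
    # Stand-in for the preloaded function (assumed behavior):
    # keep everything after the first space-separated token.
    return row["street_address"].split(" ", 1)[1]


def transform(raw_data):
    # Apply find_street_name to every row (axis=1) to build the new column.
    raw_data["street_name"] = raw_data.apply(find_street_name, axis=1)
    return raw_data


# Invented sample rows; the real raw_testing_scores data is not shown here.
raw_testing_scores = pd.DataFrame({
    "street_address": ["10 Elm Avenue", "2500 Oak Street"],
    "math_score": [410, 520],
})

cleaned_testing_scores = transform(raw_testing_scores)
print(cleaned_testing_scores.head())
```

Note that transform() as written adds the column to the DataFrame it is given and returns that same object, so raw_testing_scores also gains the "street_name" column.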