Aggregating numerical features
A good use case for taking an aggregate statistic to create a new feature is when you have many features with similar, related values. Here, you have a DataFrame of running times named running_times_5k
. For each name
in the dataset, take the mean of their 5 run times.
This exercise is part of the course
Preprocessing for Machine Learning in Python
Exercise instructions
- Use the
.loc[]
method to select all rows and columns to find the.mean()
of the each columns. - Print the
.head()
of the DataFrame to see themean
column.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use .loc to create a mean column
running_times_5k["mean"] = ____.loc[____, ____].____(axis=____)
# Take a look at the results
print(____)