
NaN value imputation

Let's try to impute some values using the .transform() method. In the previous task you created a DataFrame fheroes from which all groups with too few bmi observations were removed. The bmi column still has many missing values (NaNs), though. Given two copies of the fheroes DataFrame (imp_globmean and imp_grpmean), your task is to impute the NaNs in the bmi column of imp_globmean with the overall mean value, and the NaNs in the bmi column of imp_grpmean with the mean value of each group defined by the Publisher and Alignment factors.

Tip: pandas Series and DataFrames have a .fillna() method that replaces every NaN with the value passed as an argument. (Plain NumPy arrays do not have this method.)
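As a quick illustration of the tip, here is a minimal sketch on a made-up Series (the values are hypothetical, not taken from the course dataset). It relies on the fact that Series.mean() skips NaNs by default, so the mean is computed from the observed values only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, not the course's data
bmi = pd.Series([20.0, np.nan, 30.0, np.nan])

# mean() ignores NaNs, so this is (20 + 30) / 2 = 25.0
filled = bmi.fillna(bmi.mean())

print(filled.tolist())  # [20.0, 25.0, 30.0, 25.0]
```

Note that .fillna() returns a new Series here; the original bmi still contains its NaNs unless the result is assigned back.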

This exercise is part of the course

Practicing Coding Interview Questions in Python


Exercise instructions

  • Define a lambda function that replaces the NaN values in a series with that series' mean.
  • Impute NaNs in the bmi column of imp_globmean with the overall mean value.
  • Impute NaNs in the bmi column of imp_grpmean with the mean value per group.
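The three steps can be sketched on a small stand-in DataFrame. This is one possible approach under stated assumptions, not necessarily the course's reference solution: the toy frame and its values are invented for illustration, and only the column names Publisher, Alignment and bmi are taken from the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for fheroes (values invented for illustration)
toy = pd.DataFrame({
    'Publisher': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Alignment': ['good', 'good', 'good', 'bad', 'bad', 'bad'],
    'bmi': [20.0, np.nan, 30.0, 10.0, 14.0, np.nan],
})

# Replace the NaNs in a series with the mean of that same series
impute = lambda series: series.fillna(series.mean())

# Global imputation: the whole column is treated as one series
glob = toy.copy()
glob['bmi'] = impute(glob['bmi'])

# Group-wise imputation: .transform() calls impute once per group and
# returns a result aligned with the original row index
grp = toy.copy()
groups = grp.groupby(['Publisher', 'Alignment'])
grp['bmi'] = groups['bmi'].transform(impute)

print(glob['bmi'].tolist())  # [20.0, 18.5, 30.0, 10.0, 14.0, 18.5]
print(grp['bmi'].tolist())   # [20.0, 25.0, 30.0, 10.0, 14.0, 12.0]
```

The same lambda serves both cases because it only ever sees one series at a time: the full column in the global case, and each group's slice of the column in the grouped case.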

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Define a lambda function that imputes NaN values in series
impute = lambda series: ____

# Impute NaNs in the bmi column of imp_globmean
imp_globmean['bmi'] = ____
print("Global mean = " + str(fheroes['bmi'].mean()) + "\n")

groups = imp_grpmean.groupby(['Publisher', 'Alignment'])

# Impute NaNs in the bmi column of imp_grpmean
imp_grpmean['bmi'] = groups[____].____
print(groups['bmi'].mean())