
NaN value imputation

Let's impute some values using the .transform() method. In the previous task you created a DataFrame, fheroes, from which every group with too few bmi observations was removed. The bmi column still has many missing values (NaNs), though. Given two copies of the fheroes DataFrame (imp_globmean and imp_grpmean), your task is to impute the NaNs in the bmi column in two ways: with the overall mean in imp_globmean, and with the mean of each group defined by the Publisher and Alignment factors in imp_grpmean.

Tip: pandas Series and DataFrames have a .fillna() method that replaces every NaN with the value passed as an argument.
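As a minimal sketch of the tip above, the snippet below fills the gaps in a small Series with that Series' mean. The values are hypothetical toy data, not the course's fheroes dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values; the real exercise uses the bmi column of fheroes
bmi = pd.Series([20.0, np.nan, 30.0, np.nan])

# Series.mean() skips NaNs by default, so only 20.0 and 30.0 are averaged
mean_bmi = bmi.mean()

# fillna() returns a new Series; the original is left unchanged
filled = bmi.fillna(mean_bmi)
print(filled.tolist())
```

Because .fillna() returns a new object here, the result has to be assigned back to the column to take effect in a DataFrame.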

This exercise is part of the course

Practicing Coding Interview Questions in Python


Exercise instructions

  • Define a lambda function that replaces the NaN values in a series with that series' mean.
  • Impute NaNs in the bmi column of imp_globmean with the overall mean value.
  • Impute NaNs in the bmi column of imp_grpmean with the mean value per group.
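The difference between the two imputations can be illustrated on toy data. This is only a sketch under stated assumptions: the DataFrame toy and its values are hypothetical stand-ins for fheroes, and the global case calls the function directly on the column rather than going through .transform().

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the course's fheroes DataFrame
toy = pd.DataFrame({
    "Publisher": ["A", "A", "A", "B", "B", "B"],
    "Alignment": ["good", "good", "good", "bad", "bad", "bad"],
    "bmi": [20.0, np.nan, 30.0, 40.0, 50.0, np.nan],
})

# Replace the NaNs in a Series with that Series' own mean
impute = lambda series: series.fillna(series.mean())

# Overall mean: the function sees the whole column at once
by_global = impute(toy["bmi"])

# Group mean: the function is called once per (Publisher, Alignment) group,
# and transform() returns a result aligned with the original rows
by_group = toy.groupby(["Publisher", "Alignment"])["bmi"].transform(impute)

print(by_global.tolist())
print(by_group.tolist())
```

In this toy case every missing value becomes 35.0 under the overall mean, while under the group mean it becomes 25.0 in the first group and 45.0 in the second.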

Interactive exercise

Try this exercise by completing this sample code.

# Define a lambda function that imputes NaN values in series
impute = lambda series: ____

# Impute NaNs in the bmi column of imp_globmean
imp_globmean['bmi'] = ____
print("Global mean = " + str(fheroes['bmi'].mean()) + "\n")

groups = imp_grpmean.groupby(['Publisher', 'Alignment'])

# Impute NaNs in the bmi column of imp_grpmean
imp_grpmean['bmi'] = groups[____].____
print(groups['bmi'].mean())