NaN value imputation
Let's impute some missing values using the .transform() method. In the previous task you created a DataFrame fheroes from which all groups with too few bmi observations were removed. The bmi column still has many missing values (NaNs), though. Given two copies of the fheroes DataFrame (imp_globmean and imp_grpmean), your task is to impute the NaNs in the bmi column with the overall mean value and with the mean value per group defined by the Publisher and Alignment factors, respectively.
Tip: pandas Series and DataFrames have a .fillna() method which replaces every NaN with the value passed as an argument. (Plain NumPy arrays do not have this method.)
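As a quick illustration of the tip, here is a minimal sketch of .fillna() on a toy Series. The data and the variable names (s, filled_const, filled_mean) are made up for this example and are not part of the exercise.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical), not the course's fheroes dataset
s = pd.Series([1.0, np.nan, 3.0])

# Replace every NaN with a fixed value
filled_const = s.fillna(0.0)

# Replace every NaN with the mean of the non-missing values;
# Series.mean() skips NaNs by default
filled_mean = s.fillna(s.mean())

print(filled_const.tolist())
print(filled_mean.tolist())
```

Because the mean is computed before the fill, the imputed values do not change the mean of the Series.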
This exercise is part of the course
Practicing Coding Interview Questions in Python
Exercise instructions
- Define a lambda function that imputes NaN values in series with its mean.
- Impute NaNs in the bmi column of imp_globmean with the overall mean value.
- Impute NaNs in the bmi column of imp_grpmean with the mean value per group.
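The difference between a global and a per-group imputation can be sketched on a small toy DataFrame. Everything here (the toy frame, the team and score columns, and the name fill_with_mean) is hypothetical and chosen only for illustration; it is one possible approach, not necessarily the course's reference solution.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical), standing in for a grouped numeric column
toy = pd.DataFrame({
    "team": ["a", "a", "a", "b", "b"],
    "score": [1.0, np.nan, 3.0, 10.0, np.nan],
})

# Fill the NaNs of whatever Series is passed in with that Series' own mean
fill_with_mean = lambda series: series.fillna(series.mean())

# Global imputation: one mean computed over the whole column
global_filled = fill_with_mean(toy["score"])

# Per-group imputation: .transform() applies the function to each group's
# Series separately and returns a result aligned with the original index
group_filled = toy.groupby("team")["score"].transform(fill_with_mean)

print(global_filled.tolist())
print(group_filled.tolist())
```

In this sketch the missing score in team "a" becomes 2.0 and the one in team "b" becomes 10.0 under per-group imputation, whereas global imputation gives both the same overall mean.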
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Define a lambda function that imputes NaN values in series
impute = lambda series: ____
# Impute NaNs in the bmi column of imp_globmean
imp_globmean['bmi'] = ____
print("Global mean = " + str(fheroes['bmi'].mean()) + "\n")
groups = imp_grpmean.groupby(['Publisher', 'Alignment'])
# Impute NaNs in the bmi column of imp_grpmean
imp_grpmean['bmi'] = groups[____].____
print(groups['bmi'].mean())