Creating a missing value dummy

Suppose a basetable has a predictive variable "total_donations" that holds the total number of donations a donor has ever made. This variable can have missing values, indicating that the donor has never made a donation. That is important information in its own right, so it is appropriate to create a variable "no_donations" that indicates whether "total_donations" is missing.
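
As a rough illustration of the idea, the sketch below builds such an indicator on a small, made-up basetable. The data values are hypothetical, and `isna().astype(int)` is just one of several ways to get a 1/0 column; it is not necessarily the approach the exercise expects.

```python
import numpy as np
import pandas as pd

# Hypothetical toy basetable; the values are made up for illustration.
basetable = pd.DataFrame({"total_donations": [3.0, np.nan, 1.0, np.nan, 7.0]})

# 1 where total_donations is missing, 0 otherwise.
basetable["no_donations"] = basetable["total_donations"].isna().astype(int)

print(basetable)
```

The original column is left untouched, so the missing values can still be imputed later without losing the fact that they were missing.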

This exercise is part of the course

Intermediate Predictive Analytics in Python

Exercise instructions

  • Create a new column "no_donations" in basetable that has value 1 if total_donations is missing and 0 otherwise.
  • Calculate the number of missing values in total_donations and assign it to number_na.
  • Print the percentage of missing values.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create dummy indicating missing values
basetable["____"] = pd.Series([____ if b else ____ for b in basetable["total_donations"].isna()])

# Calculate number of missing values
number_na = sum(____["no_donations"] == ____)

# Calculate percentage of missing values
print(round(____ / ____(____), 2))
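
For reference, the three steps can be sketched end to end on a small, made-up basetable. The data, the custom index, and the name `share_na` are assumptions for illustration only. The dummy is derived directly from the missing-value mask here rather than through `pd.Series` over a list comprehension: pandas aligns on the index when a Series is assigned to a column, so a freshly built Series with a 0..n-1 index only lines up if basetable also uses the default index.

```python
import numpy as np
import pandas as pd

# Hypothetical toy basetable with a non-default index; values are made up.
basetable = pd.DataFrame(
    {"total_donations": [3.0, np.nan, 1.0, np.nan, 7.0]},
    index=[10, 11, 12, 13, 14],
)

# Dummy built from the mask itself, so it shares basetable's index.
basetable["no_donations"] = basetable["total_donations"].isna().astype(int)

# Number of missing values: count the rows where the dummy equals 1.
number_na = int((basetable["no_donations"] == 1).sum())

# Share of missing values among all rows, rounded to two decimals.
share_na = round(number_na / len(basetable), 2)
print(number_na, share_na)
```

Note that dividing the count by the number of rows gives a proportion between 0 and 1; multiply by 100 if a percentage in the strict sense is wanted.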