Exercise

The curious case of missing values

Data is rarely captured perfectly in the real world. People might not disclose certain details, or those details might not be available in the first place. This data set is no different: some of its variables have missing values.

We first need to find out which variables have missing values, and then decide on the best way to handle them. The right approach can depend on the number of missing values, the type of the variable, and the expected importance of that variable.
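As a rough sketch of that first step, the missing values in every column can be counted at once. The small DataFrame below is a made-up stand-in for the exercise's train data; apart from Credit_History, its column names and all of its values are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the exercise's `train` DataFrame.
# Only "Credit_History" comes from the exercise; the rest is invented.
train = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "LoanAmount": [120.0, np.nan, 66.0, np.nan],
    "Credit_History": [1.0, 0.0, 1.0, np.nan],
    "Education": ["Graduate", "Graduate", "Not Graduate", "Graduate"],
})

# Count the missing values in each column.
missing_counts = train.isnull().sum()

# Keep only the columns that have at least one missing value,
# sorted so the most affected column comes first.
missing_only = missing_counts[missing_counts > 0].sort_values(ascending=False)
print(missing_only)
```

Sorting by count is just one convenient way to see which variables need the most attention. It does not by itself say how each one should be handled.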

So, let's start by finding out whether the variable "Credit_History" has missing values and, if so, how many observations are missing.


train['Credit_History'].isnull().sum()

  • isnull() checks whether each observation is missing (it returns a boolean value, True or False, for every observation)
  • sum() adds up those boolean values, which gives the number of records with missing values
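To see the two steps separately, here is a minimal sketch on a hand-made Series. The values are invented for illustration and are not taken from the actual data set.

```python
import numpy as np
import pandas as pd

# Invented values standing in for the Credit_History column.
credit_history = pd.Series([1.0, np.nan, 0.0, np.nan, 1.0], name="Credit_History")

# Step 1: isnull() marks each observation as missing (True) or not (False).
is_missing = credit_history.isnull()
print(is_missing.tolist())  # [False, True, False, True, False]

# Step 2: sum() counts each True as 1, giving the number of missing values.
n_missing = is_missing.sum()
print(n_missing)  # 2
```

Chaining the two calls, as in credit_history.isnull().sum(), gives the same count in one expression.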
Instructions
  • Apply isnull() to check whether each observation has a null value
  • Check whether the number of missing values is greater than 0