Exercise

Impute missing values of SelfEmployed

Similarly, to impute missing values of categorical variables, we look at the frequency table. The simplest approach is to impute with the most frequent value (the mode), because a missing entry is more likely to belong to that category than to any other.

For example, in the distribution of SelfEmployed, 500 out of 582 values (~86%) fall under the category "No". Here we will replace the missing values of SelfEmployed with "No".

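# Assign the result back; calling fillna(..., inplace=True) on a selected column may not update train in recent pandas versions
train['Self_Employed'] = train['Self_Employed'].fillna('No')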
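
The steps above can be sketched end to end on a small made-up DataFrame. This is only an illustration: the toy values below are assumptions standing in for the course's train data, and the variable names counts and most_frequent are introduced here for the example. It reads the most frequent category from the frequency table instead of hard-coding "No".

```python
import numpy as np
import pandas as pd

# Toy data standing in for the course's `train` DataFrame (values are made up).
train = pd.DataFrame({
    "Self_Employed": ["No", "Yes", "No", np.nan, "No", np.nan],
})

# Frequency table; dropna=False also reports how many values are missing.
counts = train["Self_Employed"].value_counts(dropna=False)
print(counts)

# value_counts() skips NaN by default, so idxmax() returns the label
# of the most frequent non-missing category.
most_frequent = train["Self_Employed"].value_counts().idxmax()

# Replace missing values with that category and assign the result back.
train["Self_Employed"] = train["Self_Employed"].fillna(most_frequent)
```

Looking up the category with value_counts() keeps the imputation correct if the data changes, whereas a hard-coded "No" relies on the frequency table having been checked by hand.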

Instructions

  • Impute the missing values of Gender and Credit History with their most frequent category
  • Use value_counts() to find the most frequent category of each variable