Practicing Machine Learning Interview Questions in Python

Exercise

The hunt for missing values

Questions about processing missing values are integral to any machine learning interview. If you are given a dataset with missing values and leave them unaddressed, your results will likely be skewed and your model's accuracy will suffer.

In this exercise, you'll practice the first pre-processing step: finding missing values and exploring ways to handle them, using pandas and numpy on a customer loan dataset.

The dataset, which you'll use for many of the exercises in this course, is saved to your workspace as loan_data.

This is where you are in the pipeline:

[Figure: Machine learning pipeline]

Instructions

  • 1
    • Print out the features of loan_data along with the number of missing values.
  • 2
    • Drop the rows with missing values and print the percentage of rows remaining.
  • 3
    • Drop the columns with missing values and print the percentage of columns remaining.
  • 4
    • Impute loan_data's missing values with 0 and store the result in loan_data_filled.
    • Compare the 'Credit Score' column using .describe() before imputation (on loan_data) and after imputation (on loan_data_filled).
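
The four steps above can be sketched roughly as follows. This is only one possible approach, not the course's reference solution. Because the real loan_data is not reproduced here, the snippet builds a small hypothetical stand-in; apart from 'Credit Score', the column names and all values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for loan_data. The real dataset's columns and
# values may differ; only 'Credit Score' is named in the exercise.
loan_data = pd.DataFrame({
    "Loan Status": ["Fully Paid", "Charged Off", "Fully Paid", "Fully Paid", "Charged Off"],
    "Current Loan Amount": [10000.0, 25000.0, 15000.0, 8000.0, 12000.0],
    "Credit Score": [720.0, np.nan, 680.0, np.nan, 700.0],
    "Annual Income": [50000.0, 62000.0, np.nan, 48000.0, 55000.0],
})

# Step 1: list each feature with its number of missing values.
missing_counts = loan_data.isna().sum()
print(missing_counts)

# Step 2: drop rows containing any missing value, then report the
# percentage of rows that remain.
dropped_rows = loan_data.dropna()
pct_rows_remaining = 100 * dropped_rows.shape[0] / loan_data.shape[0]
print(f"Rows remaining: {pct_rows_remaining:.1f}%")

# Step 3: drop columns containing any missing value (axis=1), then
# report the percentage of columns that remain.
dropped_cols = loan_data.dropna(axis=1)
pct_cols_remaining = 100 * dropped_cols.shape[1] / loan_data.shape[1]
print(f"Columns remaining: {pct_cols_remaining:.1f}%")

# Step 4: impute missing values with 0 into a new DataFrame, leaving
# loan_data unchanged, then compare summary statistics.
loan_data_filled = loan_data.fillna(0)
print(loan_data["Credit Score"].describe())         # NaN values are skipped
print(loan_data_filled["Credit Score"].describe())  # zeros count as real values
```

Note that .describe() skips missing values, so the "before" statistics reflect only the observed credit scores, while filling with 0 lowers the mean and minimum in the "after" statistics. Comparing the two outputs is one way to see how much a constant imputation distorts a feature.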