
Exercise

Data quality checks

As you learned in the previous video, missing values can cause a loss of valuable information and lead to incorrect interpretations. Unseen values, that is, categorical values that appear in the analysis data but never occurred in the reference data, can likewise affect your model's confidence.

In this exercise, your goal is to explore whether the hotel booking dataset contains missing values and identify any unseen values. The reference and analysis datasets are already loaded, along with the nannyml library.

A quick reminder: if you can't recall the column types, you can explore the data using the .head() method.
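As a minimal sketch of that kind of exploration, the snippet below builds a small stand-in DataFrame and inspects it. The data, the timestamp and lead_time columns, and the categorical_candidates name are assumptions made for illustration; in the exercise the real reference dataset is already loaded.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded reference dataset (values are made up).
reference = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-01-05", "2016-01-20", "2016-02-11"]),
    "country": ["PRT", "GBR", "PRT"],
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel"],
    "lead_time": [12, 45, 3],
})

# Show the first rows to get a feel for the values in each column.
print(reference.head())

# Show the dtype of every column explicitly.
print(reference.dtypes)

# Columns holding strings are the likely categorical columns.
categorical_candidates = [
    col for col in reference.columns
    if pd.api.types.is_string_dtype(reference[col])
]
print(categorical_candidates)
```

Looking at .head() together with .dtypes makes it easier to tell which columns are categorical, which matters for the unseen values check.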

Instructions 1/2

  • 1
    • Initialize the missing value calculator, passing the selected columns to column_names and setting the chunk_period to monthly.
  • 2
    • Specify the two categorical column names, country and hotel, initialize the unseen values calculator, and pass categorical_columns to column_names.
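
To make the two checks concrete, here is a rough pandas-only sketch of the quantities they look at: the share of missing values per monthly chunk, and the categorical values in the analysis data that never occurred in the reference data. This is not nannyml's implementation and not the exercise solution; the toy data, the timestamp column, and the missing_rate and unseen_values names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy data; the exercise's datasets are preloaded and much larger.
reference = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-01-05", "2016-01-20", "2016-02-11", "2016-02-25"]),
    "country": ["PRT", "GBR", "PRT", None],
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel", "City Hotel"],
})
analysis = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-03-02", "2016-03-15", "2016-04-07", "2016-04-21"]),
    "country": ["PRT", "FRA", None, None],
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel", "Resort Hotel"],
})
categorical_columns = ["country", "hotel"]

# Assign each analysis row to a monthly chunk based on its timestamp.
month = analysis["timestamp"].dt.to_period("M")

# Missing values: fraction of missing entries per column within each month.
missing_rate = analysis[categorical_columns].isna().groupby(month).mean()
print(missing_rate)

# Unseen values: non-missing categories present in analysis but absent from reference.
unseen_values = {
    col: sorted(set(analysis[col].dropna()) - set(reference[col].dropna()))
    for col in categorical_columns
}
print(unseen_values)
```

In the exercise, the nannyml calculators take care of the chunking and the comparison against the reference data for you, so you only need to supply the column names and the chunk period.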