
Cleaning your dataset

Real-world datasets like the heart disease dataset are often messy, containing duplicated or missing values. In this exercise, you will apply the skills learned in this chapter to perform data cleaning on the heart disease dataset. The dataset has already been loaded for you. Your task is to identify and carry out general cleaning operations based on the EDA results: remove empty columns, drop duplicate rows, and perform imputation on the restecg column, which pertains to an electrocardiogram measure. Pandas has been imported for you as pd, and the heart disease dataset is stored as a pandas DataFrame called heart_disease_df.
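The three cleaning steps named above can be sketched on a small stand-in DataFrame. This is a rough illustration rather than the exercise's expected solution: the toy data, the `empty_col` column, the `clean` helper, and the choice of mode imputation are all assumptions made for the example, and the course may use a different method or imputation strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for heart_disease_df
toy_df = pd.DataFrame({
    "age": [63, 63, 41, 56, 50],
    "restecg": [1.0, 1.0, np.nan, 0.0, 1.0],
    "empty_col": [np.nan] * 5,  # assumed fully empty column
})

def clean(df):
    # Remove columns in which every value is missing
    out = df.dropna(axis=1, how="all")
    # Drop exact duplicate rows, keeping the first occurrence
    out = out.drop_duplicates()
    # Impute missing restecg values with the most frequent value
    # (assumption: mode imputation; mean or median are alternatives)
    fill_value = out["restecg"].mode().iloc[0]
    return out.assign(restecg=out["restecg"].fillna(fill_value))

cleaned = clean(toy_df)
print(cleaned)
```

On the real data, the same calls would be applied to heart_disease_df; which columns count as empty and which imputation value is appropriate depend on the EDA results.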

This exercise is part of the course

End-to-End Machine Learning


Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Drop empty columns
heart_disease_column_dropped = heart_disease_df.____(____, ____)