
Fill dummy values

Just as you looked for relationships between missing values across columns, it is also important to look for relationships between the missing values in one column and the non-missing values in another. This helps you identify factors that may explain the missingness in the data.

[Figure: scatter plot of BMI vs Serum Insulin]

In the above figure, the missing values of Serum Insulin are spread across the whole range of BMI values. This suggests that the missingness of Serum Insulin is not related to BMI.

In this exercise, you will write a function that generates dummy values for the missing entries; you will use it in the next exercise to create the scatter plot above. Generating dummy values involves scaling random values to each column's range using a scaling factor, then shifting them.

The function rand() has been imported for you from numpy.random.
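The scale-and-shift idea can be sketched on a single column. This is a minimal illustration, not the exercise's reference solution: the helper name `make_dummy_values`, the default `scaling_factor=0.075`, the `- 2` offset, and the sample `bmi` values are all assumptions chosen so that the dummy values land just below the column's observed minimum.

```python
import numpy as np
import pandas as pd
from numpy.random import rand

def make_dummy_values(col, scaling_factor=0.075):
    """Return a copy of `col` with NaNs replaced by random dummy values.

    Hypothetical helper: the scaling factor and shift are assumptions.
    """
    col_null = col.isnull()
    num_nulls = col_null.sum()
    # Column range, computed from the non-missing values (NaNs are skipped)
    col_range = col.max() - col.min()
    # rand() draws from [0, 1); subtracting 2 moves the draws to [-2, -1),
    # scaling shrinks them to a fraction of the column range,
    # and adding col.min() shifts them to sit just below the real data
    dummy = (rand(num_nulls) - 2) * scaling_factor * col_range + col.min()
    filled = col.copy()
    filled[col_null] = dummy
    return filled

# Made-up sample column with two missing entries
bmi = pd.Series([22.0, np.nan, 31.5, 27.3, np.nan, 40.1])
filled = make_dummy_values(bmi)
print(filled)
```

Because the dummy values fall outside the observed range, they form a visually separate band in a scatter plot, so missing entries in one column can be read against the real values of the other.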

This exercise is part of the course

Dealing with Missing Data in Python


Hands-on interactive exercise

Try this exercise by completing the following sample code.

def fill_dummy_values(df):
  df_dummy = df.copy(deep=True)
  for col_name in df_dummy:
    col = df_dummy[col_name]
    # Calculate column range
    col_range = ___ - ___
  return df_dummy