Unskew the variables

You will now transform the wholesale columns using the Box-Cox transformation, and then explore the pairwise relationships plot to check that the skewness of the distributions has been reduced and that they look closer to normal. This is a critical step: K-means relies on distances between observations, so it tends to discover more homogeneous groups (a.k.a. clusters or segments) when the variables are roughly symmetric rather than heavily skewed.

The stats module is loaded from the scipy library, and the wholesale dataset has been imported as a pandas DataFrame.
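
To make the effect of the transformation concrete, here is a minimal sketch on synthetic data. The array `spend` is a hypothetical stand-in and not part of the wholesale dataset; the only assumption is that the values are strictly positive, which Box-Cox requires.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical right-skewed, strictly positive values (Box-Cox needs x > 0)
spend = rng.lognormal(mean=8.0, sigma=1.0, size=1000)

# Without an explicit lmbda, stats.boxcox returns the transformed values
# together with the lambda that maximizes the log-likelihood
transformed, fitted_lambda = stats.boxcox(spend)

skew_before = stats.skew(spend)
skew_after = stats.skew(transformed)
print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
print(f"fitted lambda: {fitted_lambda:.2f}")
```

A skewness value closer to zero after the transformation indicates a more symmetric distribution.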

This exercise is part of the course

Machine Learning for Marketing in Python

Exercise instructions

  • Define a custom Box-Cox transformation function that can be applied to a pandas DataFrame.
  • Apply the function to the wholesale dataset.
  • Plot the pairwise relationships between the transformed variables.
  • Display the chart.
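
The column-wise pattern these steps describe can be sketched on a small synthetic DataFrame. The frame `demo`, its column names, and the helper `boxcox_column` are hypothetical and chosen only for illustration; the sketch compares skewness before and after instead of plotting.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical stand-in data: strictly positive, right-skewed columns
demo = pd.DataFrame({
    "Fresh": rng.lognormal(9.0, 1.0, 200),
    "Milk": rng.lognormal(8.0, 1.2, 200),
    "Grocery": rng.lognormal(8.5, 0.9, 200),
})

def boxcox_column(col):
    # Keep the transformed values and discard the fitted lambda
    values, _ = stats.boxcox(col)
    return values

# axis=0 passes each column to the function, so every column
# gets its own fitted lambda
demo_boxcox = demo.apply(boxcox_column, axis=0)

# Compare per-column skewness before and after the transformation
skew_table = pd.DataFrame({"before": demo.skew(), "after": demo_boxcox.skew()})
print(skew_table.round(2))
```

If seaborn is available, a frame like `demo_boxcox` could then be passed to `sns.pairplot` with `diag_kind='kde'` to inspect the distributions visually.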

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Define custom Box Cox transformation function
def boxcox_df(x):
    x_boxcox, _ = stats.___(x)
    return x_boxcox

# Apply the function to the `wholesale` dataset
wholesale_boxcox = ___.___(boxcox_df, axis=0)

# Plot the pairwise relationships between the transformed variables 
sns.___(___, diag_kind='kde')

# Display the chart
plt.___()