Exercise

Unskew the variables

You will now transform the wholesale columns using the Box-Cox transformation, and then explore the pairwise relationships plot to check that the skewness of the distributions has been reduced and that they look closer to normal. This is an important step: K-means relies on distances between observations, so it tends to find more balanced, homogeneous groups (a.k.a. clusters or segments) when the variables are roughly symmetric.
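
For reference, the Box-Cox transformation of a strictly positive value $x$ with parameter $\lambda$ is usually written as below. When no $\lambda$ is supplied, `stats.boxcox` estimates it from the data by maximum likelihood, so each column can end up with its own $\lambda$.

```latex
x^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\ln x, & \lambda = 0,
\end{cases}
\qquad x > 0.
```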

The stats module is loaded from the scipy library, and the wholesale dataset has been imported as a pandas DataFrame.

Instructions

  • Define a custom Box-Cox transformation function that can be applied to a pandas DataFrame.
  • Apply the function to the wholesale dataset.
  • Plot the pairwise relationships between the transformed variables.
  • Display the chart.
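
The steps above can be sketched roughly as follows. This is a minimal sketch, not the exercise's reference solution: the names `boxcox_df`, `plot_pairs`, and `wholesale_boxcox` are illustrative, and because the real wholesale dataset is not shown here, a synthetic right-skewed DataFrame with hypothetical column names stands in for it. Using seaborn's `pairplot` for the pairwise plot is also an assumption. Note that `stats.boxcox` requires strictly positive values.

```python
import numpy as np
import pandas as pd
from scipy import stats


def boxcox_df(x):
    """Box-Cox transform one column (a pandas Series of positive values).

    stats.boxcox returns (transformed_values, fitted_lambda) when no
    lambda is passed; only the transformed values are kept here.
    """
    x_boxcox, _ = stats.boxcox(x)
    return pd.Series(x_boxcox, index=x.index)


def plot_pairs(df):
    """Plot pairwise relationships and display the chart.

    Plotting libraries are imported here so the transformation above
    can be used without them. seaborn is an assumed choice.
    """
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.pairplot(df, diag_kind="kde")
    plt.show()


# Synthetic stand-in for the wholesale data (assumption: the real
# DataFrame is already loaded in the exercise environment).
rng = np.random.default_rng(0)
wholesale = pd.DataFrame({
    "Fresh": rng.lognormal(mean=9.0, sigma=1.0, size=200),
    "Milk": rng.lognormal(mean=8.0, sigma=1.0, size=200),
    "Grocery": rng.lognormal(mean=8.5, sigma=1.0, size=200),
})

# Apply the transformation column by column.
wholesale_boxcox = wholesale.apply(boxcox_df, axis=0)

# Compare skewness before and after the transformation.
print(wholesale.skew().round(2))
print(wholesale_boxcox.skew().round(2))
```

Calling `plot_pairs(wholesale_boxcox)` afterwards draws the pairwise plot and displays it. The diagonal density plots should look noticeably more symmetric than those of the untransformed data.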