Unskew the variables

You will now transform the wholesale columns using a Box-Cox transformation, and then explore the pairwise relationships plot to confirm that the distributions are less skewed and closer to normal. This is a critical step: K-means works best on roughly symmetric variables, so reducing skewness helps the algorithm discover homogeneous groups (a.k.a. clusters or segments) of observations.
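To make the effect of the transformation concrete, here is a minimal sketch on synthetic data. The `spend` array is a hypothetical stand-in for one right-skewed, strictly positive wholesale column; it is not the course dataset. The sketch compares sample skewness before and after `stats.boxcox`, which requires positive input and returns the transformed values together with the fitted lambda.

```python
import numpy as np
from scipy import stats

# Hypothetical data: a right-skewed, strictly positive "spend" column
rng = np.random.default_rng(0)
spend = rng.lognormal(mean=8.0, sigma=1.0, size=1000)

# With no lambda given, boxcox fits lambda by maximum likelihood
# and returns (transformed values, fitted lambda)
transformed, fitted_lambda = stats.boxcox(spend)

skew_before = stats.skew(spend)
skew_after = stats.skew(transformed)

print(f"fitted lambda: {fitted_lambda:.3f}")
print(f"skewness before: {skew_before:.3f}")
print(f"skewness after:  {skew_after:.3f}")
```

A skewness close to zero after the transformation suggests the distribution has become more symmetric.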

The stats module is loaded from the scipy library, and the wholesale dataset has been imported as a pandas DataFrame.

This exercise is part of the course

Machine Learning for Marketing in Python

Exercise instructions

  • Define a custom Box-Cox transformation function that can be applied to a pandas DataFrame.
  • Apply the function to the wholesale dataset.
  • Plot the pairwise relationships between the transformed variables.
  • Display the chart.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Define a custom Box-Cox transformation function
def boxcox_df(x):
    x_boxcox, _ = stats.___(x)
    return x_boxcox

# Apply the function to the `wholesale` dataset
wholesale_boxcox = ___.___(boxcox_df, axis=0)

# Plot the pairwise relationships between the transformed variables 
sns.___(___, diag_kind='kde')

# Display the chart
plt.___()