
Testing QuantileTransformer

Standardization is prone to the same pitfalls as z-scores. Both use the mean and standard deviation in their calculations, which makes them highly sensitive to extreme values.

To get around this problem, use QuantileTransformer, which rescales features using quantiles. Quantiles depend on the rank order of the values, so they stay the same regardless of the magnitude of outliers.
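The contrast between the two kinds of statistics can be checked numerically. The snippet below is a minimal sketch on made-up numbers (the sample, the seed, and the `summarize` helper are illustrative assumptions, not part of the exercise): it makes a single outlier far more extreme and compares how the mean, standard deviation, and quartiles react.

```python
import numpy as np

# Hypothetical sample: 99 ordinary values (assumed, not the females dataset)
rng = np.random.default_rng(0)
clean = rng.normal(loc=50, scale=5, size=99)

def summarize(values):
    """Return the mean, standard deviation, and the three quartiles."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {"mean": values.mean(), "std": values.std(),
            "q1": q1, "median": median, "q3": q3}

# Same data with one outlier, first moderate, then far more extreme
mild = summarize(np.append(clean, 100))
extreme = summarize(np.append(clean, 10_000))

# Print each statistic before and after the outlier grows
for name in mild:
    print(f"{name:>6}: {mild[name]:10.2f} -> {extreme[name]:10.2f}")
```

The mean and standard deviation move with the outlier, while the quartiles are unchanged because the outlier keeps the same rank in the sorted data.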

You should use StandardScaler when the data is normally distributed (which can be checked with a histogram). For other distributions, QuantileTransformer is a better choice.
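To illustrate the difference on non-normal data, here is a rough sketch using a synthetic, right-skewed feature (the lognormal sample and the hand-written `skewness` helper are assumptions made for the example). StandardScaler only shifts and rescales, so the shape of the distribution is untouched; QuantileTransformer with `output_distribution="normal"` reshapes it.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer, StandardScaler

# Synthetic right-skewed feature (assumed data, one column)
rng = np.random.default_rng(42)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

def skewness(a):
    """Sample skewness: third central moment divided by std cubed."""
    a = a.ravel()
    return np.mean((a - a.mean()) ** 3) / a.std() ** 3

# Standardization is a linear transform, so the skew survives
standardized = StandardScaler().fit_transform(skewed)

# Quantile transformation maps ranks onto a normal distribution
qt = QuantileTransformer(output_distribution="normal", n_quantiles=1000)
gaussianized = qt.fit_transform(skewed)

print("original    :", round(skewness(skewed), 2))
print("standardized:", round(skewness(standardized), 2))
print("quantile    :", round(skewness(gaussianized), 2))
```

The standardized feature keeps the skewness of the original, whereas the quantile-transformed feature comes out close to symmetric.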

You'll practice on the loaded females dataset. matplotlib.pyplot is loaded under its standard alias, plt.

This exercise is part of the course

Anomaly Detection in Python


Exercise instructions

  • Instantiate a QuantileTransformer() that transforms features into a normal distribution and assign it to qt.
  • Fit and transform the feature array X and preserve the column names.
  • Plot a histogram of the palmlength column.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

from sklearn.preprocessing import QuantileTransformer

# Instantiate an instance that casts to normal
qt = ____

# Fit and transform the feature array
X.____ = ____

# Plot a histogram of palm length
plt.____(____, color='red')

plt.xlabel("Palm length")
plt.show()
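One possible way to complete the template is sketched below. Because the females dataset is not available outside the exercise environment, the sketch builds a synthetic stand-in: the values and the `footlength` column are invented for illustration, and only `palmlength` is named in the exercise. Assigning through `X.loc[:, :]` is one way to keep the DataFrame and its column names; it is an assumption about the intended answer, not a confirmed solution.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in for the females dataset (assumed values;
# "footlength" is a hypothetical second column)
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "palmlength": rng.gamma(shape=2.0, scale=30.0, size=1000),
    "footlength": rng.gamma(shape=3.0, scale=40.0, size=1000),
})

# Instantiate an instance that casts to normal
qt = QuantileTransformer(output_distribution="normal")

# Fit and transform the feature array; writing through .loc keeps
# the DataFrame, so the column names are preserved
X.loc[:, :] = qt.fit_transform(X)

# Plot a histogram of palm length
plt.hist(X["palmlength"], color="red")

plt.xlabel("Palm length")
plt.show()
```

After the transformation, the histogram should look roughly bell-shaped and centered near zero, even though the synthetic input was right-skewed.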