
Testing QuantileTransformer

Standardization is prone to the same pitfalls as z-scores. Both use the mean and standard deviation in their calculations, which makes them highly sensitive to extreme values.

To get around this problem, you can use QuantileTransformer, which is based on quantiles. Unlike the mean and standard deviation, the quantiles of a distribution change little, if at all, as outliers become more extreme.
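As a rough illustration of that difference, the sketch below uses a small made-up array (not the females dataset) and makes its single outlier far more extreme. The quartiles stay put, while the mean and standard deviation shift substantially.

```python
import numpy as np

# Made-up sample with one outlier in the last position (illustrative only)
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0])

# Same sample, but the outlier is made much more extreme
extreme = values.copy()
extreme[-1] = 10_000.0

quartiles = [0.25, 0.5, 0.75]
q_before = np.quantile(values, quartiles)
q_after = np.quantile(extreme, quartiles)

print("Quartiles before:", q_before)
print("Quartiles after: ", q_after)
print("Mean before/after:", values.mean(), extreme.mean())
print("Std before/after: ", values.std(), extreme.std())
```

Only the quartiles are compared here. Quantiles very close to 0 or 1 would still track the outlier itself.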

You should use StandardScaler when the data is normally distributed (which can be checked with a histogram). For other distributions, QuantileTransformer is a better choice.
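One way to see why the choice of scaler matters is to compare both on skewed data. The sketch below is a minimal example on synthetic exponential data (an assumption, not the course dataset), with a hypothetical `skewness` helper. StandardScaler only shifts and rescales, so the shape of the distribution is unchanged, whereas QuantileTransformer with `output_distribution="normal"` reshapes it toward a bell curve.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer, StandardScaler


def skewness(a):
    """Hypothetical helper: sample skewness (third standardized moment)."""
    a = np.asarray(a, dtype=float).ravel()
    centered = a - a.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


# Synthetic right-skewed feature (illustrative only)
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=(1000, 1))

# Linear rescaling: mean 0, std 1, same shape
scaled = StandardScaler().fit_transform(x)

# Rank-based mapping onto a normal distribution
qt_demo = QuantileTransformer(
    n_quantiles=1000, output_distribution="normal", random_state=0
)
gaussianized = qt_demo.fit_transform(x)

print("Skewness, raw:                ", skewness(x))
print("Skewness, StandardScaler:     ", skewness(scaled))
print("Skewness, QuantileTransformer:", skewness(gaussianized))
```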

You'll practice on the loaded females dataset. matplotlib.pyplot is loaded under its standard alias, plt.

This exercise is part of the course

Anomaly Detection in Python


Exercise instructions

  • Instantiate a QuantileTransformer() that transforms features into a normal distribution and assign it to qt.
  • Fit and transform the feature array X and preserve the column names.
  • Plot a histogram of the palmlength column.
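The steps above hinge on one detail: fit_transform returns a plain NumPy array by default, so the column names are lost unless they are restored. The sketch below shows one possible way to keep them, on a synthetic stand-in DataFrame. Here `X_demo` and the `footlength` column are hypothetical, and the values are random rather than taken from the females dataset.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in for a feature DataFrame (values are illustrative only)
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(
    {
        "palmlength": rng.lognormal(mean=4.7, sigma=0.3, size=500),
        "footlength": rng.lognormal(mean=5.5, sigma=0.3, size=500),
    }
)

# Map each feature onto a normal distribution
qt_demo = QuantileTransformer(
    n_quantiles=500, output_distribution="normal", random_state=0
)

# Wrap the returned array in a DataFrame to restore column names and index
X_transformed = pd.DataFrame(
    qt_demo.fit_transform(X_demo),
    columns=X_demo.columns,
    index=X_demo.index,
)

print(X_transformed.describe())
```

A histogram of `X_transformed["palmlength"]` drawn with plt.hist should then look roughly bell-shaped and centered near zero.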

Hands-on interactive exercise

Try this exercise by completing this sample code.

from sklearn.preprocessing import QuantileTransformer

# Instantiate an instance that casts to normal
qt = ____

# Fit and transform the feature array
X.____ = ____

# Plot a histogram of palm length
plt.____(____, color='red')

plt.xlabel("Palm length")
plt.show()