Testing QuantileTransformer
Standardization is prone to the same pitfalls as z-scores. Both use the mean and standard deviation in their calculations, which makes them highly sensitive to extreme values.
To get around this problem, you should use QuantileTransformer, which relies on quantiles instead. The quantiles of a distribution stay essentially the same regardless of the magnitude of outliers.
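To make the contrast concrete, here is a minimal sketch on synthetic data (an assumption; it does not use the females dataset). It appends one outlier to the same sample twice, once mild and once extreme, and compares the mean and standard deviation with the quartiles.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a measurement column (assumption, not the females data)
values = rng.normal(loc=50, scale=5, size=1000)

def summarize(arr):
    """Return the mean, standard deviation, and the three quartiles."""
    q25, q50, q75 = np.percentile(arr, [25, 50, 75])
    return {"mean": arr.mean(), "std": arr.std(),
            "q25": q25, "median": q50, "q75": q75}

# Same sample, one added outlier of very different magnitude
mild = summarize(np.append(values, 100.0))
extreme = summarize(np.append(values, 100_000.0))

for key in mild:
    print(f"{key:>6}: mild={mild[key]:12.3f}  extreme={extreme[key]:12.3f}")
```

The mean and standard deviation shift with the size of the outlier, while the quartiles are unchanged because they depend only on the ordering of the values.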
You should use StandardScaler when the data is normally distributed (which can be checked with a histogram). For other distributions, QuantileTransformer is a better choice.
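As a rough illustration of that rule of thumb, the sketch below applies both scalers to a synthetic right-skewed feature (a lognormal sample, chosen here only for illustration) and compares skewness before and after. Using scipy.stats.skew as the check is an assumption of this sketch; the exercise itself relies on a histogram.

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import QuantileTransformer, StandardScaler

rng = np.random.default_rng(42)
# Hypothetical right-skewed feature with a single column
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

# StandardScaler only shifts and rescales, so the shape is unchanged
standardized = StandardScaler().fit_transform(skewed)

# QuantileTransformer maps the empirical quantiles onto a normal distribution
qt = QuantileTransformer(output_distribution="normal", n_quantiles=1000)
quantile_mapped = qt.fit_transform(skewed)

skew_raw = skew(skewed.ravel())
skew_std = skew(standardized.ravel())
skew_qt = skew(quantile_mapped.ravel())

print(f"raw: {skew_raw:.2f}  StandardScaler: {skew_std:.2f}  "
      f"QuantileTransformer: {skew_qt:.2f}")
```

StandardScaler leaves the skewness as it was, whereas the quantile-mapped feature should come out close to symmetric.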
You'll practice on the loaded females dataset. matplotlib.pyplot is loaded under its standard alias, plt.
This exercise is part of the course
Anomaly Detection in Python
Exercise instructions
- Instantiate a QuantileTransformer() that transforms features into a normal distribution and assign it to qt.
- Fit and transform the feature array X and preserve the column names.
- Plot a histogram of the palmlength column.
Hands-on interactive exercise
Try this exercise by completing the sample code.
from sklearn.preprocessing import QuantileTransformer
# Instantiate an instance that casts to normal
qt = ____
# Fit and transform the feature array
X.____ = ____
# Plot a histogram of palm length
plt.____(____, color='red')
plt.xlabel("Palm length")
plt.show()
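One possible way to fill in the blanks is sketched below. Because the females dataset is not included here, the sketch builds a synthetic X; the footlength column and all of the generated values are hypothetical. Assigning through X.loc[:, :] is one way to keep the column names, not necessarily the only accepted answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(1)
# Synthetic stand-in for the females feature array (assumption)
X = pd.DataFrame({
    "palmlength": rng.gamma(shape=2.0, scale=30.0, size=1500),
    "footlength": rng.gamma(shape=3.0, scale=40.0, size=1500),  # hypothetical column
})

# Instantiate a transformer that maps features to a normal distribution
qt = QuantileTransformer(output_distribution="normal")

# Fit and transform; writing into X.loc[:, :] keeps the column names
X.loc[:, :] = qt.fit_transform(X)

# Plot a histogram of palm length
plt.hist(X["palmlength"], color="red")
plt.xlabel("Palm length")
plt.show()
```

After the transform, the histogram should look roughly bell-shaped and centered near zero, even though the synthetic input was skewed.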