Histograms for outlier detection
A histogram can be a compelling visual for finding outliers. Outliers become apparent when an appropriate number of bins is chosen for the histogram. Recall that the square root of the number of observations can be used as a rule of thumb for setting the number of bins. Usually, the bins with the lowest heights, especially those far from the bulk of the data, will contain outliers.
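The rule of thumb above can also be checked numerically before plotting anything. The following is a minimal sketch, not the course's own code: the prices array here is synthetic, and the cutoff of two observations per bin is an arbitrary assumption chosen only for illustration.

```python
import numpy as np

# Hypothetical data (not the course dataset): 98 typical prices plus two extreme values
rng = np.random.default_rng(seed=0)
prices = np.concatenate([rng.normal(loc=50, scale=5, size=98), [120.0, 135.0]])

# Square-root rule of thumb: roughly sqrt(n) bins for n observations
n_bins = int(np.sqrt(len(prices)))

# Count the observations that fall in each bin, without drawing a plot
counts, edges = np.histogram(prices, bins=n_bins)

# Print sparsely populated, non-empty bins as candidates for closer inspection
# (the threshold of 2 is an assumption, not a general rule)
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    if 0 < count <= 2:
        print(f"[{left:.1f}, {right:.1f}): {count} observation(s)")
```

A sparse bin is only a hint. Whether its contents are really outliers still depends on the data and on the bin width.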
In this exercise, you'll plot the histogram of prices from the previous exercise. numpy and matplotlib.pyplot are available under their standard aliases.
This is a part of the course “Anomaly Detection in Python”
Exercise instructions
- Find the square root of the length of prices and store it as n_bins.
- Cast n_bins to an integer.
- Create a histogram of prices, setting the number of bins to n_bins.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Find the square root of the length of prices
n_bins = ____
# Cast to an integer
n_bins = ____(____)
plt.figure(figsize=(8, 4))
# Create a histogram
plt.____(____, ____=____, color='red')
plt.show()
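
One possible way to fill in the blanks is sketched below. It assumes that prices is a one-dimensional numeric array. Because the data from the previous exercise is not shown here, a synthetic stand-in is generated instead, so the resulting plot will differ from the one in the course.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the prices array from the previous exercise
rng = np.random.default_rng(seed=1)
prices = np.concatenate([rng.normal(loc=50, scale=5, size=398), [120.0, 135.0]])

# Find the square root of the length of prices
n_bins = np.sqrt(len(prices))

# Cast to an integer (plt.hist expects an integer bin count)
n_bins = int(n_bins)

plt.figure(figsize=(8, 4))

# Create a histogram; plt.hist also returns the bin counts and bin edges
counts, edges, _ = plt.hist(prices, bins=n_bins, color='red')
plt.show()
```

In this synthetic data the two extreme values sit in short, isolated bars far to the right of the main cluster, which is the pattern the exercise asks you to look for.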