Build a histogram (2): bins
In the previous exercise, you didn't specify the number of bins. By default, Matplotlib sets the number of bins to 10 in that case. The number of bins is pretty important. Too few bins oversimplify reality and hide the details. Too many bins overcomplicate reality and obscure the bigger picture.
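To make the trade-off concrete, here is a minimal pure-Python sketch of equal-width binning, which is roughly what a histogram does before drawing its bars. The helper count_per_bin and the sample values are made up for illustration; they are not part of the exercise or of Matplotlib, and the sketch assumes the values are not all identical.

```python
def count_per_bin(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up life expectancies, for illustration only
sample = [43.8, 50.7, 56.7, 59.7, 64.1, 70.3, 72.4, 74.9, 78.6, 81.2]

few = count_per_bin(sample, 2)    # two wide bins: very coarse summary
many = count_per_bin(sample, 10)  # ten narrow bins: mostly 0s and 1s

print(few)
print(many)
```

With only two bins, almost all structure disappears; with as many bins as data points, most bins hold a single value or none, so no overall shape emerges either.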
To control the number of bins to divide your data into, you can set the bins argument.
That's exactly what you'll do in this exercise. You'll be making two plots here. The code in the script already includes plt.show() and plt.clf() calls; plt.show() displays a plot; plt.clf() cleans it up again so you can start afresh.
As before, life_exp is available and matplotlib.pyplot is imported as plt.
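As a rough sketch of how the bins argument and the plt.show() / plt.clf() pair fit together, the snippet below plots the same data twice with different bin counts. The list fake_life_exp is a hypothetical stand-in generated from random numbers, not the course's life_exp data, and the bin counts (3 and 15) are arbitrary choices that differ from the ones the exercise asks for. The Agg backend line is an assumption that lets the sketch run without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to open plot windows
import matplotlib.pyplot as plt

random.seed(0)
# Hypothetical stand-in for life_exp: 150 made-up values around 65 years
fake_life_exp = [random.gauss(65, 10) for _ in range(150)]

# Coarse view: 3 bins. plt.hist returns the counts, the bin edges,
# and the drawn bar objects.
counts_coarse, edges_coarse, _ = plt.hist(fake_life_exp, bins=3)
plt.show()  # display the plot
plt.clf()   # clear the figure so the next histogram starts afresh

# Finer view: 15 bins over the same data
counts_fine, edges_fine, _ = plt.hist(fake_life_exp, bins=15)
plt.show()
plt.clf()
```

A histogram with n bins has n + 1 bin edges, so edges_coarse holds 4 values and edges_fine holds 16.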
This exercise is part of the course
Intermediate Python for Data Science
Exercise instructions
- Build a histogram of life_exp, with 5 bins. Can you tell which bin contains the most observations?
- Build another histogram of life_exp, this time with 20 bins. Is this better?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build histogram with 5 bins
# Show and clean up plot
plt.show()
plt.clf()
# Build histogram with 20 bins
# Show and clean up again
plt.show()
plt.clf()