Adjusting the number of bins in a histogram
The histogram you just made had ten bins. This is matplotlib's default. The "square root rule" is a commonly used rule of thumb for choosing the number of bins: set the number of bins to the square root of the number of samples. Plot the histogram of Iris versicolor petal lengths again, this time using the square root rule for the number of bins. You specify the number of bins using the bins keyword argument of plt.hist().
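To get a feel for what the square root rule produces, here is a minimal sketch that applies it to a few sample sizes. The helper name sqrt_rule_bins is hypothetical and used only for illustration; it is not part of matplotlib or NumPy.

```python
import numpy as np


def sqrt_rule_bins(n_samples):
    """Return the bin count under the square root rule, truncated to an int."""
    # np.sqrt returns a float, so truncate it to get a whole number of bins
    return int(np.sqrt(n_samples))


# Try the rule on a few illustrative sample sizes
for n in (50, 100, 150):
    print(n, "samples ->", sqrt_rule_bins(n), "bins")
```

Note that int() truncates rather than rounds, so a square root of about 7.07 becomes 7 bins.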
The plotting utilities are already imported and the seaborn defaults already set. The variable versicolor_petal_length contains an array of petal lengths and is already in your namespace.
This is a part of the course “Statistical Thinking in Python (Part 1)”
Exercise instructions
- Import numpy as np. This gives access to the square root function, np.sqrt().
- Determine how many data points you have using len().
- Compute the number of bins using the square root rule.
- Convert the number of bins to an integer using the built-in int() function.
- Generate the histogram and make sure to use the bins keyword argument.
- Hit submit to plot the figure and see the fruit of your labors!
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import numpy
# Compute number of data points: n_data
# Number of bins is the square root of number of data points: n_bins
# Convert number of bins to integer: n_bins
# Plot the histogram
# Label axes
_ = plt.xlabel('petal length (cm)')
_ = plt.ylabel('count')
# Show histogram
plt.show()
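One way the completed steps might look is sketched below. Because the exercise environment supplies versicolor_petal_length for you, this sketch substitutes 50 synthetic values drawn from a normal distribution so that it runs on its own; the mean, spread, and seed are arbitrary placeholders, not the real measurements. It also imports matplotlib explicitly, which the exercise environment has already done.

```python
import numpy as np
import matplotlib.pyplot as plt

# ASSUMPTION: stand-in data so the sketch is self-contained.
# In the exercise, versicolor_petal_length is already in your namespace.
rng = np.random.default_rng(42)
versicolor_petal_length = rng.normal(loc=4.3, scale=0.5, size=50)

# Compute number of data points: n_data
n_data = len(versicolor_petal_length)

# Number of bins is the square root of number of data points: n_bins
n_bins = np.sqrt(n_data)

# Convert number of bins to integer: n_bins
n_bins = int(n_bins)

# Plot the histogram, passing the bin count via the bins keyword argument
counts, bin_edges, _ = plt.hist(versicolor_petal_length, bins=n_bins)

# Label axes
_ = plt.xlabel('petal length (cm)')
_ = plt.ylabel('count')

# Show histogram
plt.show()
```

With 50 data points the rule gives 7 bins rather than the default 10. plt.hist() returns the per-bin counts and the bin edges, which are captured here only to make the result easy to inspect.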
This exercise is part of the course “Statistical Thinking in Python (Part 1)”
Build the foundation you need to think statistically and to speak the language of your data.
Before diving into sophisticated statistical inference techniques, you should first explore your data by plotting them and computing simple summary statistics. This process, called exploratory data analysis, is a crucial first step in statistical analysis of data.