Are these findings generalizable?
Let's look at another sample to see if it is representative of the population. This time, you'll look at the duration_minutes
column of the Spotify dataset, which contains the length of the song in minutes.
spotify_population
and spotify_mysterious_sample2
are available; pandas
, matplotlib.pyplot
, and numpy
are loaded using their standard aliases.
This exercise is part of the course
Sampling in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Visualize the distribution of duration_minutes as a histogram
spotify_population['duration_minutes'].hist(____)
plt.show()