Are findings from the sample generalizable?
You just saw how convenience sampling (collecting data using the easiest method) can result in samples that aren't representative of the population. When that happens, findings from the sample are not generalizable to the population. Visualizing the distributions of the population and the sample can help determine whether the sample is representative of the population.
The Spotify dataset contains an acousticness column, which is a confidence measure from zero to one of whether the track was made with instruments that aren't plugged in. You'll compare the acousticness distribution of the total population of songs with a sample of those songs.
spotify_population and spotify_mysterious_sample are available; pandas as pd, matplotlib.pyplot as plt, and numpy as np are loaded.
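To make the comparison concrete, here is a minimal sketch of plotting a population and two samples with the same histogram bins. It uses synthetic data, not the real Spotify values: the names toy_population, toy_biased_sample, and toy_random_sample are hypothetical stand-ins, and the uniform distribution and the 0.9 cutoff are assumptions chosen only to make the bias easy to see.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the exercise data (assumption: NOT the real Spotify values).
rng = np.random.default_rng(seed=42)
toy_population = pd.DataFrame({"acousticness": rng.uniform(0, 1, size=10_000)})

# A deliberately biased "convenience" sample: only highly acoustic tracks.
toy_biased_sample = toy_population[toy_population["acousticness"] >= 0.9]

# A simple random sample of the same size, for contrast.
toy_random_sample = toy_population.sample(n=len(toy_biased_sample), random_state=42)

# Shared bin edges so the three histograms are directly comparable.
bins = np.linspace(0, 1, 101)

datasets = {
    "Population": toy_population,
    "Biased sample": toy_biased_sample,
    "Random sample": toy_random_sample,
}

# One panel per dataset, stacked vertically with a common x-axis.
fig, axes = plt.subplots(len(datasets), 1, sharex=True, figsize=(6, 8))
for ax, (title, df) in zip(axes, datasets.items()):
    ax.hist(df["acousticness"], bins=bins)
    ax.set_title(title)
axes[-1].set_xlabel("acousticness")
fig.tight_layout()
```

Calling plt.show() afterwards renders the figure. In this toy setup the biased sample occupies only the right-hand edge of the axis, while the random sample roughly follows the shape of the population. The same pattern should carry over to spotify_population and spotify_mysterious_sample, with the toy frames swapped out.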
This exercise is part of the course Sampling in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Visualize the distribution of acousticness with a histogram
____
plt.show()