Simple sampling with dplyr
Throughout this chapter you'll be exploring song data from Spotify. Each row of the dataset represents a song, and there are 41656 rows. Columns include the name of the song, the artists who performed it, the release year, and attributes of the song like its duration, tempo, and danceability. We'll start by looking at the durations.
Your first task is to take a sample from the song dataset so that you can compare the same calculation on the whole population and on the sample.
spotify_population is available and dplyr is loaded.
This exercise is part of the course Sampling in R.
Have a go at this exercise by completing this sample code.
# View the whole population dataset
___
# Sample 1000 rows from spotify_population
spotify_sample <- ___
# See the result
spotify_sample
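For readers working outside the course environment, here is a minimal sketch of the same idea on made-up data. It assumes dplyr 1.0.0 or later, where slice_sample() is available. The names toy_population, toy_sample, song_id, and duration_minutes are hypothetical stand-ins, not the real Spotify columns, and the durations are random numbers rather than real song data.

```r
library(dplyr)

# Hypothetical stand-in for spotify_population: same row count, fake durations.
set.seed(2024)
toy_population <- tibble(
  song_id = 1:41656,
  duration_minutes = runif(41656, min = 1, max = 8)
)

# Simple random sample of 1000 rows, drawn without replacement (the default).
toy_sample <- toy_population %>%
  slice_sample(n = 1000)

# The same calculation on the whole population and on the sample.
mean_dur_pop <- toy_population %>%
  summarize(mean_dur = mean(duration_minutes))
mean_dur_samp <- toy_sample %>%
  summarize(mean_dur = mean(duration_minutes))

mean_dur_pop
mean_dur_samp
```

The two means should be close but not identical, since the sample contains only a fraction of the rows. Calling set.seed() beforehand makes the draw reproducible.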