Exercise

MSD summary statistics

Let's get familiar with the Echo Nest Taste Profile subset of the Million Songs data. For the purposes of this course, we'll just call it the Million Songs dataset, or msd. We'll count the number of users and the number of songs, and also see which songs have the most plays in this subset.

Instructions
  • Use the .show() method to see what the data looks like.
  • Complete the code to count the number of distinct userIds. Select the userId column, then call .distinct() and .count().
  • Now do the same for the songs: select the songId column, then call .distinct() and .count() on it to count the number of distinct songIds.