R, Yelp and the Search for Good Indian Food

Exercise

Finding number of reviews per user

You now have a manageable data set with just one type of cuisine. It's time to begin adapting the Yelp star reviews to see if you can make them more meaningful. In this course, you will look into just two of the many ways one can scale and manipulate reviews. The first method is to create a new rating that gives more weight to reviewers who have reviewed more restaurants of the same cuisine.

To do this, start by creating a new data frame that holds the number of reviews each reviewer has written for the Indian restaurants in the original data set.

The new data frame will be created using the select(), group_by() and summarise() functions of the dplyr package, chained together with the %>% pipe operator. The select() function determines the columns that will be included in the new data frame. Used together, group_by() and summarise() compute a separate summary for each unique value of the grouping variable.
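As a rough illustration of how these pieces fit together, here is a minimal sketch on a made-up data frame. The name toy_reviews and its values are hypothetical and are not part of the course data; only the column names user_id and user_name come from the exercise text.

```r
library(dplyr)

# Hypothetical toy data standing in for the Indian-restaurant reviews;
# each row represents one review written by one user.
toy_reviews <- data.frame(
  user_id   = c("u1", "u1", "u2", "u3", "u3", "u3"),
  user_name = c("Ana", "Ana", "Ben", "Cy", "Cy", "Cy"),
  stars     = c(4, 5, 3, 2, 4, 5),
  stringsAsFactors = FALSE
)

toy_counts <- toy_reviews %>%
  select(user_id, user_name) %>%   # keep only the reviewer columns
  group_by(user_id) %>%            # form one group per reviewer
  summarise(total_reviews = n())   # count the rows (reviews) in each group

print(toy_counts)
```

Because the grouping here is on user_id alone, the summarised result keeps only user_id and the new total_reviews column.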

After making the data frame, explore it! Check out the range in numbers of reviews and also the average number of reviews per user.
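One possible way to explore such a summary is sketched below using base R functions. The toy_counts data frame and its values are invented for illustration; in the exercise you would apply the same calls to the total_reviews column of your own data frame.

```r
# Hypothetical per-user review counts, shaped like the summary described above.
toy_counts <- data.frame(
  user_id       = c("u1", "u2", "u3", "u4"),
  total_reviews = c(2L, 1L, 3L, 1L)
)

# How many users wrote 1 review, 2 reviews, 3 reviews, and so on.
review_table <- table(toy_counts$total_reviews)
print(review_table)

# Smallest and largest number of reviews written by any single user.
review_range <- range(toy_counts$total_reviews)
print(review_range)

# Average number of reviews per user.
avg_reviews <- mean(toy_counts$total_reviews)
print(avg_reviews)
```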

Instructions

  • Create a new data frame, number_reviews_indian, by selecting the columns user_id and user_name, grouping by user_id, and using summarise() with n() to create a total_reviews column
  • Print the table of total_reviews
  • Show the average number of reviews per user by taking the mean of total_reviews