Wide Versus Long Data

In addition to tidy data, we can distinguish long data from wide data. A dataset is called long when it has many more rows than columns, and wide when it has more columns than rows. You have seen how to transform a wide dataset (dem_score) into a long one with gather() and how to transform it into a different wide format with spread().
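
As a reminder of how the two verbs relate, here is a minimal sketch on a made-up data frame. The object names (wide, long, wide_again) and the column names (country, year, score) are assumptions for illustration and are not taken from dem_score.

```r
library(tidyr)

# Hypothetical wide data: one row per country, one column per year
wide <- data.frame(
  country = c("A", "B"),
  `1990` = c(1, 2),
  `2000` = c(3, 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Wide to long: every year column becomes a (year, score) pair
long <- gather(wide, key = "year", value = "score", -country)

# Long back to wide: one column per year again
wide_again <- spread(long, key = year, value = score)
```

Here long has one row per country-year combination, while wide_again has one row per country. Newer versions of tidyr also offer pivot_longer() and pivot_wider() for the same reshaping.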

In general, I tend to work with long data because this format makes it easier to aggregate the data for plots when I have a lot of covariates. Let's look at what's possible when the data is in a long format.

Let's practice with another dataset in long format, called fertilityTidy. You can look at the original data as fertilityData. We'll summarize it in two different ways.

This exercise is part of the course

RBootcamp

Exercise instructions

  • Look at fertilityTidy. Using dplyr verbs, show the average fertility for each country across all years up to the present day, calling this variable meanCountryRate.
  • Assign the summarized data to fertilityMeanByCountry.
  • Show fertilityMeanByCountry.
  • Next, show the average fertility by Year using group_by() and summarize(), assigning the summarized data to fertilityMeanByYear.
  • Show fertilityMeanByYear.
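
The group-then-summarize pattern these steps rely on can be sketched on a small made-up data frame. The names toy, rate, meanRate, meanByCountry, and meanByYear are hypothetical, and the columns of fertilityTidy may be named differently.

```r
library(dplyr)

# Hypothetical long data: one row per country-year observation
toy <- data.frame(
  Country = c("A", "A", "B", "B"),
  Year = c(2000, 2001, 2000, 2001),
  rate = c(2, 4, 1, 3)
)

# One mean per country, averaged over all of its years
meanByCountry <- toy %>%
  group_by(Country) %>%
  summarize(meanRate = mean(rate))

# One mean per year, averaged over all countries
meanByYear <- toy %>%
  group_by(Year) %>%
  summarize(meanRate = mean(rate))
```

Because the data is long, switching from a per-country summary to a per-year summary only requires changing the grouping variable. If the real data contains missing values, mean(rate, na.rm = TRUE) may be needed.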

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

fertilityMeanByCountry <- fertilityTidy %>%

#show fertilityMeanByCountry
fertilityMeanByCountry

fertilityMeanByYear <- fertilityTidy %>%

#show fertilityMeanByYear
fertilityMeanByYear