Work that Excel data!
Now that you can read in Excel data, let's try to clean and merge it. You already used the cbind() function some exercises ago. Let's take it one step further now.
The urbanpop.xls dataset is available in your working directory. The file still contains three sheets, and has column names in the first row of each sheet.
This exercise is part of the course
Importing Data in R (Part 1)
Exercise instructions
- Add code to read the data from the third sheet in
"urbanpop.xls". You want to end up with three data frames:urban_sheet1,urban_sheet2andurban_sheet3. - Extend the
cbind()call so that it also includesurban_sheet3. Make sure the first column ofurban_sheet2andurban_sheet3are removed, so you don't have duplicate columns. Store the result inurban. - Use
na.omit()on theurbandata frame to remove all rows that containNAvalues. Store the cleaned data frame asurban_clean. - Print a summary of
urban_cleanand assert that there are no moreNAvalues.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Add code to import data from all three sheets in urbanpop.xls
path <- "urbanpop.xls"
urban_sheet1 <- read.xls(path, sheet = 1, stringsAsFactors = FALSE)
urban_sheet2 <- read.xls(path, sheet = 2, stringsAsFactors = FALSE)
# Extend the cbind() call to include urban_sheet3: urban
urban <- cbind(urban_sheet1, urban_sheet2[-1])
# Remove all rows with NAs from urban: urban_clean
# Print out a summary of urban_clean