readxl (1)

1. readxl (1)

Another tool that is extremely common in data analysis is

2. Microsoft Excel

Microsoft Excel. It shouldn't be a surprise that there are a lot of packages out there that interact with Excel so that you can work with the data inside R. In this chapter, we'll cover everything you need to know to get started with Excel files in R with practically no extra work. In this video, I'll be talking about Hadley Wickham's readxl package.

3. Typical Structure Excel Data

Before we dive into the R side of things, it's a good idea to quickly recap what an Excel file typically looks like. For most data-related tasks, an Excel file contains different sheets that each hold tabular data. Take this Excel file for example, cities-dot-xlsx, which has two sheets holding the total population of some capitals for two different years. When you're working in R, you'll typically want to explore your Excel file first and then import some data from it. But how do you go about this?

4. readxl

This is where the readxl package comes in. It basically contains two main functions: excel_sheets and read_excel. The first one is used to list the different sheets in your Excel file, while the second one is used to actually import a sheet into your R session. readxl is able to handle both xls and xlsx files. So, let's try it out! Let's first install and load the readxl package and then

5. excel_sheets()

start with the excel_sheets function. You simply pass it the file path, which is the location of the Excel file on your own system. The file is already in our working directory, as dir reveals, so we can simply pass the file name, cities-dot-xlsx. The result is a simple character vector that contains the names of the different sheets. Indeed, you saw before that the two sheets in the Excel file are named year_1990 and year_2000. Great. We know the sheet names now, but that's just the names, not the actual population data.
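As a rough sketch of the steps just described, assuming cities.xlsx sits in the working directory as it does in the video (the variable name sheet_names is only for illustration):

```r
# Install once if needed; commented out so the script can be re-run safely
# install.packages("readxl")
library(readxl)

# List the files in the working directory; cities.xlsx is assumed to be there
dir()

# List the sheet names in the workbook as a character vector
sheet_names <- excel_sheets("cities.xlsx")
sheet_names
```

For the file shown in the video, this should list the two sheets, year_1990 and year_2000.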

6. read_excel()

Fortunately, readxl also features the read_excel function to actually import the sheet data into your R session. In its most basic use, you simply specify the path to the Excel file again. By default, the first sheet, year_1990, is imported as a tibble. You can also explicitly tell read_excel which sheet to import by setting the sheet argument. You can use either an index or the sheet name. Suppose you want to load in the second sheet, named year_2000. You can do that by passing either 2 or the name year_2000 as the sheet argument. Each of these read_excel calls returns a tibble, which is a kind of R data frame, containing the Excel data. You can start your analyses right away!
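The calls described here might look something like the sketch below. It again assumes cities.xlsx is in the working directory with the two sheets from the video; the variable names are made up for illustration.

```r
library(readxl)

# Default: imports the first sheet (year_1990 in the video's file)
pop_1990 <- read_excel("cities.xlsx")

# Two ways to import the second sheet: by position or by name
pop_2000_by_index <- read_excel("cities.xlsx", sheet = 2)
pop_2000_by_name <- read_excel("cities.xlsx", sheet = "year_2000")

# Check whether both calls returned the same data
identical(pop_2000_by_index, pop_2000_by_name)

# Inspect the class of the imported object (a tibble is also a data frame)
class(pop_1990)
```

Selecting a sheet by name tends to be more robust than selecting by index, since the call keeps working if someone reorders the sheets in the workbook.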

7. Let's practice!

Give it a first try in the exercises. In the next video, I'll dive a little deeper into the read_excel function!