Preparing Data For Analysis: Import
Let's get a little more practice with importing and preparing data for analysis. Go ahead and declare the file paths, and use the rxImport() command to import the airline data to an xdf file.
This exercise is part of the course
Big Data Analysis with Revolution R Enterprise
Exercise instructions
The first step in this problem is to declare the file paths that point to where the data is being stored on the server.
The file.path() function constructs a path to a file. In this problem, the path must start from the big data directory, sampleDataDir, where the files are stored.
We can get the sampleDataDir by using the rxGetOption() function.
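As a rough illustration of how file.path() assembles a path, the sketch below uses a made-up directory in place of the value returned by rxGetOption(), which is only available where RevoScaleR is installed. The directory "/data/sample" and the variable names are assumptions for illustration, not part of the exercise.

```r
# Hypothetical directory standing in for the value returned by rxGetOption()
exampleDir <- "/data/sample"

# file.path() joins its arguments with "/" as the separator
exampleCsv <- file.path(exampleDir, "AirlineDemoSmall.csv")
print(exampleCsv)

# basename() and dirname() split the path back into its two parts
exampleName <- basename(exampleCsv)
exampleParent <- dirname(exampleCsv)
print(exampleName)
print(exampleParent)
```

Building the path with file.path() rather than paste() avoids having to manage the separator by hand.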
Once we know where the files are, we can look in that directory to examine what files exist with list.files().
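To show what list.files() returns, here is a small sketch that builds a temporary directory with two empty files and then lists its contents. The directory and the file "notes.txt" are invented for the example; on the course server you would pass the sample data directory instead.

```r
# Create a temporary directory with two empty files (names are illustrative only)
demoDir <- file.path(tempdir(), "listFilesDemo")
dir.create(demoDir, showWarnings = FALSE)
file.create(file.path(demoDir, c("AirlineDemoSmall.csv", "notes.txt")))

# List everything in the directory
allFiles <- list.files(demoDir)
print(allFiles)

# Restrict the listing to csv files with a regular-expression pattern
csvFiles <- list.files(demoDir, pattern = "\\.csv$")
print(csvFiles)
```

The pattern argument is handy when a data directory holds many files and only one file type is of interest.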
Once we have found the name of the file (AirlineDemoSmall.csv), we can import it using rxImport(). In this case, we will also use the argument colInfo in order to specify that the character variable DayOfWeek should be interpreted as a factor, and that its levels should have the same order as the days of the week (Monday - Sunday).
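The effect of specifying the level order can be sketched in base R with factor(). This is only an analogy for what colInfo does at import time, and the short vector of day names is made up for the example.

```r
# Desired level order, matching the days of the week
dayLevels <- c("Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday", "Sunday")

# A few made-up character values, as they might appear in a csv column
days <- c("Sunday", "Monday", "Friday", "Monday")

# Without explicit levels, factor() sorts the observed values alphabetically
defaultFactor <- factor(days)
print(levels(defaultFactor))

# With explicit levels, the order follows the week and unused days are kept
orderedFactor <- factor(days, levels = dayLevels)
print(levels(orderedFactor))
print(table(orderedFactor))
```

With the levels given explicitly, summaries and plots list the days in calendar order rather than alphabetical order.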
Now import the data. You can use help() or args() to remind yourself of the arguments to the rxImport() function.
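As a quick reminder of how these lookup functions behave, the sketch below applies them to a base R function, so it runs anywhere. The final call to rxImport is guarded, on the assumption that RevoScaleR may not be installed on the machine running the example.

```r
# args() displays a function's argument list; shown here on a base R function
print(args(file.path))

# formals() returns the same arguments as a named list
argNames <- names(formals(file.path))
print(argNames)

# The same approach applies to rxImport when RevoScaleR is available
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  print(args(RevoScaleR::rxImport))
}
```

Calling help(rxImport) or ?rxImport opens the full documentation page, which describes each argument in more detail than args() shows.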
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Declare the file paths for the csv and xdf files
myAirlineCsv <- file.path(rxGetOption("___"), "AirlineDemoSmall.csv")
myAirlineXdf <- "ADS.xdf"
# Use rxImport to import the data into xdf format
rxImport(inData = ___,
         ___ = myAirlineXdf,
         overwrite = TRUE,
         colInfo = list(
           DayOfWeek = list(
             type = "factor",
             levels = c("Monday", "Tuesday", "Wednesday",
                        "Thursday", "Friday", "Saturday", "Sunday")
           )
         )
)