1. Importing text files
Welcome to chapter 5! You've learned how to find and download data from the Internet, extract and transform the data you load, configure your environment to manage multiple data sources, and align series with different periodicities.
2. getSymbols() with CSV files
In this video, you will learn how to import data from plain text files. At some point, you will probably need to import data that aren't available via any of the functions you've learned so far. Most data providers and software vendors provide a way to export data to CSV. In fact, that's how Yahoo Finance and Google Finance provide data.
You can use getSymbols() to import CSV data, if the data are suitably formatted. Each CSV file can only contain one instrument, and must include certain columns in a specific order: the date first, followed by open, high, low, close, volume, and adjusted close. Each file name must be the name of the instrument, with a .csv extension. If the data files aren't in your working directory, you can use the dir argument to specify the directory that contains them.
3. getSymbols() with CSV files
In this example, there's some Amazon data in the AMZN.csv file. You can see that the file has the necessary columns in the necessary order. You'll also notice that it looks a lot like the data getSymbols() returns from Yahoo Finance and Google Finance. Importing the data is as easy as calling getSymbols() with the AMZN symbol and src set to "csv".
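As a rough sketch of what that call can look like, the code below writes a tiny AMZN.csv to a temporary directory and then imports it. The prices are made-up illustrative values, not real Amazon data, and the temporary directory and the data_dir variable are assumptions added so the example can run on its own.

```r
library(quantmod)

# Assumption: a temporary directory stands in for wherever your CSV files live
data_dir <- tempdir()

# Made-up rows in the expected column order:
# Date, Open, High, Low, Close, Volume, Adjusted
writeLines(c(
  "Date,AMZN.Open,AMZN.High,AMZN.Low,AMZN.Close,AMZN.Volume,AMZN.Adjusted",
  "2016-01-04,10.0,10.5,9.8,10.2,1000,10.2",
  "2016-01-05,10.2,10.6,10.1,10.4,1200,10.4",
  "2016-01-06,10.4,10.7,10.0,10.1,900,10.1"
), file.path(data_dir, "AMZN.csv"))

# The symbol name must match the file name (AMZN.csv);
# dir points getSymbols() at the directory that holds the file
getSymbols("AMZN", src = "csv", dir = data_dir)

# getSymbols() creates an object named AMZN
head(AMZN)
```

If the file were already in your working directory, you could leave out the dir argument.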
4. read.zoo()
Don't worry if your text file isn't in this specific format. The read.zoo() function in the zoo package provides a way to import text files directly into a zoo object. You can use read.zoo() to import the same Amazon CSV file. read.zoo() is a wrapper around read.table(), so you need to specify the sep argument as a comma, and set header equal to TRUE. If you want an xts object instead of a zoo object, you can simply call as.xts() on the object returned by read.zoo().
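Here is a minimal sketch of that workflow. It again writes a small file with made-up values to a temporary location (the csv_file path is an assumption for illustration), reads it with read.zoo(), and converts the result to xts.

```r
library(zoo)
library(xts)

# Assumption: a temporary file with made-up values stands in for the real CSV
csv_file <- file.path(tempdir(), "AMZN.csv")
writeLines(c(
  "Date,AMZN.Open,AMZN.High,AMZN.Low,AMZN.Close,AMZN.Volume,AMZN.Adjusted",
  "2016-01-04,10.0,10.5,9.8,10.2,1000,10.2",
  "2016-01-05,10.2,10.6,10.1,10.4,1200,10.4",
  "2016-01-06,10.4,10.7,10.0,10.1,900,10.1"
), csv_file)

# sep and header are passed through to read.table();
# by default the first column is used as the index
amzn_zoo <- read.zoo(csv_file, sep = ",", header = TRUE)

# Convert the zoo object to xts
amzn_xts <- as.xts(amzn_zoo)

head(amzn_xts)
```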
5. Date and time in separate columns
One of the problems with data in text files is the variety of formats the data may be in. Let's look at examples of importing data from a couple of common formats. The first is when the date and the time are in separate columns. You can handle this with read.zoo() by passing a vector of column names to the index.column argument. read.zoo() will automatically combine those two columns to create a POSIXct index.
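A small sketch of the separate date and time case follows. The file name FOO.csv, its column names, and its values are all hypothetical. Setting tz = "UTC" is an extra choice made here so the result doesn't depend on your local time zone; it isn't something the file format requires.

```r
library(zoo)

# Hypothetical intraday file with the date and time in separate columns
intraday_file <- file.path(tempdir(), "FOO.csv")
writeLines(c(
  "Date,Time,Open,High,Low,Close",
  "2016-11-08,09:05:00,80.90,81.00,80.87,81.00",
  "2016-11-08,09:10:00,80.92,80.93,80.89,80.89",
  "2016-11-08,09:15:00,80.93,80.94,80.92,80.93"
), intraday_file)

# index.column names the two columns that together form the timestamp;
# tz = "UTC" is an assumption made for reproducibility
foo_zoo <- read.zoo(intraday_file, sep = ",", header = TRUE,
                    index.column = c("Date", "Time"), tz = "UTC")

# Inspect the index class and the imported data
class(index(foo_zoo))
foo_zoo
```

The Date and Time columns are consumed by the index, so only the four price columns remain as data.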
6. File contains multiple instruments
You may also encounter data in long format, with multiple instruments in a single file. In this example, there is a column to indicate which symbol the price is for, and another column to indicate the type of price. You can also handle this file format with read.zoo(), by using the split argument to provide the column names that identify the variables for the wide format.
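To make the long format concrete, here is a sketch using a hypothetical BAR.csv with made-up bid and ask prices for two symbols, A and B. The file name, column names, and values are assumptions for illustration, and tz = "UTC" is again an added choice for reproducibility.

```r
library(zoo)

# Hypothetical long-format file: one row per timestamp, symbol, and price type
long_file <- file.path(tempdir(), "BAR.csv")
writeLines(c(
  "Date,Symbol,Type,Price",
  "2016-01-01 10:43:01,A,Bid,58.23",
  "2016-01-01 10:43:01,A,Ask,58.24",
  "2016-01-01 10:43:01,B,Bid,28.96",
  "2016-01-01 10:43:01,B,Ask,28.98",
  "2016-01-01 10:43:02,A,Bid,58.23",
  "2016-01-01 10:43:02,A,Ask,58.25",
  "2016-01-01 10:43:02,B,Bid,28.95",
  "2016-01-01 10:43:02,B,Ask,28.97"
), long_file)

# split names the columns whose combinations become separate series,
# turning the long data into one column per symbol and price type
bar_zoo <- read.zoo(long_file, sep = ",", header = TRUE,
                    split = c("Symbol", "Type"), tz = "UTC")

bar_zoo
```

Each combination of symbol and price type ends up in its own column, with one row per timestamp.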
7. Let's practice!
Now that you've seen some examples, it's time to practice!