Extracting columns from financial time series

1. Extracting columns from financial time series

Welcome back! In the last chapter, you learned the basics of importing data from Internet sources using getSymbols(), and how to find symbols for data series. Now that you've learned how to get data into your R session, let's look at some functions that make it easier to manipulate financial market time-series data.

2. OHLC

Tick data are large and often expensive to obtain, so financial market data are often provided in an aggregated format called OHLC, which stands for open, high, low, and close. The open and close are the first and last observed prices for an interval. The high and low are the largest and smallest observed prices during the interval. There is also often a volume column, which contains the sum of all contracts or shares traded in the interval.
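To make these definitions concrete, here is a minimal sketch in base R that collapses a handful of trades into a single OHLC bar with volume. The prices and volumes are made-up values for illustration, not real market data.

```r
# Hypothetical trades within one interval, in time order (illustrative values)
prices  <- c(20.10, 20.25, 20.05, 20.30, 20.20)
volumes <- c(100, 250, 50, 400, 200)

# One OHLC bar plus volume, following the definitions above
bar <- c(
  Open   = prices[1],               # first observed price
  High   = max(prices),             # largest observed price
  Low    = min(prices),             # smallest observed price
  Close  = prices[length(prices)],  # last observed price
  Volume = sum(volumes)             # total shares traded in the interval
)
print(bar)
```

Five trades become five numbers here. The same compression applied to a full trading day is what makes OHLC data so much smaller than the underlying ticks.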

3. OHLC data

Here's an example of some OHLC data in an object named "DC". This object contains simulated DataCamp stock prices created by randomizing some real financial market data and aggregating it to daily intervals. It is similar to objects created by getSymbols(): it's an xts object with open, high, low, close, and volume columns.

4. Single-column extractor functions

The quantmod package provides several functions that make it easy to extract a single column of data from objects containing OHLC and volume data. Their names are easy to remember, because they're simply the first two letters of the column you want to extract. For example, Cl() for close, or Vo() for volume. There's also an Ad() function for extracting the adjusted close provided by Yahoo Finance.
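As a rough sketch of how this looks in practice, the code below builds a small stand-in for the "DC" object and pulls out the close and volume columns. The three rows of prices are invented for illustration; only the column naming follows what getSymbols() produces.

```r
library(quantmod)

# Hypothetical three-day OHLC and volume data (illustrative values only)
DC <- xts(
  cbind(
    DC.Open   = c(20.85, 20.91, 20.80),
    DC.High   = c(20.95, 20.95, 20.90),
    DC.Low    = c(20.80, 20.75, 20.75),
    DC.Close  = c(20.90, 20.85, 20.85),
    DC.Volume = c(157, 214, 103)
  ),
  order.by = as.Date("2016-01-04") + 0:2
)

dc_close  <- Cl(DC)  # close column only
dc_volume <- Vo(DC)  # volume column only

colnames(dc_close)   # "DC.Close"
colnames(dc_volume)  # "DC.Volume"
```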

5. Single-column extractor functions

Let's look at a couple of examples. You can see that the Op() function returns an object that contains only the open column, and the Hi() function returns only the high column. And you did not have to worry about the name of the symbol in the column name. That is because these functions work by using regular expressions on the column names. But that also means they might return more than one column if multiple columns contain the matching word.
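Here is a small sketch of both behaviors, assuming two made-up symbols, AAA and BBB, with invented prices. On a single-symbol object each extractor returns one column. Once the two objects are merged, the word "Open" appears in two column names, so Op() returns both.

```r
library(quantmod)

idx <- as.Date("2016-01-04") + 0:1

# Hypothetical OHLC data for two made-up symbols (illustrative values only)
AAA <- xts(cbind(AAA.Open = c(10.0, 10.2), AAA.High  = c(10.5, 10.6),
                 AAA.Low  = c(9.9, 10.1),  AAA.Close = c(10.2, 10.4)),
           order.by = idx)
BBB <- xts(cbind(BBB.Open = c(50.0, 50.5), BBB.High  = c(51.0, 51.2),
                 BBB.Low  = c(49.5, 50.1), BBB.Close = c(50.5, 51.0)),
           order.by = idx)

aaa_open <- Op(AAA)  # matches "Open" in the column names: AAA.Open
aaa_high <- Hi(AAA)  # matches "High" in the column names: AAA.High

# After merging, two column names contain "Open", so two columns come back
both      <- merge(AAA, BBB)
both_open <- Op(both)
colnames(both_open)  # "AAA.Open" "BBB.Open"
```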

6. Multi-column extractor functions

The quantmod package also provides functions to extract multiple columns from an OHLC object. These extractor functions are helpful when you need to pass an object containing a set of columns to another function. You can use the OHLC() function to extract only open, high, low, and close columns. This is helpful if a function only needs OHLC data, and not volume.
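A short sketch of the multi-column extractors, again using an invented stand-in for "DC". OHLC() drops the volume column, and quantmod's HLC() works the same way for high, low, and close.

```r
library(quantmod)

# Hypothetical OHLC and volume data (illustrative values only)
DC <- xts(
  cbind(
    DC.Open   = c(20.85, 20.91),
    DC.High   = c(20.95, 20.95),
    DC.Low    = c(20.80, 20.75),
    DC.Close  = c(20.90, 20.85),
    DC.Volume = c(157, 214)
  ),
  order.by = as.Date("2016-01-04") + 0:1
)

dc_ohlc <- OHLC(DC)  # open, high, low, close; volume is dropped
dc_hlc  <- HLC(DC)   # high, low, close

colnames(dc_ohlc)  # "DC.Open" "DC.High" "DC.Low" "DC.Close"
```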

7. getPrice()

Now, what if you actually have raw tick data? Or what if your object contains data for multiple symbols? The quantmod getPrice() function can help you extract columns in these cases. getPrice() has three arguments: "x", the object that contains the data; "symbol", an optional symbol (in case "x" contains data for multiple symbols); and "prefer", an optional preferred price. If "prefer" is not specified, getPrice() will look for "price", "trade", and "close", in that order.
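The multi-symbol case might look like the sketch below. The object holds close and volume columns for two made-up symbols, AAA and BBB, with invented values. Because no column name contains "price" or "trade", the default call should fall through to the close column, and passing prefer = "close" asks for it explicitly.

```r
library(quantmod)

# Hypothetical close and volume data for two made-up symbols
prices <- xts(
  cbind(
    AAA.Close  = c(10.2, 10.4),
    AAA.Volume = c(1000, 1200),
    BBB.Close  = c(50.5, 51.0),
    BBB.Volume = c(300, 350)
  ),
  order.by = as.Date("2016-01-04") + 0:1
)

# Keep only BBB columns, then use the default search order
bbb_default <- getPrice(prices, symbol = "BBB")

# Same request with the preferred price stated explicitly
bbb_close <- getPrice(prices, symbol = "BBB", prefer = "close")

colnames(bbb_close)  # "BBB.Close"
```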

8. Extract other columns using getPrice()

Here you see some tick data for DataCamp stock. The price and volume columns are the price and number of shares traded. The "bid" columns are the price and number of shares at which you could sell to a market maker, while the "ask" columns are the price and number of shares at which you could buy. If you want to extract the bid price column, you can do that with getPrice() by setting prefer = "bid".
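As a sketch, the tick object might be laid out as below. The column names and values are assumptions made for illustration, since the original data is not reproduced here.

```r
library(quantmod)

# Hypothetical tick data: trades plus bid and ask quotes (illustrative values)
times <- as.POSIXct("2016-01-16 00:00:01", tz = "UTC") + 0:2
DC_ticks <- xts(
  cbind(
    DC.Price     = c(20.84, 20.85, 20.85),
    DC.Volume    = c(157, 100, 100),
    DC.Bid.Price = c(20.83, 20.84, 20.84),
    DC.Bid.Size  = c(200, 300, 300),
    DC.Ask.Price = c(20.85, 20.86, 20.86),
    DC.Ask.Size  = c(400, 250, 250)
  ),
  order.by = times
)

# Ask getPrice() for the bid instead of the default price column
dc_bid <- getPrice(DC_ticks, prefer = "bid")
colnames(dc_bid)
```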

9. Let's practice!

You've learned how to extract specific columns from different types of financial data, and now it's time to practice using these functions!