Handling instrument symbols that clash or are not valid R names

1. Handling instrument symbols that clash or are not valid R names

Welcome back! In this video, you will learn about an issue you will eventually run into: importing a dataset where the symbol is not a syntactically valid R name. What is a syntactically valid R name?

2. Syntactically valid names

Syntactically valid names can contain letters, numbers, periods (dots), and underscores. They must start with a letter, or with a period that is not followed by a number, and they cannot be reserved words such as if or TRUE. It's actually possible to create R objects with syntactically invalid names, but they have to be accessed in unconventional ways.
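As a rough illustration of these rules, the sketch below uses base R's make.names(), which is not mentioned in the video but returns a string unchanged when it is already a syntactically valid name. The candidate strings are only examples.

```r
# Candidate names: a few tickers and ordinary identifiers (illustrative only)
candidates <- c("GSPC", "^GSPC", "000001.SS", ".hidden", ".2way", "my_var", "if")

# make.names() leaves a string untouched only when it is already a valid name,
# so comparing input and output flags the non-syntactic ones
is_valid <- candidates == make.names(candidates)
names(is_valid) <- candidates
print(is_valid)

# A non-syntactic name can still be created and read, but only with backticks
`^GSPC` <- 42
print(`^GSPC`)
```

Only "GSPC", ".hidden", and "my_var" pass the check; the others start with an invalid character, start with a dot followed by a number, or are reserved words.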

3. Accessing objects with non-syntactic names (1)

Not all ticker symbols are syntactically valid names. For example, most index symbols on Yahoo Finance start with a circumflex, like the S&P 500 index. Since R objects with syntactically invalid names can be difficult to use, getSymbols() automatically removes the circumflex from the object it creates and from the object's column names.
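A minimal sketch of that behavior, assuming the quantmod package is installed, an internet connection is available, and Yahoo Finance still serves the S&P 500 index under the ticker "^GSPC":

```r
library(quantmod)

# Download the S&P 500 index; the ticker itself starts with "^"
getSymbols("^GSPC")

# The object is created as GSPC, without the leading "^"
head(GSPC, 3)

# The column names are prefixed with GSPC as well
colnames(GSPC)
```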

4. Accessing objects with non-syntactic names (2)

But getSymbols() doesn't always convert the ticker symbol and column names to syntactically valid names, as you can see for the Shanghai Stock Exchange Composite Index. You will get an error if you try to use this object as if it had a valid name.
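To see why such an object cannot be used like a normal one, the sketch below tries to parse a call that refers to it by its bare name. It assumes the Yahoo Finance ticker for the Shanghai index is "000001.SS", and it runs without downloading anything because the error occurs before R even looks for the object.

```r
# Try to evaluate head(000001.SS) as if the name were valid.
# R reads "000001." as a number and then trips over "SS",
# so the parser fails before any object lookup happens.
result <- tryCatch(
  eval(parse(text = "head(000001.SS)")),
  error = function(e) conditionMessage(e)
)

# result holds the parser's error message
print(result)
```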

5. Accessing objects with non-syntactic names (3)

If getSymbols() loads data into your workspace, and the object does not have a syntactically valid name, the two easiest ways to access the returned object are to surround the object name with backticks, or use the get() function.
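Both approaches can be sketched without a download by creating a stand-in object. The name "000001.SS" is assumed to be the Shanghai index ticker, and the data frame and its values are hypothetical placeholders for what getSymbols() would return.

```r
# Stand-in for the object getSymbols() would create (hypothetical values)
assign(
  "000001.SS",
  data.frame(`000001.SS.Close` = c(3100.5, 3120.2), check.names = FALSE)
)

# Option 1: surround the object name with backticks
via_backticks <- `000001.SS`

# Option 2: pass the name to get() as a string
via_get <- get("000001.SS")

# Both routes return the same object
print(identical(via_backticks, via_get))

# Columns with non-syntactic names need backticks too when using $
print(`000001.SS`$`000001.SS.Close`)
```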

6. Valid name for one instrument

If you know the ticker symbol is not a syntactically valid name, you can set auto.assign = FALSE in your getSymbols() call. getSymbols() then returns the data instead of creating an object for you, so you can assign it to a syntactically valid name directly. In this case, the column names will probably not be syntactically valid names either, because they are still built from the ticker symbol. This might be a problem if you need to use a function that expects the column names to be valid R names. You can use paste() to create syntactically valid column names, and then use the colnames() assignment function to assign them to the object.
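A minimal sketch of this workflow, assuming quantmod is installed, an internet connection is available, and the Shanghai index is served by Yahoo Finance as "000001.SS" with the usual six columns (Open, High, Low, Close, Volume, Adjusted). The object name sse and the "SSE" prefix are arbitrary choices.

```r
library(quantmod)

# Return the data instead of auto-assigning it, and pick a valid name ourselves
sse <- getSymbols("000001.SS", auto.assign = FALSE)

# The column names still start with the ticker, so they are not valid R names
colnames(sse)

# Build valid column names with paste() and assign them with colnames<-
new_names <- paste("SSE",
                   c("Open", "High", "Low", "Close", "Volume", "Adjusted"),
                   sep = ".")
colnames(sse) <- new_names

head(sse, 3)
```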

7. Valid name for multiple instruments

Earlier in this chapter, you used the setSymbolLookup() function to change some default argument values to getSymbols() on a symbol-by-symbol basis. You can also use setSymbolLookup() to map a data source's ticker symbol to a syntactically valid R object name. For example, you can map the Shanghai Stock Exchange Composite Index ticker symbol to "SSE". This is also useful for ticker symbols that clash with other R object names. For example, the ticker symbol for Ford Motor Company is "F", which is also the name of a built-in R object that holds the logical value FALSE. Recall that setSymbolLookup() takes symbol-argument pairs, where the argument portion can be a list of arguments. To map a ticker symbol to a new name, the "symbol" portion should be the syntactically valid R object name. Then the "arguments" portion should contain a named list, where you set "name" equal to the data-source ticker symbol. Now you can call getSymbols() with the new, valid name! Remember that setSymbolLookup() stores this information in R's global options(), so it does not persist across R sessions, but you can use saveSymbolLookup() and loadSymbolLookup() to manage and share your symbol-based defaults.
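The mapping described here might look like the sketch below. It assumes quantmod is installed, an internet connection is available, and the Shanghai index ticker is "000001.SS". The names SSE and FORD and the file name "my_symbol_lookup.rda" are arbitrary choices.

```r
library(quantmod)

# Map valid R object names (left) to the data source's ticker symbols (right)
setSymbolLookup(
  SSE  = list(name = "000001.SS"),  # not a syntactically valid name
  FORD = list(name = "F")           # would otherwise mask R's built-in F
)

# Inspect what has been stored for these symbols
getSymbolLookup(c("SSE", "FORD"))

# Request the data by the new names; objects SSE and FORD are created
getSymbols(c("SSE", "FORD"))

# The lookup lives in options(), so save it to reuse in a later session
saveSymbolLookup(file = "my_symbol_lookup.rda")

# In a new session, restore it before calling getSymbols()
loadSymbolLookup(file = "my_symbol_lookup.rda")
```

Using FORD rather than F keeps the built-in F available as shorthand for FALSE in the rest of the session.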

8. Let's practice!

Let's practice working through the difficulties of instrument symbols that aren't valid R names.