Checking for weirdness

1. Checking for weirdness

Now that you've learned how to import data from plain text files, I'd like to show you how to identify and handle a couple of issues you're likely to encounter while analyzing financial data: missing values and corporate actions.

2. Visualize Data

Let's start with missing values. Often, a plot is the easiest way to identify weirdness in your data. Here you see a large gap in the 10-year Treasury rate. The data are missing for Friday the 12th and Monday the 15th, which was President's Day in 1982. Stock markets were closed on President's Day but open on that Friday, so you might want to fill the Friday value in the Treasury data so that it lines up with the stock market data.

3. Handle missing values

The xts and zoo course taught you how to replace missing values using na.locf() and na.approx(). Linear interpolation may introduce bias if your data are not roughly linear across the gap. In that case, you can use the na.spline() function to perform non-linear spline interpolation, which fits a smooth curve through several surrounding data points rather than a straight line between two.
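To make the three fill methods concrete, here is a minimal sketch that applies each of them to the same gap. The rates are made-up values, not the actual Treasury data from the chart, and the object names are only for illustration.

```r
library(zoo)

# Hypothetical daily rates; Friday the 12th and Monday the 15th are missing.
dates <- as.Date("1982-02-08") + c(0, 1, 2, 3, 4, 7, 8, 9)
rates <- zoo(c(14.5, 14.6, 14.7, 14.8, NA, NA, 14.3, 14.2), order.by = dates)

# Last observation carried forward: repeats the last known value.
filled_locf <- na.locf(rates)

# Linear interpolation between the observations on either side of the gap.
filled_approx <- na.approx(rates)

# Spline interpolation: a smooth curve through the surrounding points.
filled_spline <- na.spline(rates)

# Line the results up side by side to compare them.
comparison <- merge(rates, filled_locf, filled_approx, filled_spline)
print(comparison)
```

Because the series is indexed by date, na.approx() interpolates along calendar time here, so the weekend counts toward the distance between the two filled days.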

4. Handle missing values

In the chart, you can see that both linear and spline interpolation do a good job of filling the missing values.

5. Handle missing values

The LOCF (last observation carried forward) method looks out of place, because it holds the last observed value flat across the gap.

6. Visualize data

Now that you've learned a new way to fill missing values, and how to visually compare methods, let's talk about some weirdness in stock market data. Here you can see a chart of Microsoft's stock price using data from Google Finance.

7. Visualize data

There is a large one-day price change from 30 dollars per share to 27 dollars per share, a loss of about 10% in one day! Did that really happen?

8. Cross-reference sources

It's often helpful to cross-reference your data with another source. You can import Microsoft data from Yahoo Finance using getSymbols(), and then plot the close price. You can see it's the same as the Google Finance data. Maybe that large price change is real?

9. Cross-reference sources

But Yahoo Finance also includes an adjusted close price, and that plot doesn't have a large drop in mid-November.
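The comparison between the close and adjusted close can be sketched as follows. The helper name close_vs_adjusted() is hypothetical, it needs the quantmod package and network access when called, and any date range you pass is your own choice since the transcript does not give one.

```r
# Hypothetical helper: download a symbol from Yahoo Finance and return
# its close and adjusted close columns side by side.
close_vs_adjusted <- function(symbol, from, to) {
  prices <- quantmod::getSymbols(symbol, src = "yahoo",
                                 from = from, to = to,
                                 auto.assign = FALSE)
  # Cl() extracts the close column, Ad() the adjusted close column.
  cbind(quantmod::Cl(prices), quantmod::Ad(prices))
}
```

Calling it with "MSFT" and a date range that spans the drop, then plotting each column of the result, should show the drop in the close but not in the adjusted close.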

10. Cross-reference sources

The adjusted close price is different because it accounts for two common corporate actions: splits and dividends. In this case, Microsoft paid a three-dollar-per-share dividend, which caused the share price to fall by the same amount. These corporate actions affect the stock price, but not the investor's return, so the adjustments enable you to calculate the total return from owning a stock. You will learn more about the adjustment calculation in the next video.

11. Stock split example

So, what are splits and dividends? A stock split is when a company simultaneously increases the number of shares outstanding and decreases the stock price. This means the overall value of the company doesn't change. For example, a 2-for-1 stock split would give investors 2 shares for every 1 share they own, but would reduce the stock price by 50% at the same time.
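The 2-for-1 arithmetic can be checked with a few lines of base R. The share count and prices are illustrative numbers, not real data.

```r
# Hypothetical holding before a 2-for-1 split.
shares_before <- 100
price_before  <- 50
split_ratio   <- 2   # 2 new shares for every 1 old share

shares_after <- shares_before * split_ratio   # 200 shares
price_after  <- price_before / split_ratio    # 25 dollars per share

# The value of the holding is the same before and after the split.
value_before <- shares_before * price_before
value_after  <- shares_after * price_after

# To compare prices across the split, divide pre-split prices by the ratio.
raw_prices     <- c(48, 50, 25, 26)            # split takes effect on day 3
is_pre_split   <- c(TRUE, TRUE, FALSE, FALSE)
split_adjusted <- ifelse(is_pre_split, raw_prices / split_ratio, raw_prices)
```

Without the adjustment, the move from 50 to 25 would look like a 50% loss even though the investor's holding did not change in value.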

12. Stock dividend example

A dividend is when a company decides to return capital to investors, often paid in cash. Unlike splits, dividends do reduce the company's value because money is actually leaving the company. This should theoretically cause the price to fall by the amount of the dividend. Even though the company's share price falls, the investor's return isn't affected, because they received the offsetting dividend payment.
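A small sketch shows why the price drop does not hurt the investor. It uses the rounded 30, 27, and 3 dollar figures from the Microsoft example and assumes the dividend is the only thing moving the price that day.

```r
price_before <- 30   # close before the dividend
price_after  <- 27   # close after the dividend
dividend     <- 3    # cash paid per share

# Return from the price change alone: looks like a 10% loss.
price_return <- (price_after - price_before) / price_before

# Return including the cash received: the dividend offsets the drop.
total_return <- (price_after + dividend - price_before) / price_before

print(c(price_return = price_return, total_return = total_return))
```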

13. Data source differences

As you're working through the exercises, remember that Yahoo Finance provides raw prices and an adjusted close column that accounts for splits and dividends. Google Finance, by contrast, only provides prices that have been adjusted for splits; it doesn't account for dividends or provide raw prices.

14. Let's practice!