High Frequency Stock Prices
Higher frequency stock data is well modeled by an MA(1) process, so it's a nice application of the models in this chapter.
The DataFrame intraday
contains one day's prices (on September 1, 2017) for Sprint stock (ticker symbol "S") sampled at a frequency of one minute. The stock market is open for 6.5 hours (390 minutes), from 9:30am to 4:00pm.
Before you can analyze the time series data, you will have to clean it up a little, which you will do in this and the next two exercises. When you look at the first few rows, you'll notice several things. First, there are no column headers.The data is not time stamped from 9:30 to 4:00, but rather goes from 0 to 390. And you will notice that the first date is the odd-looking "a1504272600". The number after the "a" is Unix time which is the number of seconds since January 1, 1970. This is how this dataset separates each day of intraday data.
If you look at the data types, you'll notice that the DATE
column is an object, which here means a string. You will need to change that to numeric before you can clean up some missing data.
The source of the minute data is Google Finance (see here on how the data was downloaded).
The datetime
module has already been imported for you.
This exercise is part of the course
Time Series Analysis in Python
Exercise instructions
- Manually change the first date to zero using
.iloc[0,0]
. - Change the two column headers to
'DATE'
and'CLOSE'
by settingintraday.columns
equal to a list containing those two strings. - Use the pandas attribute
.dtypes
(no parentheses) to see what type of data are in each column. - Convert the
'DATE'
column to numeric using the pandas functionto_numeric()
. - Make the
'DATE'
column the new index ofintraday
by using the pandas method.set_index()
, which will take the string'DATE'
as its argument (not the entire column, just the name of the column).
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# import datetime module
import datetime
# Change the first date to zero
intraday.___ = 0
# Change the column headers to 'DATE' and 'CLOSE'
intraday.columns = ___
# Examine the data types for each column
print(intraday.dtypes)
# Convert DATE column to numeric
intraday['DATE'] = pd.to_numeric(____)
# Make the `DATE` column the new index
intraday = ____