Generating sequences
To train neural networks on sequential data, you first need to pre-process it. You'll chunk the data into input-target pairs, where the inputs are some number of consecutive data points and the target is the data point that immediately follows them.
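As a rough illustration of this chunking, the sketch below slides a window over a short list. The values in series and the window length of 2 are made up for the example and are not taken from the exercise data.

```python
# Toy illustration of input-target pairs (made-up values, not the course data)
series = [10, 20, 30, 40, 50]
seq_length = 2  # assumed window length for this example

# Each pair holds seq_length consecutive points and the point that follows them
pairs = [
    (series[i:i + seq_length], series[i + seq_length])
    for i in range(len(series) - seq_length)
]

for inputs, target in pairs:
    print(inputs, "->", target)
```

With five data points and a window of 2, this yields three pairs, because the last window must still leave one point over to serve as its target.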
Your task is to define a function to do this called create_sequences(). As inputs, it will receive data stored in a DataFrame, df, and seq_length, the length of the inputs. As outputs, it should return two NumPy arrays, one with the input sequences and the other with the corresponding targets.
As a reminder, here is what the DataFrame df looks like:
                  timestamp  consumption
0       2011-01-01 00:15:00    -0.704319
...                     ...          ...
140255  2015-01-01 00:00:00    -0.095751
This exercise is part of the course
Intermediate Deep Learning with PyTorch
Exercise instructions
- Iterate over the range of the number of data points minus the length of an input sequence.
- Define the inputs x as the slice of df from the ith row to the (i + seq_length)th row and the column at index 1.
- Define the target y as the slice of df at row index i + seq_length and the column at index 1.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import numpy as np

def create_sequences(df, seq_length):
    xs, ys = [], []
    # Iterate over data indices
    for i in range(____):
        # Define inputs
        x = df.iloc[____, ____]
        # Define target
        y = df.iloc[____, ____]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)
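
For reference, here is one way the blanks might be filled in, following the exercise instructions above. This is a sketch rather than the official solution, and toy_df is a hypothetical six-row DataFrame with made-up values, used only to check the output shapes.

```python
import numpy as np
import pandas as pd

def create_sequences(df, seq_length):
    xs, ys = [], []
    # Stop seq_length rows early so every window still has a target after it
    for i in range(len(df) - seq_length):
        # Inputs: rows i up to (not including) i + seq_length, column at index 1
        x = df.iloc[i:(i + seq_length), 1]
        # Target: the single value right after the input window
        y = df.iloc[i + seq_length, 1]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Hypothetical toy data (not the course dataset), same two-column layout as df
toy_df = pd.DataFrame({
    "timestamp": pd.date_range("2011-01-01 00:15:00", periods=6, freq="15min"),
    "consumption": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
})

X, y = create_sequences(toy_df, seq_length=3)
print(X.shape, y.shape)
```

Note that iloc slicing excludes the end position, so the input window covers exactly seq_length rows and the target is the row immediately after it.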