Generating sequences

To train neural networks on sequential data, you need to pre-process it first. You'll chunk the data into input-target pairs, where the inputs are some number of consecutive data points and the target is the data point that immediately follows them.
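As a rough illustration of the chunking idea, here is a minimal sketch on a plain Python list. The values and the input length of 3 are made up for this example and do not come from the course data.

```python
# Toy series (hypothetical values) and an assumed input length of 3
series = [10, 20, 30, 40, 50]
seq_length = 3

pairs = []
for i in range(len(series) - seq_length):
    inputs = series[i:i + seq_length]  # seq_length consecutive points
    target = series[i + seq_length]    # the point right after them
    pairs.append((inputs, target))

for inputs, target in pairs:
    print(inputs, "->", target)
# [10, 20, 30] -> 40
# [20, 30, 40] -> 50
```

A series of 5 points with an input length of 3 yields 5 - 3 = 2 pairs, since the last window must still leave one point to serve as the target.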

Your task is to define a function called create_sequences() that does this. As inputs, it will receive df, a DataFrame holding the data, and seq_length, the length of each input sequence. It should return two NumPy arrays: one with the input sequences and the other with the corresponding targets.

As a reminder, here is what the DataFrame df looks like:

                 timestamp  consumption
0      2011-01-01 00:15:00    -0.704319
...                    ...          ...
140255 2015-01-01 00:00:00    -0.095751

This exercise is part of the course Intermediate Deep Learning with PyTorch.

Exercise instructions

  • Iterate over the range of the number of data points minus the length of an input sequence.
  • Define the inputs x as the slice of df from row i up to, but not including, row i + seq_length, and the column at index 1.
  • Define the target y as the slice of df at row index i + seq_length and the column at index 1.
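To see what these slices select, here is a small sketch for a single iteration (i = 0). It uses a hypothetical six-row toy_df with the same two-column layout as df; the values and the choice of seq_length = 4 are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy DataFrame mirroring the two-column layout of df
toy_df = pd.DataFrame({
    "timestamp": pd.date_range("2011-01-01 00:15:00", periods=6, freq="15min"),
    "consumption": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})
seq_length = 4  # assumed input length for this example
i = 0           # first iteration only

x = toy_df.iloc[i:i + seq_length, 1]  # rows i .. i + seq_length - 1, column 1
y = toy_df.iloc[i + seq_length, 1]    # the single value right after the window

print(x.to_numpy())              # [0.1 0.2 0.3 0.4]
print(y)                         # 0.5
print(len(toy_df) - seq_length)  # 2 windows fit in six rows
```

Because iloc slices exclude the end position, the input window stops one row before the target, so the two never overlap.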

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import numpy as np

def create_sequences(df, seq_length):
    xs, ys = [], []
    # Iterate over data indices
    for i in range(____):
        # Define inputs
        x = df.iloc[____, ____]
        # Define target
        y = df.iloc[____, ____]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)