Exercise

PyTorch Dataset

Time to refresh your PyTorch Datasets knowledge!

Before model training can commence, you need to load the data and pass it to the model in the right format. In PyTorch, this is handled by Datasets and DataLoaders. Let's start with building a PyTorch Dataset for our water potability data.

In this exercise, you will define a class called WaterDataset to load the data from a CSV file. To do this, you will need to implement the three methods which PyTorch expects a Dataset to have:

  • .__init__() to load the data,
  • .__len__() to return the number of samples,
  • .__getitem__() to extract the features and label for a single sample.

The imports you need have already been made for you:

import pandas as pd
from torch.utils.data import Dataset
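
To make the three-method contract concrete, here is a minimal sketch on a tiny in-memory array rather than the water potability CSV. The class name ToyDataset, the made-up values, and the convention that the last column holds the label are all assumptions for illustration, not part of the exercise.

```python
import numpy as np
from torch.utils.data import Dataset


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset; the last column is ASSUMED to be the label."""

    def __init__(self, array):
        # Store the samples as a 2-D float array: one row per sample.
        self.data = np.asarray(array, dtype=np.float32)

    def __len__(self):
        # Number of samples equals the number of rows.
        return self.data.shape[0]

    def __getitem__(self, idx):
        # Split one row into features (all but last column) and label (last column).
        features = self.data[idx, :-1]
        label = self.data[idx, -1]
        return features, label


# Made-up values, used only to exercise the three methods.
toy = ToyDataset([[1.0, 2.0, 0.0], [3.0, 4.0, 1.0], [5.0, 6.0, 0.0]])
n_samples = len(toy)          # calls __len__
features, label = toy[1]      # calls __getitem__ with idx=1
```

Because len() and indexing both work on the object, a DataLoader can batch and shuffle it without knowing where the data came from.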

Instructions 1/3

  • In the .__init__() method, load the data from csv_path to a pandas DataFrame and assign it to df.
  • Convert df to a NumPy array and assign the result to self.data.
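
The two steps above might look like the following sketch. It covers only .__init__(); the other two methods are left out here. The temporary CSV file, its column names, and its values are made up so that the snippet runs on its own, and they do not reflect the real water potability data.

```python
import os
import tempfile

import pandas as pd
from torch.utils.data import Dataset


class WaterDataset(Dataset):
    def __init__(self, csv_path):
        super().__init__()
        # Load the CSV file into a pandas DataFrame.
        df = pd.read_csv(csv_path)
        # Convert the DataFrame to a NumPy array and keep it on the instance.
        self.data = df.to_numpy()


# Hypothetical stand-in file with made-up values, written only for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "toy_water.csv")
    pd.DataFrame(
        {"feature_a": [0.1, 0.2], "feature_b": [0.3, 0.4], "label": [0, 1]}
    ).to_csv(csv_path, index=False)
    dataset = WaterDataset(csv_path)

data_shape = dataset.data.shape  # (rows, columns) of the loaded array
```

Storing a plain NumPy array keeps row and column slicing simple when the features and label are later extracted per sample.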