Prepare datasets for distributed training
You've preprocessed a dataset for a precision agriculture system that helps farmers monitor crop health. Now you'll load the data by creating a DataLoader and place it on GPUs for distributed training, if GPUs are available. Note that this exercise actually runs on a CPU, but the code is the same for CPUs and GPUs.
Some data has been pre-loaded:
- A sample `dataset` with agricultural imagery
- The `Accelerator` class from the `accelerate` library
- The `DataLoader` class
This exercise is part of the course
Efficient AI Model Training with PyTorch
Exercise instructions
- Create a `dataloader` for the pre-defined `dataset`.
- Place the `dataloader` on available devices using the `accelerator` object.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
accelerator = Accelerator()
# Create a dataloader for the pre-defined dataset
dataloader = ____(____, batch_size=32, shuffle=True)
# Place the dataloader on available devices
dataloader = accelerator.____(____)
print(accelerator.device)