
Exercise

Local SGD with Accelerator

You've implemented gradient accumulation and gradient checkpointing to reduce the memory footprint of your language translation model. Training is still a bit slow, so you decide to add local SGD to your training loop to cut communication overhead between devices. Build the training loop with local SGD!

The model, train_dataloader, and accelerator have been pre-defined, and LocalSGD has been imported.

Instructions

  • Set up a context manager for local SGD, and synchronize gradients every eight steps.
  • Step the local SGD context manager.
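Since this is a hands-on exercise, the solution isn't shown on the page. Below is a minimal sketch of what the loop might look like, assuming the pre-defined model, train_dataloader, and accelerator have already been passed through accelerator.prepare(), and that an optimizer exists (a name not given in the exercise):

```python
# A possible shape of the training loop; `optimizer` is an assumed name.
with LocalSGD(
    accelerator=accelerator,
    model=model,
    local_sgd_steps=8,  # synchronize gradients across devices every eight steps
    enabled=True,
) as local_sgd:
    for batch in train_dataloader:
        outputs = model(**batch)
        loss = outputs.loss
        accelerator.backward(loss)  # route the backward pass through Accelerate
        optimizer.step()
        optimizer.zero_grad()
        local_sgd.step()  # count this step; sync parameters when the threshold is hit
```

Between synchronizations each device updates its own copy of the model, so parameters cross the network only once every local_sgd_steps steps instead of after every batch.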