Gradient checkpointing with Accelerator

You're continuing to optimize memory usage so you can train your language translation model on your device. Gradient accumulation has let you train with larger effective batch sizes. Build on that work by adding gradient checkpointing, which reduces the model's memory footprint by recomputing intermediate activations during the backward pass instead of storing all of them.

The model, train_dataloader, optimizer, lr_scheduler, and accelerator have been pre-defined.
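
To make the memory trade-off behind gradient checkpointing concrete, here is a minimal plain-Python sketch. It is not the course's solution and does not use PyTorch or Accelerate: the toy "network" is a chain of scalar tanh layers, and every function name below is hypothetical. The checkpointed version keeps only the input to each segment and recomputes the other layer inputs during the backward pass.

```python
import math


def forward_layer(w, x):
    # One toy "layer": a scalar weight followed by tanh.
    return math.tanh(w * x)


def backward_layer(w, x, grad_out):
    # Given the layer input x and dL/dy, return (dL/dw, dL/dx).
    y = math.tanh(w * x)
    local = (1.0 - y * y) * grad_out  # tanh'(z) = 1 - tanh(z)^2
    return local * x, local * w


def grads_store_all(weights, x0):
    # Standard backprop: keep every layer input in memory.
    inputs = []
    x = x0
    for w in weights:
        inputs.append(x)
        x = forward_layer(w, x)
    grad = 1.0  # the loss is taken to be the final output itself
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i], grad = backward_layer(weights[i], inputs[i], grad)
    return grads, len(inputs)


def grads_checkpointed(weights, x0, segment):
    # Forward pass: keep only the input to each segment of layers.
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = forward_layer(w, x)
    grad = 1.0
    grads = [0.0] * len(weights)
    peak = len(checkpoints)  # rough count of values held at once
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, len(weights))
        # Backward pass: recompute this segment's inputs from its checkpoint.
        inputs = []
        x = checkpoints[start]
        for i in range(start, end):
            inputs.append(x)
            x = forward_layer(weights[i], x)
        peak = max(peak, len(checkpoints) + len(inputs))
        for i in reversed(range(start, end)):
            grads[i], grad = backward_layer(weights[i], inputs[i - start], grad)
    return grads, peak


weights = [0.5 + 0.1 * i for i in range(16)]
full, stored_full = grads_store_all(weights, 0.3)
ckpt, stored_ckpt = grads_checkpointed(weights, 0.3, segment=4)
```

Both functions return the same gradients, but the checkpointed version holds fewer values at once and pays for that with a second forward computation per segment. That extra compute is the cost of the memory saving.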

This exercise is part of the course

Efficient AI Model Training with PyTorch

Exercise instructions

  • Enable gradient checkpointing on the model.
  • Set up an Accelerator context manager to enable gradient accumulation on the model.
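
As background for the second instruction, the idea behind gradient accumulation can be sketched in plain Python. This is an illustration under stated assumptions, not Accelerate's implementation: the one-parameter linear model, the data, and the function names are all hypothetical, and the batch size is assumed to be divisible by the micro-batch size. It shows that summing micro-batch gradients, each scaled by one over the number of micro-batches, reproduces the gradient of the full batch.

```python
def grad_mse(w, xs, ys):
    # Gradient of mean((w * x - y)^2) with respect to w.
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    # Accumulate micro-batch gradients before a single optimizer step.
    # Assumes len(xs) is divisible by micro_batch_size.
    n_micro = len(xs) // micro_batch_size
    total = 0.0
    for k in range(n_micro):
        lo = k * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each contribution so the sum matches the full-batch mean.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / n_micro
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

full_batch = grad_mse(1.5, xs, ys)
accumulated = accumulated_grad(1.5, xs, ys, micro_batch_size=2)
```

The two values agree up to floating-point rounding, which is why accumulation lets a device that only fits small micro-batches behave as if it were training on a larger batch.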

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Enable gradient checkpointing on the model
____.____()

for batch in train_dataloader:
    with accelerator.accumulate(model):
        inputs, targets = batch["input_ids"], batch["labels"]
        # Get the outputs from a forward pass of the model
        ____ = ____(____, labels=targets)
        loss = outputs.loss
        accelerator.backward(loss)
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        print(f"Loss = {loss}")