Gradient checkpointing with Accelerator
You're continuing to optimize memory usage so you can train your language translation model on your device. Gradient accumulation has already let you train with effectively larger batch sizes. Build on this work by adding gradient checkpointing to reduce the memory footprint of your model.
The model, train_dataloader, and accelerator have been pre-defined.
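For context, accelerator.accumulate() only accumulates gradients when the Accelerator was created with gradient_accumulation_steps, and prepare() wraps the training objects. A minimal sketch of how the pre-defined objects could be set up is shown below; the checkpoint name, toy data, learning-rate schedule, and accumulation step count are illustrative assumptions, not the exercise's actual values.

import torch
from torch.utils.data import DataLoader
from accelerate import Accelerator
from transformers import AutoModelForSeq2SeqLM

# Hypothetical setup; the exercise environment pre-defines the real objects
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # assumed checkpoint
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
lr_scheduler = torch.optim.lr_scheduler.LinearLR(optimizer)

# Toy batches shaped like the exercise's dataloader output (input_ids, labels)
toy_batches = [{"input_ids": torch.randint(0, 1000, (8, 32)),
                "labels": torch.randint(0, 1000, (8, 32))} for _ in range(4)]
train_dataloader = DataLoader(toy_batches, batch_size=None)

# gradient_accumulation_steps=2 is an illustrative value
accelerator = Accelerator(gradient_accumulation_steps=2)
model, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, lr_scheduler
)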
This exercise is part of the course Efficient AI Model Training with PyTorch.

Exercise instructions
- Enable gradient checkpointing on the model.
- Set up an Accelerator context manager to enable gradient accumulation on the model.

Hands-on interactive exercise
Finish this exercise by completing the sample code below.
# Enable gradient checkpointing on the model
____.____()

for batch in train_dataloader:
    with accelerator.accumulate(model):
        inputs, targets = batch["input_ids"], batch["labels"]
        # Get the outputs from a forward pass of the model
        ____ = ____(____, labels=targets)
        loss = outputs.loss
        accelerator.backward(loss)
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        print(f"Loss = {loss}")