
Exercise

Prepare for 8-bit Training

You wanted to begin RLHF fine-tuning, but you kept running into out-of-memory errors. To address this, you decided to switch to 8-bit precision, which enables more memory-efficient fine-tuning, using the Hugging Face peft library.

The following have been pre-imported:

  • AutoModelForCausalLM from transformers
  • prepare_model_for_int8_training from peft
  • AutoModelForCausalLMWithValueHead from trl

Instructions

100 XP
  • Load the pre-trained model and make sure to include the parameter for 8-bit precision.
  • Use the prepare_model_for_int8_training function to prepare the model for LoRA-based fine-tuning.
  • Load the model with a value head for PPO training.
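A minimal sketch of these three steps is shown below. The checkpoint name ("gpt2") is a placeholder, not the one used in the exercise, and 8-bit loading assumes the bitsandbytes package is available; the imports are repeated here so the snippet is self-contained.

```python
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_int8_training
from trl import AutoModelForCausalLMWithValueHead

# Placeholder checkpoint; substitute the model used in your own setup
model_name = "gpt2"

# Step 1: load the pre-trained model in 8-bit precision
# (load_in_8bit=True relies on the bitsandbytes integration)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True)

# Step 2: prepare the 8-bit model for LoRA-based fine-tuning
# (casts layer norms to fp32 and enables gradient checkpointing, among other fixes)
model = prepare_model_for_int8_training(model)

# Step 3: wrap the prepared model with a value head for PPO training
ppo_model = AutoModelForCausalLMWithValueHead.from_pretrained(model)
```

The resulting ppo_model can then be passed to a PPO trainer, since the value head provides the per-token value estimates PPO needs during optimization.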