Exercise

Train with LoRA

You wanted to begin RLHF fine-tuning but kept running into out-of-memory errors. Loading the model in 8-bit precision did not make them go away. As the next step, you decide to apply LoRA, which trains only a small set of low-rank adapter weights and so makes fine-tuning more memory-efficient.

The following are already available in your environment:

The model loaded in 8-bit precision as pretrained_model_8bit
LoraConfig and get_peft_model from peft
AutoModelForCausalLMWithValueHead from trl

Instructions

100 XP

Set the LoRA dropout to 0.1 and the bias type to "lora_only".
Add the LoRA configuration to the model.
Set up the model with a value head for PPO training.
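
One way the three steps above might fit together is sketched below. This is not the official solution. The values of r and lora_alpha are placeholders chosen for illustration (the exercise does not specify them), task_type="CAUSAL_LM" is assumed because the model is a causal language model, and build_lora_value_head_model is a hypothetical helper name. The sketch also assumes peft and trl versions in which AutoModelForCausalLMWithValueHead.from_pretrained accepts an already-wrapped PEFT model instance.

```python
# Keyword arguments for peft.LoraConfig.
# lora_dropout and bias come from the instructions; r and lora_alpha are
# illustrative assumptions, and task_type is assumed for a causal LM.
LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "bias": "lora_only",
    "task_type": "CAUSAL_LM",
}


def build_lora_value_head_model(pretrained_model_8bit):
    """Attach LoRA adapters to an 8-bit causal LM, then add a PPO value head."""
    # Imported here so this sketch can be loaded without peft/trl installed;
    # in the exercise environment these names are already imported.
    from peft import LoraConfig, get_peft_model
    from trl import AutoModelForCausalLMWithValueHead

    # Step 1: describe the LoRA adapters.
    config = LoraConfig(**LORA_KWARGS)

    # Step 2: add the LoRA configuration to the 8-bit model.
    lora_model = get_peft_model(pretrained_model_8bit, config)

    # Step 3: wrap the LoRA model with a value head for PPO training.
    return AutoModelForCausalLMWithValueHead.from_pretrained(lora_model)
```

In the exercise itself, calling build_lora_value_head_model(pretrained_model_8bit) would return the model to pass on to PPO training. With bias set to "lora_only", only the bias terms of layers that receive LoRA adapters are trained alongside the adapter weights.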
