
Train with LoRA

You wanted to begin RLHF fine-tuning but kept encountering out-of-memory errors. Although you switched to loading the model in 8-bit precision, the error persisted. To address this, you decided to take the next step and apply LoRA for more efficient fine-tuning.

The following have already been pre-imported:

  • The model loaded in 8-bit precision as pretrained_model_8bit
  • LoraConfig and get_peft_model from peft
  • AutoModelForCausalLMWithValueHead from trl
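For context, the pre-imported objects could have been set up along the following lines. This is a sketch only: the checkpoint name ("gpt2") is an assumption, the exercise does not specify which model is used, and 8-bit loading requires bitsandbytes and a CUDA device.

```python
# Hypothetical setup for the pre-imported objects; the exercise
# environment provides these already.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from trl import AutoModelForCausalLMWithValueHead

pretrained_model_8bit = AutoModelForCausalLM.from_pretrained(
    "gpt2",              # assumed checkpoint; not specified by the exercise
    load_in_8bit=True,   # quantize weights to 8-bit (needs bitsandbytes + GPU)
    device_map="auto",   # place layers automatically across available devices
)
```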

This exercise is part of the course

Reinforcement Learning from Human Feedback (RLHF)

Exercise instructions

  • Set the LoRA dropout to 0.1 and the bias type to "lora_only".
  • Add the LoRA configuration to the model.
  • Set up the model with a value head for PPO training.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Set the configuration parameters
config = LoraConfig(
    r=32,  
    lora_alpha=32,  
    lora_dropout=____,  
    bias=____)  

# Apply the LoRA configuration to the 8-bit model
lora_model = get_peft_model(pretrained_model_8bit, ____)

# Set up the model with a value head for PPO training
model = ____.from_pretrained(____)
Edit and Run Code
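One way to fill in the blanks, following the instructions above, is shown below. Treat it as a sketch: it assumes the exercise environment's pre-loaded pretrained_model_8bit and the pre-imported peft and trl classes, and will only run where those are available.

```python
# Set the configuration parameters
config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.1,    # dropout rate from the instructions
    bias="lora_only")    # train only the biases of the LoRA layers

# Apply the LoRA configuration to the 8-bit model
lora_model = get_peft_model(pretrained_model_8bit, config)

# Wrap the LoRA model with a value head for PPO training
model = AutoModelForCausalLMWithValueHead.from_pretrained(lora_model)
```

With bias="lora_only", only the bias terms inside the injected LoRA modules are updated, keeping the number of trainable parameters small; the value head added by trl is what PPO uses to estimate per-token values during RLHF training.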