Text generation with RLHF
In this exercise, you will work with a model trained with RLHF named lvwerra/gpt2-imdb-pos-v2. This exercise is a chance to review constructing a Hugging Face pipeline and to use it to test a use case for RLHF-trained models: generating movie reviews.
The pipeline, AutoModelForCausalLM, and AutoTokenizer objects have been pre-imported from transformers. The tokenizer has been pre-loaded.
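For reference, these pre-imported objects correspond to the following imports, which the exercise environment is assumed to run for you:

# Assumed pre-imports in the exercise environment
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer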
This exercise is part of the course Reinforcement Learning from Human Feedback (RLHF).
Exercise instructions
- Set the model name to lvwerra/gpt2-imdb-pos-v2, the RLHF-trained model.
- Use the pipeline function to create a text-generation pipeline.
- Use the text generation pipeline to generate a continuation of the review provided.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Set the model name
model_name = ____
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Create a text generation pipeline
text_generator = pipeline(____, model=model, tokenizer=tokenizer)
review_prompt = "Surprisingly, the film"
# Generate a continuation of the review
generated_text = text_generator(____, max_length=10)
print(f"Generated Review Continuation: {generated_text[0]['generated_text']}")