Preparing the preference dataset
In this exercise, you'll work with a dataset that contains human feedback in the form of "chosen" and "rejected" outputs. Your task is to extract the prompts from the "chosen" column and prepare the data for training a reward model.
The load_dataset function from datasets has been pre-imported.
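If you are working outside the exercise environment, that import would look like this:

from datasets import load_dataset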
This exercise is part of the course Reinforcement Learning from Human Feedback (RLHF).
Exercise instructions
- Load the trl-internal-testing/hh-rlhf-helpful-base-trl-style dataset from Hugging Face.
- Write a function that extracts the prompt from the 'content' field, assuming that the prompt is found at index 0 of the input to the function (a sketch of this structure follows the list).
- Apply the prompt-extraction function to the 'chosen' dataset subset.
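To make the second step concrete, here is a rough sketch of what a single 'chosen' entry looks like. The trl-style format stores each conversation as a list of role/content message dictionaries, so indexing at 0 and reading 'content' recovers the prompt; the message text below is invented purely for illustration.

# Illustrative shape of one 'chosen' entry (message text is made up)
chosen_example = [
    {"role": "user", "content": "How do I bake bread?"},                     # index 0: the prompt
    {"role": "assistant", "content": "Start with flour, water, and yeast."}, # the chosen response
]

# Extracting the prompt means taking the 'content' field of the message at index 0
prompt = chosen_example[0]["content"]
print(prompt)  # How do I bake bread?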
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load the dataset
preference_data = ____
# Define a function to extract the prompt
def extract_prompt(text):
    ____
    return prompt
# Apply the function to the dataset
preference_data_with_prompt = ____(
    lambda sample: {**sample, 'prompt': ____(sample['chosen'])}
)
sample = preference_data_with_prompt.select(range(1))
print(sample['prompt'])
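If you want to check your work against one possible completion, the sketch below fills in the blanks. It assumes the train split is loaded so that .select() can be called directly on a single Dataset; it is not necessarily the official solution.

from datasets import load_dataset

# Load the dataset (assuming the train split so .select() works below)
preference_data = load_dataset(
    "trl-internal-testing/hh-rlhf-helpful-base-trl-style", split="train"
)

# Define a function to extract the prompt
def extract_prompt(text):
    # The prompt is the 'content' of the message at index 0
    prompt = text[0]['content']
    return prompt

# Apply the function to the dataset, adding a new 'prompt' column
preference_data_with_prompt = preference_data.map(
    lambda sample: {**sample, 'prompt': extract_prompt(sample['chosen'])}
)

sample = preference_data_with_prompt.select(range(1))
print(sample['prompt'])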