Creating training samples

Your team is building a customer service chatbot, and you are creating a pipeline to preprocess a dataset for it. The dataset will eventually be used to fine-tune a language model to predict the intent of a customer's question, so that each request can be routed to the correct team for processing.

You are given a dataset with the customer's question and intent in separate columns. You want to preprocess it so that each example merges the question and intent into a single string that follows your prompt format.

The dataset is already loaded as dataset. It contains the column instruction, which holds the customer question, and the column intent, which holds the user's intent.

This exercise is part of the course

Fine-Tuning with Llama 3

Exercise instructions

  • Create a prompt string with the instruction and intent in the form "Query: {instruction}\nIntent: {intent}".
  • Fill out the function call on the dataset to apply the create_intent_example function to each row.
  • Extract and print out the value in the intent_example column in the first row of the dataset.
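
The three steps above can be sketched in plain Python as follows. This is a minimal illustration, not the exercise's reference solution: the two sample rows and their intent labels are hypothetical, and a list comprehension stands in for the dataset's row-wise apply method. If dataset is a Hugging Face datasets.Dataset (an assumption, since the exercise text does not name the library), the equivalent call would be dataset.map(create_intent_example).

```python
# Hypothetical sample rows; the real exercise data is not shown here.
rows = [
    {"instruction": "I want to cancel my order", "intent": "cancel_order"},
    {"instruction": "Where is my refund?", "intent": "track_refund"},
]


def create_intent_example(row):
    # Copy the row so the input is left unchanged
    row = dict(row)
    # Merge the question and the intent into a single prompt string
    row["intent_example"] = (
        f"Query: {row['instruction']}\nIntent: {row['intent']}"
    )
    return row


# Plain-Python stand-in for applying the function to every row.
# With a Hugging Face Dataset (assumed), this would be
# dataset.map(create_intent_example).
processed_rows = [create_intent_example(r) for r in rows]

# Print the merged prompt string from the first row
print(processed_rows[0]["intent_example"])
```

Each processed row keeps its original instruction and intent columns and gains an intent_example column holding the merged prompt.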

Interactive exercise

Try this exercise by completing this sample code.

def create_intent_example(row):
    # Fill out the columns in the prompt
    row['intent_example'] = ____
    return row

# Call the ds method to apply our preprocessing function to all rows
processed_dataset = dataset.____(____)
# Print the intent_example in the first row of the processed data
print(processed_dataset[____][____])