
Low confidence

In this exercise, you'll work with a reward model to assess how confidently it classifies input text and to filter out unreliable predictions. The goal is to evaluate the model's predictions and apply a confidence threshold so that only high-confidence predictions are treated as valid.

The variables prob_dists (the probability distribution for each feedback text) and texts (the feedback texts), along with the least_confidence() function, have been loaded.
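
The exercise does not show how least_confidence() is defined. As a rough illustration only, one common formulation of a least-confidence score is one minus the top class probability, rescaled so that it runs from 0 (fully certain) to 1 (uniform distribution). The sketch below assumes that definition, plain Python lists, and at least two classes; the name least_confidence_sketch is hypothetical, and the preloaded function may differ, for example by operating on tensors.

```python
def least_confidence_sketch(prob_dist):
    """Hypothetical stand-in for the preloaded least_confidence().

    Assumes prob_dist is a list of class probabilities that sum to 1
    and contains at least two classes.
    """
    num_labels = len(prob_dist)
    top_prob = max(prob_dist)
    # Rescale by n / (n - 1) so that a uniform distribution scores 1.0
    return (1 - top_prob) * num_labels / (num_labels - 1)


# A peaked distribution gives a low score; a uniform one gives the maximum.
print(least_confidence_sketch([0.9, 0.1]))
print(least_confidence_sketch([0.5, 0.5]))
```

Under this assumed definition a higher score means a less certain prediction, which matters when deciding which side of the threshold to keep.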

This exercise is part of the course

Reinforcement Learning from Human Feedback (RLHF)


Instructions

  • Define the function to filter the indices of probability distributions for which the confidence is below a given threshold.
  • Get the indices of the feedback comments by passing the probability distributions to the function, leaving the threshold unchanged (0.5).
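
The general pattern behind these steps, collecting the positions of items whose score passes a test, can be sketched on toy data. Everything here is an assumption for illustration: indices_where, toy_dists, and the simple uncertainty score are hypothetical names, not the exercise's preloaded objects, and which side of the threshold to keep depends on whether the score measures confidence or uncertainty.

```python
def indices_where(items, predicate):
    """Return the positions of the items for which predicate(item) is true."""
    return [i for i, item in enumerate(items) if predicate(item)]


# Toy two-class probability distributions (made up for this sketch)
toy_dists = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]


def uncertainty(dist):
    # Hypothetical score: higher means the top class is less dominant
    return 1 - max(dist)


# Flag the distributions whose uncertainty exceeds an illustrative cutoff
flagged = indices_where(toy_dists, lambda d: uncertainty(d) > 0.3)
print(flagged)  # [1]
```

Using enumerate() keeps each index paired with its distribution, so the resulting indices can be used to look up the matching entries in a parallel list.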

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Define the filter function
def filter_low_confidence_predictions(prob_dists, threshold=0.5):
    filtered_indices = [i for i, ____ in enumerate(____) ____]
    return filtered_indices

# Find the indices
filtered_indices = ____

high_confidence_texts = [texts[i] for i in filtered_indices]
print("High-confidence texts:", high_confidence_texts)