Majority voting on multiple data sources

Your team is developing an AI model to automatically generate smartphone quality control (QC) reports. To train it, you've collected preference data from three quality control sources: an "Automated Vision System", a "Human Inspector", and "Customer Feedback". Each source has labeled paired text samples as 'chosen' and 'rejected'. Each pair has a unique 'id', and every entry records which QC review that source preferred.

quality_df is a combined DataFrame loaded using pandas. It contains data from the three different data sources. Additionally, the Counter class has been pre-imported from the collections module.

This exercise is part of the course

Reinforcement Learning from Human Feedback (RLHF)

Exercise instructions

  • Count the occurrences of each (chosen, rejected) pair in the majority_vote function.
  • Find the (chosen, rejected) pair with the highest vote count.
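The two steps above can be tried on a plain list before touching the DataFrame. The following is a minimal sketch, assuming three made-up votes for a single 'id'; the review texts are hypothetical and not taken from quality_df.

```python
from collections import Counter

# Hypothetical votes for one 'id': one (chosen, rejected) pair per source.
pairs = [
    ("Screen is flawless.", "Screen ok."),   # Automated Vision System
    ("Screen is flawless.", "Screen ok."),   # Human Inspector
    ("Screen ok.", "Screen is flawless."),   # Customer Feedback
]

# Step 1: count how often each (chosen, rejected) pair occurs.
votes = Counter(pairs)

# Step 2: most_common(1) returns a one-element list of (pair, count).
winner, count = votes.most_common(1)[0]
print(winner, count)
```

Tuples are hashable, which is why a (chosen, rejected) pair can serve directly as a Counter key.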

Interactive exercise

Complete the sample code to successfully finish this exercise.

def majority_vote(df):
    # Count occurrences of each (chosen, rejected) pair
    votes = ____
    # Find the (chosen, rejected) pair with the highest vote count
    winner = ____
    return winner

final_preferences = quality_df.groupby(['id']).apply(majority_vote)

print(final_preferences)
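One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution. Because quality_df is not shown here, the sketch builds a small stand-in DataFrame: the 'source' column name and all review texts are assumptions made for illustration, while 'id', 'chosen', and 'rejected' follow the exercise description.

```python
from collections import Counter
import pandas as pd

# Hypothetical stand-in for quality_df (column 'source' and all texts are made up).
quality_df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "source": ["Automated Vision System", "Human Inspector", "Customer Feedback"] * 2,
    "chosen": ["Screen is flawless.", "Screen is flawless.", "Screen ok.",
               "Battery passes test.", "Battery fine.", "Battery passes test."],
    "rejected": ["Screen ok.", "Screen ok.", "Screen is flawless.",
                 "Battery fine.", "Battery passes test.", "Battery fine."],
})

def majority_vote(df):
    # Count occurrences of each (chosen, rejected) pair within one 'id' group
    votes = Counter(zip(df['chosen'], df['rejected']))
    # most_common(1) yields [(pair, count)]; keep only the pair
    winner = votes.most_common(1)[0][0]
    return winner

# One winning (chosen, rejected) tuple per 'id'
final_preferences = quality_df.groupby(['id']).apply(majority_vote)
print(final_preferences)
```

If all three sources disagree on an 'id', every pair has one vote, and most_common returns the pair encountered first, so the row order of the DataFrame decides the tie. Some pandas versions may also emit a deprecation warning about apply operating on the grouping column; the function above never reads 'id', so the result is unaffected.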