Majority voting on multiple data sources
Your team is developing an AI model to automatically generate smartphone quality control (QC) reports. For this purpose, you've collected preference data from three quality control sources: an "Automated Vision System," a "Human Inspector," and "Customer Feedback." Each source has labeled paired text samples as 'chosen' and 'rejected'. Each pair has a unique 'id', and every entry records the QC review that the source preferred.
quality_df is a combined DataFrame loaded using pandas. It contains data from the three different data sources. Additionally, the Counter class has been pre-imported from the collections module.
This exercise is part of the course
Reinforcement Learning from Human Feedback (RLHF)
Exercise instructions
- Count the occurrences of each (chosen, rejected) pair in the majority_vote function.
- Find the (chosen, rejected) pair with the highest vote count.
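The two steps above map naturally onto collections.Counter. The following is a minimal sketch on made-up data, not the exercise solution itself: the list `pairs` and the report names are hypothetical and stand in for the votes that the three sources cast for a single id.

```python
from collections import Counter

# Hypothetical votes for one id: each tuple is (chosen, rejected)
pairs = [
    ("report A", "report B"),  # e.g. the Automated Vision System's vote
    ("report A", "report B"),  # e.g. the Human Inspector's vote
    ("report B", "report A"),  # e.g. the Customer Feedback vote
]

# Step 1: count how often each (chosen, rejected) pair occurs
votes = Counter(pairs)

# Step 2: most_common(1) returns a one-element list of (pair, count)
winner, count = votes.most_common(1)[0]
print(winner, count)
```

When two pairs are tied, most_common keeps them in the order they were first counted, so the earliest pair wins the tie.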
Hands-on interactive exercise
Try this exercise by completing the sample code.
def majority_vote(df):
    # Count occurrences of each (chosen, rejected) pair
    votes = ____
    # Find the (chosen, rejected) pair with the highest vote count
    winner = ____
    return winner
final_preferences = quality_df.groupby(['id']).apply(majority_vote)
print(final_preferences)
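One possible way to fill in the blanks is sketched below on a small toy DataFrame. The data, the `source` column name, and the review strings are assumptions made for illustration; the real quality_df is provided by the exercise environment and may look different. The sketch also selects the chosen and rejected columns before calling apply, which is a small deviation from the template intended to keep the grouping column out of the function's input.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for quality_df: three sources vote on each id
quality_df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "source": ["vision", "human", "customer"] * 2,  # assumed column name
    "chosen": ["good", "good", "bad", "x", "y", "y"],
    "rejected": ["bad", "bad", "good", "y", "x", "x"],
})

def majority_vote(df):
    # Count occurrences of each (chosen, rejected) pair
    votes = Counter(zip(df["chosen"], df["rejected"]))
    # Find the (chosen, rejected) pair with the highest vote count
    winner, _ = votes.most_common(1)[0]
    return winner

# One winning (chosen, rejected) tuple per id
final_preferences = (
    quality_df.groupby("id")[["chosen", "rejected"]].apply(majority_vote)
)
print(final_preferences)
```

Under these assumptions the result is a Series indexed by id, where each value is the winning (chosen, rejected) tuple for that id.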