K-means for feedback clustering
You have a dataset of feedback responses, and you've used a GPT model to calculate confidence scores for each response. To identify unusual or outlier feedback, you apply k-means clustering to the low-confidence responses.
The KMeans class, the reviews and confidences variables, and the NumPy library (as np) have been preloaded.
This exercise is part of the course
Reinforcement Learning from Human Feedback (RLHF)
Instructions
- Initialize the k-means algorithm. Set random_state to 42 for code reproducibility.
- Calculate distances from cluster centers to identify outliers as the difference between data and the corresponding cluster centers.
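The second instruction hinges on pairing each point with the center of its own cluster. The NumPy-only sketch below illustrates that step in isolation. The toy data, centers, and labels are hypothetical values chosen for illustration, not output from the course's dataset, and using the Euclidean norm of the difference is an assumption about what "distance" means here.

```python
import numpy as np

# Hypothetical toy data: five 1-D confidence scores, shaped (n_samples, 1)
data = np.array([[0.10], [0.12], [0.50], [0.52], [0.90]])

# Hypothetical cluster centers and per-point labels, standing in for
# what a fitted k-means model might return
centers = np.array([[0.11], [0.51], [0.90]])
labels = np.array([0, 0, 1, 1, 2])

# centers[labels] selects, for every point, the center of its assigned cluster
diffs = data - centers[labels]

# The Euclidean norm of each row yields one distance per point
distances = np.linalg.norm(diffs, axis=1)
print(distances)
```

Points with a large distance sit far from their cluster center and are the candidates for unusual feedback.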
Hands-on interactive exercise
Try this exercise by completing this sample code.
def detect_anomalies(data, n_clusters=3):
    # Initialize k-means
    ____
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Calculate distances from cluster centers
    ____
    return distances

anomalies = detect_anomalies(confidences)
print(anomalies)
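One possible way to fill in the two blanks is sketched below. It is not necessarily the course's reference solution. It assumes that data is a 2-D array of shape (n_samples, n_features) and that the distance is the Euclidean norm of the difference between each point and its assigned center. Because the preloaded confidences variable is not available outside the exercise, a randomly generated stand-in is used here.

```python
import numpy as np
from sklearn.cluster import KMeans

def detect_anomalies(data, n_clusters=3):
    # Initialize k-means with a fixed seed so results are reproducible
    kmeans = KMeans(n_clusters=n_clusters, random_state=42)
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Distance from each point to the center of its assigned cluster
    # (assumption: Euclidean norm of the row-wise difference)
    distances = np.linalg.norm(data - centers[clusters], axis=1)
    return distances

# Hypothetical stand-in for the preloaded `confidences` variable:
# 30 low-confidence scores as a column vector
rng = np.random.default_rng(0)
confidences = rng.uniform(0.0, 0.5, size=(30, 1))

anomalies = detect_anomalies(confidences)
print(anomalies)
```

The function returns one non-negative distance per response. Flagging outliers would then require a cutoff, such as a percentile of these distances, which the exercise leaves unspecified.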