K-means for feedback clustering
You have a dataset of feedback responses, and you've used a GPT model to calculate confidence scores for each response. To identify unusual or outlier feedback, you apply k-means clustering to the low-confidence responses.
The KMeans algorithm, the reviews and confidences variables, and the np library have been preloaded.
This exercise is part of the course
Reinforcement Learning from Human Feedback (RLHF)
Instructions
- Initialize the k-means algorithm. Set the random_state to 42 for code reproducibility.
- Calculate distances from cluster centers to identify outliers as the difference between data and the corresponding cluster centers.
Hands-on interactive exercise
Try this exercise by completing this sample code.
def detect_anomalies(data, n_clusters=3):
    # Initialize k-means
    ____
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Calculate distances from cluster centers
    ____
    return distances

anomalies = detect_anomalies(confidences)
print(anomalies)
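
One possible way to fill in the two blanks is sketched below. It is not necessarily the course's reference solution. It assumes KMeans is scikit-learn's sklearn.cluster.KMeans and that "distance" means the Euclidean norm of the difference between each point and its assigned center. The confidences array here is a hypothetical stand-in for the preloaded variable, shaped (n_samples, 1) because KMeans expects 2-D input. The n_init=10 argument is also an assumption, added only to pin the number of restarts across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans


def detect_anomalies(data, n_clusters=3):
    # Initialize k-means with a fixed seed so results are reproducible
    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # centers[clusters] picks, for every row, the center of its assigned
    # cluster; the row-wise norm of the difference is the distance to it
    distances = np.linalg.norm(data - centers[clusters], axis=1)
    return distances


# Hypothetical stand-in for the preloaded low-confidence scores:
# 20 responses, one confidence value each, shaped (n_samples, 1)
rng = np.random.default_rng(0)
confidences = rng.uniform(0.1, 0.5, size=(20, 1))

anomalies = detect_anomalies(confidences)
print(anomalies)
```

The function returns one non-negative distance per response. Larger values mark responses that sit far from the center of their cluster, which makes them candidates for outlier review. The cutoff for "far" is left to the reader, since the exercise does not specify one.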