K-means for feedback clustering
You have a dataset of feedback responses, and you've used a GPT model to calculate confidence scores for each response. To identify unusual or outlier feedback, you apply k-means clustering to the low-confidence responses.
The KMeans algorithm, the reviews and confidences variables, and the np library have been preloaded.
This exercise is part of the course
Reinforcement Learning from Human Feedback (RLHF)
Exercise instructions
- Initialize the k-means algorithm. Set the random_state to 42 for code reproducibility.
- Calculate each point's distance from its assigned cluster center, based on the difference between data and the corresponding cluster centers, to identify outliers.
Hands-on interactive exercise
Try this exercise by completing this sample code.
def detect_anomalies(data, n_clusters=3):
    # Initialize k-means
    ____
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Calculate distances from cluster centers
    ____
    return distances

anomalies = detect_anomalies(confidences)
print(anomalies)
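
One possible way to fill in the two blanks is sketched below. It is not the official solution. It assumes KMeans is scikit-learn's sklearn.cluster.KMeans, that the distance is the Euclidean norm of the difference between each point and its assigned center, and that the data is a 2D array with one row per response. The confidences array here is a hypothetical stand-in for the preloaded variable, and n_init=10 is an added assumption to keep behavior consistent across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans


def detect_anomalies(data, n_clusters=3):
    # Initialize k-means; random_state=42 makes the clustering reproducible
    # (n_init=10 is an assumption, not part of the exercise text)
    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # centers[clusters] picks, for every row, the center of its assigned cluster;
    # the row-wise Euclidean norm of the difference is the distance to that center
    distances = np.linalg.norm(data - centers[clusters], axis=1)
    return distances


# Hypothetical stand-in for the preloaded confidences (one score per row)
confidences = np.array(
    [[0.10], [0.12], [0.11], [0.30], [0.31], [0.29], [0.45], [0.46], [0.90]]
)
anomalies = detect_anomalies(confidences)
print(anomalies)
```

Larger values mark responses that sit further from their cluster center, so sorting by distance is one simple way to surface outlier candidates for review.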