K-means for feedback clustering
You have a dataset of feedback responses, and you've used a GPT model to calculate confidence scores for each response. To identify unusual or outlier feedback, you apply k-means clustering to the low-confidence responses.
The KMeans algorithm, the reviews and confidences variables, and the np library have been preloaded.
This exercise is part of the course Reinforcement Learning from Human Feedback (RLHF)
Exercise instructions
- Initialize the k-means algorithm. Set the random_state to 42 for code reproducibility.
- Calculate distances from cluster centers to identify outliers as the difference between data and the corresponding cluster centers.
Hands-on interactive exercise
Try this exercise by completing the sample code.
def detect_anomalies(data, n_clusters=3):
    # Initialize k-means
    ____
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Calculate distances from cluster centers
    ____
    return distances

anomalies = detect_anomalies(confidences)
print(anomalies)
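One possible way to fill in the two blanks is sketched below. It is not the official solution. It assumes that the distance is the Euclidean norm of the difference between each point and its assigned center, and that confidences is a 2-D array of shape (n_samples, 1). The sample scores, the n_init value, and the 90th-percentile threshold are hypothetical choices made only so the sketch runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans


def detect_anomalies(data, n_clusters=3):
    # Initialize k-means with a fixed seed for reproducibility
    # (n_init=10 is an assumption, set explicitly for consistent behavior)
    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    clusters = kmeans.fit_predict(data)
    centers = kmeans.cluster_centers_
    # Distance from each point to the center of its assigned cluster
    distances = np.linalg.norm(data - centers[clusters], axis=1)
    return distances


# Hypothetical confidence scores, reshaped to (n_samples, 1) for scikit-learn
confidences = np.array(
    [0.10, 0.12, 0.11, 0.30, 0.31, 0.29, 0.50, 0.52, 0.95]
).reshape(-1, 1)

anomalies = detect_anomalies(confidences)
print(anomalies)

# Hypothetical follow-up: flag points whose distance exceeds the 90th percentile
threshold = np.percentile(anomalies, 90)
flagged = anomalies > threshold
print(flagged)
```

Larger distances mark responses that sit far from the center of their cluster, which makes them candidates for outlier review. The cutoff used to flag them is a judgment call and would need tuning on real data.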