Comparing models on labeled review data
Now that you can classify sentiment in bulk, your team wants to evaluate which model is more reliable. You'll compare two models using a larger labeled dataset of reviews and measure their accuracy.
A texts list and its corresponding true_labels are pre-loaded for you.
This exercise is part of the course Natural Language Processing (NLP) in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
from transformers import pipeline
from sklearn.metrics import accuracy_score
# Load sentiment analysis models
pipe_a = pipeline(task="sentiment-analysis", ____)
pipe_b = pipeline(task="sentiment-analysis", ____)
# Generate predictions
preds_a = [____ for res in pipe_a(texts)]
preds_b = [____ for res in pipe_b(texts)]
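To see what the completed evaluation step looks like, here is a minimal sketch using mocked pipeline outputs instead of real models. Hugging Face sentiment pipelines return a list of dicts of the form `{"label": ..., "score": ...}`; the labels and results below are hypothetical stand-ins for the pre-loaded `texts` and `true_labels`, and the accuracy is computed by hand to mirror what `accuracy_score` would return.

```python
# Mocked outputs in the shape a Hugging Face sentiment pipeline returns:
# a list of dicts like {"label": "POSITIVE", "score": 0.99}.
# These values are placeholders, not real model predictions.
true_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE"]
results_a = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.95},
    {"label": "POSITIVE", "score": 0.90},
    {"label": "POSITIVE", "score": 0.60},  # one misclassification
]

# Extract the predicted label from each result dict,
# as in the list comprehensions above
preds_a = [res["label"] for res in results_a]

# Accuracy = fraction of predictions that match the true labels,
# which is what sklearn.metrics.accuracy_score computes
accuracy_a = sum(p == t for p, t in zip(preds_a, true_labels)) / len(true_labels)
print(accuracy_a)  # 0.75
```

Repeating the same two steps for the second model's results and comparing the two accuracy values tells you which model is more reliable on this dataset.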