Higher order n-grams for sentiment analysis
Similar to a previous exercise, we are going to build a classifier that can detect whether the review of a particular movie is positive or negative. However, this time, we will use n-grams up to n=2 (unigrams and bigrams) for the task.
The n-gram training reviews are available as X_train_ng. The corresponding test reviews are available as X_test_ng. Finally, use y_train and y_test to access the training and test sentiment classes respectively.
This exercise is part of the course
Feature Engineering for NLP in Python
Instructions
- Define an instance of MultinomialNB. Name it clf_ng.
- Fit the classifier on X_train_ng and y_train.
- Measure accuracy on X_test_ng and y_test using the score() method.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Define an instance of MultinomialNB
clf_ng = ____
# Fit the classifier
clf_ng.____(____, ____)
# Measure the accuracy
accuracy = ____
print("The accuracy of the classifier on the test set is %.3f" % accuracy)
# Predict the sentiment of a negative review
review = "The movie was not good. The plot had several holes and the acting lacked panache."
prediction = clf_ng.predict(ng_vectorizer.transform([review]))[0]
print("The sentiment predicted by the classifier is %i" % (prediction))
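The scaffold above relies on objects preloaded by the exercise environment. To try the same workflow outside it, the following self-contained sketch may help. The toy reviews, the label encoding (1 for positive, 0 for negative), and the way ng_vectorizer is built are all assumptions made for illustration; they are not the course's data or setup.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data; assumed encoding: 1 = positive, 0 = negative
train_reviews = [
    "a wonderful film with great acting",
    "great plot and wonderful cast",
    "the movie was not good",
    "terrible plot and bad acting",
]
y_train = [1, 1, 0, 0]
test_reviews = ["wonderful acting and great plot", "bad movie, not good"]
y_test = [1, 0]

# Build unigram and bigram count features (assumed setup)
ng_vectorizer = CountVectorizer(ngram_range=(1, 2))
X_train_ng = ng_vectorizer.fit_transform(train_reviews)
X_test_ng = ng_vectorizer.transform(test_reviews)

# Define and fit the Naive Bayes classifier
clf_ng = MultinomialNB()
clf_ng.fit(X_train_ng, y_train)

# score() returns the mean accuracy on the given data and labels
accuracy = clf_ng.score(X_test_ng, y_test)
print("The accuracy of the classifier on the test set is %.3f" % accuracy)

# Predict the sentiment of a new review
review = "The movie was not good. The plot had several holes and the acting lacked panache."
prediction = clf_ng.predict(ng_vectorizer.transform([review]))[0]
print("The sentiment predicted by the classifier is %i" % prediction)
```

With only four training reviews the numbers printed here say nothing about real performance; the point is the sequence of calls: vectorize, fit, score, predict.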