Higher order n-grams for sentiment analysis

As in a previous exercise, we are going to build a classifier that detects whether a movie review is positive or negative. This time, however, we will use n-grams up to n=2 (unigrams and bigrams) as features.

The n-gram training reviews are available as X_train_ng. The corresponding test reviews are available as X_test_ng. Finally, use y_train and y_test to access the training and test sentiment classes respectively.
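The exercise provides X_train_ng and X_test_ng ready-made and does not show how they were built. As a rough sketch, assuming they come from scikit-learn's CountVectorizer with ngram_range=(1, 2), features of this kind could be produced as follows. The two-review corpus here is made up purely for illustration and is not the course data.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-ins for the movie reviews (illustration only, not the course data).
train_reviews = ["the movie was good", "the movie was not good"]
test_reviews = ["not good at all"]

# ngram_range=(1, 2) keeps unigrams and bigrams, i.e. n-grams up to n=2.
ng_vectorizer = CountVectorizer(ngram_range=(1, 2))

# Learn the n-gram vocabulary on the training text only, then reuse it for the test text.
X_train_ng = ng_vectorizer.fit_transform(train_reviews)
X_test_ng = ng_vectorizer.transform(test_reviews)

# The vocabulary now contains bigrams such as "not good" alongside single words.
print(X_train_ng.shape)
print(sorted(ng_vectorizer.vocabulary_))
```

Bigrams such as "not good" let the model see negation that single-word counts would miss, which is the motivation for going beyond n=1.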

This exercise is part of the course

Feature Engineering for NLP in Python

Exercise instructions

  • Define an instance of MultinomialNB and name it clf_ng.
  • Fit the classifier on X_train_ng and y_train.
  • Measure the accuracy on X_test_ng and y_test using the score() method.
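The three steps above can be sketched end to end on a toy dataset. This is a minimal illustration and not the course's reference solution: the reviews and labels below are invented (1 for positive, 0 for negative), and the features are assumed to be unigram and bigram counts from CountVectorizer.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented reviews and labels (1 = positive, 0 = negative), for illustration only.
train_reviews = ["great film", "loved it", "not good", "bad plot"]
y_train = [1, 1, 0, 0]
test_reviews = ["great film overall", "plot was not good"]
y_test = [1, 0]

# Assumed feature extraction: unigram and bigram counts.
ng_vectorizer = CountVectorizer(ngram_range=(1, 2))
X_train_ng = ng_vectorizer.fit_transform(train_reviews)
X_test_ng = ng_vectorizer.transform(test_reviews)

# Step 1: define an instance of MultinomialNB.
clf_ng = MultinomialNB()

# Step 2: fit the classifier on the training features and labels.
clf_ng.fit(X_train_ng, y_train)

# Step 3: score() returns the mean accuracy on the given features and labels.
accuracy = clf_ng.score(X_test_ng, y_test)
print("The accuracy of the classifier on the test set is %.3f" % accuracy)
```

On a dataset this small the accuracy figure says little; the point is only the order of the calls: construct, fit(), then score().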

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Define an instance of MultinomialNB 
clf_ng = ____

# Fit the classifier 
clf_ng.____(____, ____)

# Measure the accuracy 
accuracy = ____
print("The accuracy of the classifier on the test set is %.3f" % accuracy)

# Predict the sentiment of a negative review
review = "The movie was not good. The plot had several holes and the acting lacked panache."
prediction = clf_ng.predict(ng_vectorizer.transform([review]))[0]
print("The sentiment predicted by the classifier is %i" % (prediction))