Predicting the sentiment of a movie review

In the previous exercise, you generated the bag-of-words representations for the training and test movie review data. In this exercise, we will use this model to train a Naive Bayes classifier that can detect the sentiment of a movie review and compute its accuracy. Note that since this is a binary classification problem, the model is only capable of classifying a review as either positive (1) or negative (0). It is incapable of detecting neutral reviews.

In case you don't recall, the training and test BoW vectors are available as X_train_bow and X_test_bow respectively. The corresponding labels are available as y_train and y_test respectively. Also, for you reference, the original movie review dataset is available as df.

Deze oefening maakt deel uit van de cursus

Feature Engineering for NLP in Python

Cursus bekijken

Oefeninstructies

Instantiate an object of MultinomialNB. Name it clf.
Fit clf using X_train_bow and y_train.
Measure the accuracy of clf using X_test_bow and y_test.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Create a MultinomialNB object
clf = ____

# Fit the classifier
clf.____(____, ____)

# Measure the accuracy
accuracy = clf.score(____, ____)
print("The accuracy of the classifier on the test set is %.3f" % accuracy)

# Predict the sentiment of a negative review
review = "The movie was terrible. The music was underwhelming and the acting mediocre."
prediction = clf.predict(vectorizer.transform([review]))[0]
print("The sentiment predicted by the classifier is %i" % (prediction))

Code bewerken en uitvoeren