Predicting the sentiment of a movie review

In the previous exercise, you generated the bag-of-words representations for the training and test movie review data. In this exercise, you will use these representations to train a Naive Bayes classifier that can detect the sentiment of a movie review, and then compute its accuracy. Note that since this is a binary classification problem, the model can only classify a review as either positive (1) or negative (0). It cannot detect neutral reviews.

In case you don't recall, the training and test BoW vectors are available as X_train_bow and X_test_bow respectively. The corresponding labels are available as y_train and y_test respectively. Also, for your reference, the original movie review dataset is available as df.

Instantiate an object of MultinomialNB. Name it clf.
Fit clf using X_train_bow and y_train.
Measure the accuracy of clf using X_test_bow and y_test.
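
The three steps above can be sketched as follows. This is a minimal illustration, not the course's solution: the exercise environment already provides X_train_bow, X_test_bow, y_train and y_test, so the toy reviews, the labels and the CountVectorizer step below are assumptions added only to make the snippet runnable on its own.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the course's movie review dataset.
train_reviews = [
    "a wonderful and moving film",
    "great acting and a great story",
    "a boring and terrible movie",
    "awful plot and bad acting",
]
y_train = [1, 1, 0, 0]  # 1 = positive, 0 = negative

test_reviews = ["a great and moving story", "a terrible and boring plot"]
y_test = [1, 0]

# Assumed preprocessing: rebuild the bag-of-words vectors that the
# previous exercise would have produced.
vectorizer = CountVectorizer()
X_train_bow = vectorizer.fit_transform(train_reviews)
X_test_bow = vectorizer.transform(test_reviews)

# Step 1: instantiate a MultinomialNB object named clf.
clf = MultinomialNB()

# Step 2: fit clf using the training vectors and labels.
clf.fit(X_train_bow, y_train)

# Step 3: measure accuracy on the test vectors and labels.
accuracy = clf.score(X_test_bow, y_test)
print("Accuracy on the test set: %.3f" % accuracy)

# Optional: predict the sentiment of a new, unseen review.
new_review = "the story was great"
prediction = clf.predict(vectorizer.transform([new_review]))[0]
print("Predicted sentiment: %d" % prediction)
```

The score method returns the mean accuracy on the given data, so it covers the last instruction directly. A new review has to be transformed with the same fitted vectorizer before it is passed to predict, because the classifier expects the vocabulary it was trained on.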
