Build and assess a model: product reviews data

In this exercise, you will build a logistic regression model using the reviews dataset, which contains customers' reviews of Amazon products. The array y contains the sentiment: 1 if positive and 0 otherwise. The array X contains all numeric features, created using a bag-of-words (BOW) approach. Feel free to explore them in the IPython Shell.

Your task is to build a logistic regression model and calculate the accuracy and confusion matrix using the test dataset.

The logistic regression and train/test splitting functions have been imported for you.

Import the accuracy score and confusion matrix functions.
Split the data into training and testing sets, using 30% of it as the test set, and set the random seed to 42.
Train a logistic regression model.
Print out the accuracy score and confusion matrix using the test data.
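
The steps above can be sketched roughly as follows. This is one possible approach, not the official solution. In the exercise, X and y are already loaded; here they are replaced by small synthetic arrays so the snippet runs on its own. The stand-in data, the max_iter setting, and the names log_reg, y_predicted, acc and cm are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Stand-in data (assumption): synthetic word counts and binary labels.
# In the exercise, X and y are provided, so this part would be skipped.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 10))
y = (X[:, 0] + X[:, 1] > 2).astype(int)

# Hold out 30% of the rows as the test set, with a fixed seed of 42.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the model on the training data only.
log_reg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Predict on the held-out test data and evaluate.
y_predicted = log_reg.predict(X_test)
acc = accuracy_score(y_test, y_predicted)
cm = confusion_matrix(y_test, y_predicted, labels=[0, 1])

print("Accuracy score on the test set:", acc)
print(cm)
```

In the confusion matrix returned by scikit-learn, rows are the true labels and columns are the predicted labels, so the diagonal holds the correctly classified reviews.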
