Exercise

Product reviews with regularization

In this exercise, you will work once more with the reviews dataset, which contains Amazon product reviews. The vector of labels y contains the sentiment: 1 if positive and 0 otherwise. The matrix X contains all numeric features created using a bag-of-words (BOW) approach.

You will need to train two logistic regression models with different levels of regularization and compare how they perform on the test data. Remember that regularization is a way to control the complexity of the model. The more regularized a model is, the less flexible it is, but the better it tends to generalize. Models with a higher level of regularization are often less accurate than non-regularized ones.
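
As a rough illustration of what "more regularized" means, one common formulation adds an L2 penalty to the logistic loss. This is an assumption made for illustration: the exercise does not say which penalty is used, and the symbols w, b, n and λ are introduced here.

```latex
J(w, b) = -\frac{1}{n} \sum_{i=1}^{n} \Big[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Big]
          + \lambda \lVert w \rVert_2^2,
\qquad
p_i = \frac{1}{1 + e^{-(w^\top x_i + b)}}
```

Here λ ≥ 0 is the penalty weight: a larger λ shrinks the weights w toward zero, which gives a simpler, less flexible model. Some libraries, scikit-learn among them, expose the inverse of this strength instead (a parameter named C), so a large value there means weak regularization.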

Instructions
  • Split the data into train and test sets.
  • Train a logistic regression with a regularization parameter of 1000. Train a second logistic regression with a regularization parameter of 0.001.
  • Print the accuracy scores of both models on the test set.
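
The steps above could be sketched as follows, assuming scikit-learn is the library in use (the exercise text does not name one). The reviews data is not available here, so a synthetic X and y from make_classification stand in for the BOW matrix and the sentiment labels. The test size, random seeds, max_iter value and the names log_reg1 and log_reg2 are illustrative choices, not part of the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): X plays the role of the bag-of-words
# feature matrix, y the 0/1 sentiment labels.
X, y = make_classification(
    n_samples=1000, n_features=50, n_informative=10, random_state=42
)

# Hold out 20% of the rows for testing; stratify keeps the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123, stratify=y
)

# In scikit-learn, C is the INVERSE of the regularization strength:
# C=1000 barely regularizes, C=0.001 regularizes heavily.
log_reg1 = LogisticRegression(C=1000, max_iter=1000).fit(X_train, y_train)
log_reg2 = LogisticRegression(C=0.001, max_iter=1000).fit(X_train, y_train)

# Accuracy of each model on the held-out test set.
acc1 = accuracy_score(y_test, log_reg1.predict(X_test))
acc2 = accuracy_score(y_test, log_reg2.predict(X_test))

print("Accuracy of logistic regression (C=1000): ", acc1)
print("Accuracy of logistic regression (C=0.001):", acc2)
```

Because C is an inverse strength in scikit-learn, the second model is the heavily regularized one. Comparing log_reg1.coef_ with log_reg2.coef_ should show much smaller weights for the second model.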