Exercise

Airline sentiment with stop words

You are given a dataset called tweets, which contains customers' reviews of airlines and the sentiment of each review. It has two columns: airline_sentiment, which can be positive, negative, or neutral, and text, which holds the text of the tweet.

In this exercise, you will create a bag-of-words (BOW) representation that accounts for stop words. Remember that stop words carry little information, so you might want to remove them. Removing them results in a smaller vocabulary and, in turn, fewer features. Keep in mind that you can enrich a default list of stop words with ones that are specific to your context.
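
To make the effect on vocabulary size concrete, here is a minimal pure-Python sketch. The toy tweets, the tiny stop word set, and the vocabulary helper are all invented for illustration; they are not the exercise data or a real default stop word list.

```python
# Toy tweets invented for illustration; not taken from the tweets dataset.
toy_tweets = [
    "the airline lost my bag",
    "the flight was on time and the crew was great",
]

# A tiny hand-picked stop word set (an assumption; real default lists are much longer).
default_stop_words = {"the", "my", "was", "on", "and"}

# Enrich the default set with words specific to the airline context.
my_stop_words = default_stop_words | {"airline", "airlines"}


def vocabulary(docs, stop_words=frozenset()):
    """Return the sorted distinct whitespace tokens in docs, skipping stop words."""
    return sorted(
        {tok for doc in docs for tok in doc.lower().split() if tok not in stop_words}
    )


full_vocab = vocabulary(toy_tweets)
reduced_vocab = vocabulary(toy_tweets, my_stop_words)

# Each vocabulary entry becomes one BOW feature, so fewer entries means fewer features.
print(len(full_vocab), "tokens without filtering")
print(len(reduced_vocab), "tokens after removing stop words:", reduced_vocab)
```

Every distinct token becomes one column of the BOW matrix, so the shorter vocabulary translates directly into fewer features.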

Instructions
  • Import the default list of English stop words.
  • Update the default list of stop words with the given list ['airline', 'airlines', '@'] to create my_stop_words.
  • Pass my_stop_words to the vectorizer through its stop words argument.
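
One way the three steps above could look, assuming the exercise uses scikit-learn's CountVectorizer (the page does not name the library, so treat this as a sketch rather than the reference solution). The texts list is a hypothetical stand-in for tweets.text.

```python
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

# Hypothetical stand-in for tweets.text; these rows are invented for illustration.
texts = [
    "@united the airline lost my bag",
    "Great crew and friendly airlines staff",
]

# Step 1 and 2: start from the default English stop words and add the
# context-specific ones given in the instructions.
my_stop_words = ENGLISH_STOP_WORDS.union(["airline", "airlines", "@"])

# Step 3: pass the enriched stop words to the vectorizer.
# ENGLISH_STOP_WORDS is a frozenset; converting to a list is a precaution,
# since some scikit-learn versions only accept a list here.
vect = CountVectorizer(stop_words=list(my_stop_words))
X = vect.fit_transform(texts)

# vocabulary_ maps each kept token to its column index in X.
print(sorted(vect.vocabulary_))
print(X.shape)
```

With the default token pattern, CountVectorizer only keeps tokens of two or more word characters, so "@" would likely never become a feature even without listing it; including it follows the instructions and does no harm.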