Exercise

Cleaning text data

Now that you've defined the stopwords and punctuation, let's use them to further clean the Enron emails in the dataframe df. The lists containing stopwords and punctuation are available under stop and exclude. A few more steps are needed before the data is clean, such as lemmatizing the words and stemming the verbs. The verbs in the email data are already stemmed, and the lemmatization is already done for you in this exercise.

Instructions 1/2
  • Use the previously defined variables stop and exclude to finish off the function: strip trailing whitespace from the text using rstrip, then exclude stopwords and punctuation. Finally, lemmatize the words and assign the result to normalized.
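
The steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's official solution: the small stop set, the exclude set built from string.punctuation, the function name clean, and the lemmatize parameter are all assumptions made so the sketch runs on its own. The default lemmatize is an identity placeholder; in the exercise you would most likely pass the lemmatize method of an NLTK WordNetLemmatizer instance instead.

```python
import string

# Hypothetical stand-ins for the exercise's predefined variables.
stop = {"the", "a", "an", "and", "to", "of", "is", "in"}
exclude = set(string.punctuation)


def clean(text, stop, exclude, lemmatize=lambda word: word):
    """Strip whitespace, drop stopwords and punctuation, then lemmatize.

    `lemmatize` is any callable mapping a word to its lemma. The default
    is an identity placeholder (an assumption for this sketch).
    """
    # Remove trailing whitespace from the raw text.
    text = text.rstrip()
    # Lowercase, split on whitespace, and keep only non-stopwords.
    stop_free = " ".join(w for w in text.lower().split() if w not in stop)
    # Drop every punctuation character.
    punc_free = "".join(ch for ch in stop_free if ch not in exclude)
    # Lemmatize each remaining word and join back into one string.
    normalized = " ".join(lemmatize(w) for w in punc_free.split())
    return normalized


print(clean("Please send the reports to Bob, thanks!  ", stop, exclude))
```

Passing the lemmatizer in as an argument keeps the sketch independent of any downloaded corpus data. Note that stopwords are removed before punctuation here, so a stopword directly followed by punctuation (such as "the,") would not be filtered out.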