1. Learn
  2. /
  3. Courses
  4. /
  5. Feature Engineering for Machine Learning in Python

Exercise

High level text features

Once the text has been cleaned and standardized you can begin creating features from the data. The most fundamental information you can calculate about free form text is its size, such as its length and number of words. In this exercise (and the rest of this chapter), you will focus on the cleaned/transformed text column (text_clean) you created in the last exercise.

Instructions

100 XP
  • Record the character length of each speech in the char_count column.
  • Record the word count of each speech in the word_count column.
  • Record the average word length of each speech in the avg_word_length column.