Exercise

Noun usage in fake news

In this exercise, you are given a dataframe, headlines, containing news headlines that are either fake or real. Your task is to generate two new features, num_propn and num_noun, that hold the number of proper nouns and the number of other nouns in the title feature of headlines.

Next, we will compute the mean number of proper nouns and other nouns used in fake and real news headlines and compare the values. If the means differ markedly, there is a good chance that adding the num_propn and num_noun features to a fake news detector will improve its performance.

To help you accomplish this task, the functions proper_nouns and nouns that you built in the previous exercise have already been made available to you.

Instructions 1/2

  • 1
    • Create a new feature num_propn by applying proper_nouns to headlines['title'].
    • Filter headlines to compute the mean number of proper nouns in fake news using the mean method.
  • 2
    • Repeat the process for other nouns: create a feature num_noun using nouns and compute the mean number of other nouns.
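
The two steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution. It assumes that headlines has a label column with the values 'FAKE' and 'REAL' (the column name and values are not given in the text), and it replaces the real proper_nouns and nouns functions with toy word-list counters so that the example runs without a part-of-speech tagger. The sample headlines are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the proper_nouns and nouns functions from the
# previous exercise. The real ones presumably use a part-of-speech tagger;
# these toy versions only look words up in fixed sets.
PROPER = {"Obama", "Trump", "NASA", "Mars"}
COMMON = {"president", "speech", "rocket", "aliens", "plans", "budget"}

def proper_nouns(text):
    """Count tokens that appear in the toy proper-noun set."""
    return sum(token in PROPER for token in text.split())

def nouns(text):
    """Count tokens that appear in the toy common-noun set."""
    return sum(token.lower() in COMMON for token in text.split())

# Made-up sample data; the 'label' column and its values are assumptions.
headlines = pd.DataFrame({
    "title": [
        "Obama gives speech on budget",
        "NASA hides aliens on Mars",
        "Trump reveals rocket plans",
        "president signs budget",
    ],
    "label": ["REAL", "FAKE", "FAKE", "REAL"],
})

# Step 1: create num_propn, then filter by label and take the mean.
headlines["num_propn"] = headlines["title"].apply(proper_nouns)
fake_propn = headlines[headlines["label"] == "FAKE"]["num_propn"].mean()
real_propn = headlines[headlines["label"] == "REAL"]["num_propn"].mean()

# Step 2: repeat for the other nouns.
headlines["num_noun"] = headlines["title"].apply(nouns)
fake_noun = headlines[headlines["label"] == "FAKE"]["num_noun"].mean()
real_noun = headlines[headlines["label"] == "REAL"]["num_noun"].mean()

print("Mean proper nouns (fake, real):", fake_propn, real_propn)
print("Mean other nouns (fake, real):", fake_noun, real_noun)
```

With the real helper functions and the full dataset, only the two apply calls and the filtered mean calls are needed; the toy counters and sample rows exist solely to make the sketch self-contained.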