Exercise

AFINN: I'm your Huckleberry

Now we transition to the AFINN lexicon. Rather than labeling words as simply positive or negative, AFINN assigns each word a numeric score from -5 to 5. Unlike the Bing lexicon, whose column is called sentiment, the AFINN lexicon stores its score in a column called value.

As before, you apply inner_join() then count(). Next, to sum the scores of each line, you use dplyr's group_by() and summarize() functions. The group_by() function takes an existing data frame and converts it into a grouped data frame where operations are performed "by group". Then, the summarize() function calculates a single value for each group using a function that aggregates data, like sum() or mean(). So, in our case, you can do something like:

data_frame %>% 
    group_by(book_line) %>% 
    summarize(total_score = sum(value))

In the tidy version of Huckleberry Finn, line 9703 contains the words "best", "ever", "fun", "life" and "spirit". "best" and "fun" have AFINN scores of 3 and 4 respectively. After aggregating, line 9703 will have a total score of 7 (3 + 4).
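The aggregation described above can be sketched on a tiny data frame. This is a minimal illustration, not the exercise solution: the scored object and its line, term and value columns are made-up stand-ins for the joined data, and the row for line 1 is invented purely to show a second group.

```r
library(dplyr)

# Hypothetical stand-in for the joined data: one row per scored word.
# The scores for "best" and "fun" follow the text above; the row for
# line 1 is invented so that there is more than one group.
scored <- tibble(
  line  = c(9703, 9703, 1),
  term  = c("best", "fun", "bad"),
  value = c(3, 4, -3)
)

# Sum the score column (not the grouping column) within each line.
line_totals <- scored %>%
  group_by(line) %>%
  summarize(total_value = sum(value))

line_totals
```

With these inputs, the row for line 9703 carries a total of 7, matching the worked example.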

In the tidyverse, filter() is preferred to subset() because it combines the functionality of subset() with simpler syntax. Here is an example that filter()s data_frame where some value in column1 is equal to 24. Notice the column name is not in quotes.

filter(data_frame, column1 == 24)
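To see the call above in action, here is a small runnable sketch. The toy data frame and its column2 are assumptions made up for illustration; only the filter() call itself mirrors the example.

```r
library(dplyr)

# Hypothetical data frame; column names are bare (unquoted) inside filter().
toy <- tibble(
  column1 = c(12, 24, 24, 36),
  column2 = c("a", "b", "c", "d")
)

# Keep only the rows where column1 equals 24.
kept <- filter(toy, column1 == 24)

kept
```

The same pattern, with line in place of column1, is one way to inspect a single line of the text.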

The afinn object contains the AFINN lexicon. The huck object is a tidy version of Mark Twain's Adventures of Huckleberry Finn for analysis.

Line 5400 is "All the loafers looked glad; I reckoned they was used to having fun out of Boggs." Stopwords and punctuation have already been removed in the dataset.

Instructions 1/3
  • Run the code to look at line 5400, and see the sentiment scores of some words.
  • inner_join() huck to the afinn lexicon.
    • Remember huck is already piped into the function so just add the lexicon.
    • Join by the term column in the text and the word column in the lexicon.
  • Use count() with value and line to tally/count observations by group.
    • Assign the result to huck_afinn.
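The join-then-count pattern in these steps can be tried on miniature data first. This is a sketch under stated assumptions: huck_toy and afinn_toy are hypothetical stand-ins for huck and afinn, the surviving words on line 5400 are guessed for illustration, and the tiny lexicon lists only a few scores.

```r
library(dplyr)

# Hypothetical miniature versions of the text and the lexicon.
huck_toy <- tibble(
  line = c(5400, 5400, 5400, 5400),
  term = c("loafers", "glad", "fun", "boggs")
)
afinn_toy <- tibble(
  word  = c("glad", "fun", "best"),
  value = c(3, 4, 3)
)

huck_afinn_toy <- huck_toy %>%
  # Match the text's term column to the lexicon's word column;
  # words missing from the lexicon are dropped by the inner join.
  inner_join(afinn_toy, by = c("term" = "word")) %>%
  # Tally how often each score occurs on each line (adds a column n).
  count(value, line)

huck_afinn_toy
```

Because count() collapses repeated scores into a single row with a tally n, a later per-line total would probably need to weight each score by its count, for example sum(value * n), rather than sum(value) alone.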