Where can you observe Zipf's law?
Although Zipf observed a steep and predictable decline in word usage, you may not buy into Zipf's law. You may be thinking, "I know plenty of words and have a distinctive vocabulary." That may be the case, but the same can't be said for most people! To see this, let's construct a visual from 3 million tweets mentioning "#sb". Keep in mind that the visual doesn't follow Zipf's law perfectly: the tweets all mention the same hashtag, so the word frequencies are a bit skewed. That said, the visual you will make shows a steep decline, which indicates low lexical diversity among the millions of tweets. So there is some science behind using lexicons for natural language analysis!
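For reference, the classic form of Zipf's law can be written as below. This assumes an exponent of exactly 1, which is the form the expectations column in this exercise uses; here f(r) stands for the frequency of the word with rank r, a notation introduced only for this sketch.

```latex
f(r) \approx \frac{f(1)}{r}, \qquad r = 1, 2, 3, \dots
```

Under this form, the second most common word is expected to appear about half as often as the most common one, the third about a third as often, and so on.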
In this exercise, you will use the metricsgraphics package. Although the author suggests using the pipe operator, %>%, you will construct the graphic step by step to learn about the various aspects of the plot. The package's main function is mjs_plot(), which is the first step in creating a JavaScript plot. Once you have that, you can add other layers on top of the plot.
An example metricsgraphics workflow without using the %>% operator is below:
metro_plot <- mjs_plot(data, x = x_axis_name, y = y_axis_name, show_rollover_text = FALSE)
metro_plot <- mjs_line(metro_plot)
metro_plot <- mjs_add_line(metro_plot, line_one_values)
metro_plot <- mjs_add_legend(metro_plot, legend = c('names', 'more_names'))
metro_plot
This exercise is part of the course
Sentiment Analysis in R
Exercise instructions
- Use head() on sb_words to review top words.
- Create a new column expectations by dividing the largest word frequency, freq[1], by the rank column.
- Start sb_plot using mjs_plot().
  - Pass in sb_words with x = rank and y = freq.
  - Within mjs_plot() set show_rollover_text to FALSE.
- Overwrite sb_plot using mjs_line() and pass in sb_plot.
- Add to sb_plot with mjs_add_line().
  - Pass in the previous sb_plot object and the vector, expectations.
- Place a legend on a new sb_plot object using mjs_add_legend().
  - Pass in the previous sb_plot object.
  - The legend labels should consist of "Frequency" and "Expectation".
- Call sb_plot to display the plot. Mouse over a point to simultaneously highlight a freq and Expectation point. The magic of JavaScript!
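To make the expectations step concrete, here is a minimal base R sketch on a toy data frame. The name toy_words and all of its counts are made up for illustration and are not taken from the tweet data; only the freq[1] / rank calculation comes from the instructions above.

```r
# Toy data: made-up counts for illustration only, not the real tweet data.
toy_words <- data.frame(
  word = c("the", "to", "game", "ad", "win"),
  freq = c(1200, 640, 390, 310, 230),
  rank = 1:5
)

# Zipf-style expectation: the top frequency divided by each word's rank.
toy_words$expectations <- toy_words$freq[1] / toy_words$rank

# Show observed and expected counts side by side.
print(toy_words)
```

Comparing the freq and expectations columns row by row shows how closely the observed counts track the idealized decline, which is what the two lines in the plot compare visually.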
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Examine sb_words
___
# Create expectations
sb_words$expectations <- sb_words %$%
{___[___] / ___}
# Create metrics plot
sb_plot <- ___(___, x = ___, y = ___, ___ = ___)
# Add 1st line
sb_plot <- ___(___)
# Add 2nd line
sb_plot <- ___(___, ___)
# Add legend
sb_plot <- ___(___, legend = c("___", "___"))
# Display plot
sb_plot