1. Step 3: Text organization
In step 3, you organize your text, which begins with cleaning it.
2. Text organization with qdap
In this chapter, we wrap the qdap functions in a custom qdap_clean function. qdap functions can be applied directly to a text vector rather than to a corpus object. In qdap_clean, x is a vector of employee reviews; the first preprocessing step is replace_abbreviation, followed by replace_contraction, and so on.
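As a rough illustration, a qdap_clean function along these lines might look like the sketch below. Only the first two steps (replace_abbreviation, then replace_contraction) come from the description above; the remaining steps (replace_number, replace_symbol, tolower) are assumptions standing in for "and so on", and the sample reviews are made up.

```r
library(qdap)

# Sketch of a text-level cleaning function; x is a character vector of reviews.
qdap_clean <- function(x) {
  x <- replace_abbreviation(x)  # step named in the text
  x <- replace_contraction(x)   # step named in the text
  x <- replace_number(x)        # assumed further step
  x <- replace_symbol(x)        # assumed further step
  x <- tolower(x)               # assumed further step
  x
}

# Made-up reviews, passed in as a plain character vector (no corpus needed).
reviews <- c("I don't mind the long hours.",
             "The team has 3 managers & great benefits.")
cleaned <- qdap_clean(reviews)
print(cleaned)
```

Because the function works on a plain vector, it can be run on a data frame column before any corpus is built.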
3. Text organization with tm
For the tm library, we have a slightly more familiar cleaning function, tm_clean. It takes a VCorpus and applies removePunctuation, then stripWhitespace, and finally removeWords, which removes common English stopwords along with the custom terms "Google", "amazon", and "company".
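A minimal sketch of such a tm_clean function, following the order of steps described above, could look like this. The custom terms are copied as written in the text, the use of stopwords("en") for "common English" words is an assumption, and the sample review is invented for illustration.

```r
library(tm)

# Custom terms as given in the text. removeWords matches case-sensitively,
# so these should match the casing of the text being cleaned.
custom_terms <- c("Google", "amazon", "company")

# Sketch of a corpus-level cleaning function; corpus is a VCorpus.
tm_clean <- function(corpus) {
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, stripWhitespace)
  # Assumption: "common English" words are taken from stopwords("en").
  corpus <- tm_map(corpus, removeWords, c(stopwords("en"), custom_terms))
  corpus
}

# Made-up review wrapped in a VCorpus.
reviews <- c("Google is a great company, with smart people!")
corpus <- VCorpus(VectorSource(reviews))
cleaned_corpus <- tm_clean(corpus)
cleaned_text <- content(cleaned_corpus[[1]])
print(cleaned_text)
```

Since removeWords runs last in this order, the cleaned text can still contain doubled spaces where words were removed.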
4. Cleaning your corpora
In this section, you will build four distinct cleaned corpora. First, you apply the cleaning functions to the Amazon pros and cons reviews. Then you do the same for the Google pros and cons.
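One way to picture the four-corpus workflow is the sketch below. Everything named here is hypothetical: the helper make_clean_corpus, the data frames amzn and goog with pros and cons columns, and the toy cleaners. The toy cleaners only keep the example runnable on its own; in practice they would be replaced by qdap_clean and tm_clean.

```r
library(tm)

# Hypothetical helper: clean the raw text, build a VCorpus, clean the corpus.
make_clean_corpus <- function(text, text_cleaner, corpus_cleaner) {
  cleaned_text <- text_cleaner(text)
  corpus <- VCorpus(VectorSource(cleaned_text))
  corpus_cleaner(corpus)
}

# Toy stand-ins for qdap_clean and tm_clean (assumptions, not the real functions).
text_cleaner <- tolower
corpus_cleaner <- function(corpus) tm_map(corpus, removePunctuation)

# Made-up review data; the real column names are not given in the text.
amzn <- data.frame(pros = c("Good pay.", "Smart peers!"),
                   cons = c("Long hours.", "High pressure!"),
                   stringsAsFactors = FALSE)
goog <- data.frame(pros = c("Free food.", "Great perks!"),
                   cons = c("Slow promotions.", "Big bureaucracy!"),
                   stringsAsFactors = FALSE)

# One cleaned corpus per company and review type: four in total.
review_sets <- list(amzn_pros = amzn$pros, amzn_cons = amzn$cons,
                    goog_pros = goog$pros, goog_cons = goog$cons)
corpora <- lapply(review_sets, make_clean_corpus,
                  text_cleaner = text_cleaner,
                  corpus_cleaner = corpus_cleaner)
print(names(corpora))
```

Passing the cleaners as arguments keeps the same two-stage pipeline for all four review sets, so Amazon and Google text is processed identically.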
5. Let's practice!