Step 3: Text organization

1. Step 3: Text organization

In step 3, you organize your text, which starts with cleaning it.

2. Text organization with qdap

In this chapter, we collect the qdap functions into a custom qdap_clean function. qdap functions can be applied directly to a text vector rather than to a corpus object. In qdap_clean, x is a vector of employee reviews; the first preprocessing step applies replace_abbreviation, then replace_contraction, and so on.
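As a rough sketch of what such a function might look like, the example below chains only the two steps named above. The sample reviews are made up for illustration, and the remaining steps are left as a comment because they are not listed here.

```r
library(qdap)

# Sketch of a custom cleaning function built from qdap helpers.
# x is a plain character vector, not a corpus object.
qdap_clean <- function(x) {
  x <- replace_abbreviation(x)  # expand abbreviations such as "Mr."
  x <- replace_contraction(x)   # expand contractions such as "don't"
  # Further qdap replacement steps would follow here.
  x
}

# Hypothetical employee reviews, invented for this example
reviews <- c("Mr. Lee is a helpful manager.",
             "I don't like the long hours.")

cleaned <- qdap_clean(reviews)
print(cleaned)
```

Because each step takes a character vector and returns one, the steps can be reordered or extended without changing how the function is called.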

3. Text organization with tm

For the tm library, we have a slightly more familiar cleaning function, tm_clean. This function takes a VCorpus and applies removePunctuation, then stripWhitespace, and finally removeWords, which drops common English stop words along with the custom terms "Google", "amazon", and "company".
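A minimal sketch of tm_clean following the order described above might look like this. The sample reviews are invented, and using stopwords("en") for the common English words is an assumption. Note that removeWords matches case-sensitively, so the custom terms only remove words written with exactly that casing.

```r
library(tm)

# Sketch of a corpus-cleaning function; expects a VCorpus.
tm_clean <- function(corpus) {
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, stripWhitespace)
  # stopwords("en") is assumed here as the list of common English words
  corpus <- tm_map(corpus, removeWords,
                   c(stopwords("en"), "Google", "amazon", "company"))
  corpus
}

# Hypothetical reviews, invented for this example
reviews <- c("Google is a great company, with smart people!",
             "The hours at amazon are long.")

corp <- VCorpus(VectorSource(reviews))
cleaned <- tm_clean(corp)

# Inspect the cleaned text of each document
print(content(cleaned[[1]]))
print(content(cleaned[[2]]))
```

Since removeWords runs last in this order, the cleaned text can still contain extra spaces where words were dropped.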

4. Cleaning your corpora

In this section, you will clean four distinct corpora. To make them, you apply the cleaning functions to the Amazon pros and cons reviews first, and then to the Google pros and cons reviews.
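The sketch below shows one way the two cleaning functions could be combined to build the four corpora. The review vectors (amzn_pros, amzn_cons, goog_pros, goog_cons) and the helper make_clean_corpus are hypothetical names introduced for illustration, and the function bodies are shortened versions that contain only the steps named in this chapter.

```r
library(qdap)
library(tm)

# Shortened cleaning functions, limited to the steps named in the text
qdap_clean <- function(x) {
  x <- replace_abbreviation(x)
  x <- replace_contraction(x)
  x
}

tm_clean <- function(corpus) {
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, stripWhitespace)
  corpus <- tm_map(corpus, removeWords,
                   c(stopwords("en"), "Google", "amazon", "company"))
  corpus
}

# Hypothetical helper: clean the text vector, wrap it in a VCorpus,
# then clean the corpus.
make_clean_corpus <- function(text) {
  tm_clean(VCorpus(VectorSource(qdap_clean(text))))
}

# Hypothetical review vectors, invented for this example
amzn_pros <- c("Great benefits.", "Smart coworkers.")
amzn_cons <- c("Long hours.", "I can't take much time off.")
goog_pros <- c("Free food.", "Interesting projects.")
goog_cons <- c("It's hard to get promoted.", "Lots of meetings.")

# One cleaned corpus per company and review type
corpora <- lapply(
  list(amzn_pros = amzn_pros, amzn_cons = amzn_cons,
       goog_pros = goog_pros, goog_cons = goog_cons),
  make_clean_corpus
)
print(names(corpora))
```

Keeping the four results in a named list makes it easy to apply the same later step to every corpus.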

5. Let's practice!