Exercise

Preparing the document-term matrix (DTM)

You are given a dataframe df containing 90 abstracts of NSF awards. It has three columns: Abstract, AwardNumber, and field. Your task is to construct a document-term matrix from the abstracts with stop words filtered out, using AwardNumber as the document ID.

Instructions
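  • Split the Abstract column into tokens (one word per row).
  • Remove stop words.
  • Count the number of occurrences of each token in each document, identified by AwardNumber.
  • Create a document-term matrix from the counts, with one row per document and one column per term.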
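
The four steps above can be sketched in plain Python to show what each one produces. This is only an illustration under stated assumptions, not the exercise's expected solution: the two rows standing in for df, the tiny STOP_WORDS set, and the helper names tokenize and build_dtm are all hypothetical, and the exercise itself may expect a dedicated text-mining library instead.

```python
import re
from collections import Counter

# Hypothetical stand-in for df (the real data has 90 rows).
rows = [
    {"AwardNumber": "1000001", "field": "x",
     "Abstract": "The study of the ocean and the climate"},
    {"AwardNumber": "1000002", "field": "y",
     "Abstract": "A study of graphs and the networks of graphs"},
]

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "and", "of", "the"}


def tokenize(text):
    """Step 1: split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_dtm(rows, stop_words):
    """Return (doc_ids, vocabulary, matrix) with one matrix row per document."""
    counts = {}
    for row in rows:
        # Step 2: drop stop words.
        tokens = [t for t in tokenize(row["Abstract"]) if t not in stop_words]
        # Step 3: count occurrences per document, keyed by AwardNumber.
        counts[row["AwardNumber"]] = Counter(tokens)
    # Step 4: lay the counts out as a dense documents-by-terms matrix.
    vocabulary = sorted({term for c in counts.values() for term in c})
    doc_ids = list(counts)
    matrix = [[counts[d][term] for term in vocabulary] for d in doc_ids]
    return doc_ids, vocabulary, matrix


doc_ids, vocabulary, matrix = build_dtm(rows, STOP_WORDS)
for doc_id, row in zip(doc_ids, matrix):
    print(doc_id, dict(zip(vocabulary, row)))
```

A dense list of lists is fine for a sketch of this size. With 90 real abstracts most cells would be zero, which is why document-term matrices are usually stored in a sparse format.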