
Using the "or pattern" with a larger dataset

Now that you understand the principle of concatenating multiple alternatives from a vector, you'll go one step further and apply it to a larger dataset. Two variables are available in the global scope: articles and politicians. The former is a data frame of news articles about Swiss politics, with the article text stored in the column text. The latter is a vector of names of Swiss politicians that appear in the articles.

Your job is to find out which names appear in which articles, and how many times each politician appears across all articles.
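As a reminder of the principle, here is a minimal sketch on toy data. The vectors names_vec and texts are hypothetical stand-ins invented for illustration; they are not the course's politicians and articles objects.

```r
library(glue)
library(stringr)

# Hypothetical toy data (assumption: not the course dataset)
names_vec <- c("Anna Muster", "Beat Beispiel")
texts <- c(
  "Anna Muster met Beat Beispiel in Bern.",
  "No politician is named here.",
  "Anna Muster spoke first, then Anna Muster left."
)

# Join the names with "|" so the regex matches any one of them
or_pattern <- glue_collapse(names_vec, sep = "|")
# or_pattern now holds: Anna Muster|Beat Beispiel

# str_match_all() returns a list with one character matrix per text;
# each row of a matrix is one match, and column 1 holds the full match
matches <- str_match_all(texts, or_pattern)

matches[[1]][, 1]   # both names appear in the first text
nrow(matches[[2]])  # 0, because the second text has no match
```

Note that the names are inserted into the pattern as-is, so this sketch assumes they contain no regex metacharacters such as "." or "(".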

This exercise is part of the course

Intermediate Regular Expressions in R


Exercise instructions

  • Use the vector politicians to create a regular expression that matches all the names that are stored in that vector.
  • Create a new column in the data frame articles which contains all politician names that appear in the column text.
  • Glue all articles together so you're able to count the number of occurrences per politician more easily.
  • Use the vector politicians as a pattern and pass it to str_count().
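The last two steps can be sketched on toy data as follows. This is one possible approach rather than the exercise's reference solution; names_vec and texts are hypothetical stand-ins, and joining with a single space is an assumption made here so that words from neighbouring texts do not run together.

```r
library(glue)
library(stringr)

# Hypothetical toy data (assumption: not the course dataset)
names_vec <- c("Anna Muster", "Beat Beispiel")
texts <- c(
  "Anna Muster met Beat Beispiel in Bern.",
  "No politician is named here.",
  "Anna Muster spoke first, then Anna Muster left."
)

# Glue all texts into one long string
all_in_one <- glue_collapse(texts, sep = " ")

# str_count() is vectorised over the pattern: passing the whole vector
# of names returns one count per name, in the same order
counts <- str_count(all_in_one, names_vec)
names(counts) <- names_vec

counts  # Anna Muster: 3, Beat Beispiel: 1
```

Passing the vector itself, rather than the collapsed "or" pattern, is what yields a separate count for each name; the collapsed pattern would return a single total.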

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Construct a pattern that searches for all politicians
polit_pattern <- glue_collapse(___, sep = "___")

# Use the pattern to match all names in the column "text"
articles %<>%
  mutate(mentions = str_match_all(___, ___))

# Collapse all items of the column "text"
all_articles_in_one <- ___(articles$text)

# Pass the vector politicians to count all its elements
str_count(all_articles_in_one, ___)