String Manipulation with stringr in R

Exercise

Character classes

In regular expressions, a character class is a way of specifying "match one (and only one) of the following characters". In rebus, you specify the set of allowable characters with the function char_class().

This is another way to specify an alternate spelling, for example "gr, followed by either an a or an e, followed by a y":

library(stringr)
library(rebus)
x <- c("grey sky", "gray elephant")
str_view(x, pattern = "gr" %R% char_class("ae") %R% "y")

A negated character class matches "any single character that isn't one of the following", and in rebus is specified with negated_char_class().
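
As a rough illustration of the negated form, the sketch below builds a pattern for "gr, then any single character that is not a or e, then y". The vector spellings and the string "groy" are made up for this example, and the expected results in the comments assume rebus produces the plain regex shown.

```r
library(stringr)
library(rebus)

# Hypothetical test strings; "groy" is not a real word, it only
# exercises the negated class
spellings <- c("grey", "gray", "groy")

# "gr", then one character that is NOT a or e, then "y"
not_ae <- "gr" %R% negated_char_class("ae") %R% "y"

# The underlying regular expression
not_ae_regex <- as.character(not_ae)            # "gr[^ae]y"

# Only the spelling whose third letter is neither a nor e matches
not_ae_hits <- str_detect(spellings, not_ae)    # FALSE FALSE TRUE
```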

Unlike in other places in a regular expression, you don't need to escape characters that might otherwise have a special meaning inside character classes. If you want to match . you can include . directly, e.g. char_class("."). Matching a - is a bit trickier, because between two characters it denotes a range (as in a-z). If you need to match a literal -, just make sure it comes first in the character class.
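
A small sketch of both cases, using a made-up vector samples: a class containing only a literal dot, and a class with the hyphen placed first so it is read as a literal character rather than a range.

```r
library(stringr)
library(rebus)

# Hypothetical strings containing a dot, a hyphen, and neither
samples <- c("a.b", "a-b", "axb")

# Inside a character class the dot is literal, so no escaping is needed
dot_only <- char_class(".")
dot_hits <- str_detect(samples, dot_only)       # TRUE FALSE FALSE

# The hyphen comes first, so it matches a literal "-"
dash_or_dot <- char_class("-.")
dash_hits <- str_detect(samples, dash_or_dot)   # TRUE TRUE FALSE
```

Outside a character class, a bare . would match any character, so "axb" would match as well.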

Instructions

  • 1
    • Create a character class, vowels, that contains the vowels a, e, i, o, u and their upper-case versions.
    • Print vowels. In the regular expression language, a character class is written inside square brackets, [ and ].
    • View the matches to the pattern vowels in x with str_view(). Notice how only the first vowel is matched.
  • 2
    • View the matches to the pattern vowels in x with str_view_all(). Now all matches are highlighted.
  • 3
    • Find the number of vowels in each element of boy_names by combining str_count() with the vowels pattern, and store it as num_vowels.
    • Find the number of characters in each element of boy_names with str_length(), and store it as name_length.
  • 4
    • Find the average number of vowels per name in boy_names by taking the mean of num_vowels.
    • Look at the mean ratio of num_vowels to name_length.
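
The steps above can be sketched end to end as follows. This is only an illustration: the three-element boy_names vector is a hypothetical stand-in for the course's dataset, so the numbers in the comments apply to this toy input only.

```r
library(stringr)
library(rebus)

# Hypothetical stand-in for the course's boy_names vector
boy_names <- c("Liam", "Noah", "Oliver")

# Character class with lower- and upper-case vowels
vowels <- char_class("aeiouAEIOU")
vowels_regex <- as.character(vowels)                   # "[aeiouAEIOU]"

# Number of vowels and number of characters in each name
num_vowels <- str_count(boy_names, pattern = vowels)   # 2 2 3
name_length <- str_length(boy_names)                   # 4 4 6

# Average number of vowels per name
mean_vowels <- mean(num_vowels)                        # 7 / 3

# Average fraction of each name that is a vowel
mean_ratio <- mean(num_vowels / name_length)           # 0.5
```

To highlight the matches interactively, as in steps 1 and 2, pass the same vowels pattern to str_view() or str_view_all().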