Welcome

1. Welcome

Welcome to the course "Intermediate Regular Expressions in R". My name is Angelo Zehr and I am a data journalist from Switzerland. In this course I will show you a set of tools and procedures that will improve your workflows with all kinds of textual input.

2. Where you might have left off

Maybe you have already taken the course "String Manipulation in R with stringr". That's perfect. We'll pick up what you've learned in that course and gradually learn to tackle more advanced topics. But don't worry if you haven't taken that course. I'll always give you a quick summary of all the concepts before we use them.

3. From Rebus to writing custom expressions

One package that we will leave behind is the "rebus" package, which was excellently covered in the "stringr" course. Let's look at an example. Instead of constructing a regular expression using "start, special R operator, c" (START %R% "c" in rebus), we will now write the pattern as "caret c", that is, "^c". This syntax might look a bit confusing at first, but don't let that scare you. You will quickly learn how to handle these special characters. Once you understand the key components of regular expressions, you'll never want to be without them again.

4. Prerequisites: stringr

In this chapter we'll be working with two functions from the stringr package. The first is "string detect" (str_detect). It returns TRUE if the pattern was found in the string. As in all functions of the stringr package, we pass the string as the first argument and the pattern as the second. The same goes for the second function we'll use: "string match" (str_match). If the pattern matches, it returns the first occurrence of the pattern. If it doesn't, it returns NA.
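As a minimal sketch of these two functions, here is how they behave with the "^c" pattern from the previous slide. The input words are made up for illustration and are not part of the course data.

```r
library(stringr)

# Made-up example input (an assumption, not course data)
words <- c("cat", "dog")

# The string comes first, the pattern second
detected <- str_detect(words, "^c")   # TRUE for "cat", FALSE for "dog"
matched  <- str_match(words, "^c")    # character matrix, one row per input string

print(detected)
print(matched)   # first row holds "c", second row holds NA (no match)
```

Note that str_match returns a matrix rather than a plain vector. Its first column holds the full match.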

5. What regular expressions will help you achieve

I'm very sure you've all used Command plus "F" or Control plus "F" to search for certain things in your documents. In this example, I've searched "82%" in a text document.

6. What regular expressions will help you achieve

But what if I wanted to find not only "82%" but all numbers followed by "%"? Regular expressions will help you achieve things like that and much more. They are a bit like Control F on steroids.
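To give a rough idea of what such a search could look like, here is a small preview. It assumes a made-up sentence and uses a digit pattern and the stringr function str_extract_all, neither of which has been introduced at this point, so treat it as a sketch rather than the course's own solution.

```r
library(stringr)

# Made-up text (an assumption for illustration)
text <- "Turnout was 82% in the city and 79% in the countryside."

# "\\d+" means one or more digits; the "%" is matched literally
percentages <- str_extract_all(text, "\\d+%")[[1]]

print(percentages)   # "82%" "79%"
```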

7. Our first dataset

In this chapter we'll use text input that contains information about movies. Let's say we are looking for a movie whose name we forgot, but we remember that it started with "K". With a regular expression, we can filter our list of movie titles down to those that start with "K". Maybe you remember from the first slide that in regular expressions the caret "^" is used to search at the beginning of a string. So the pattern we use for our search is "caret K", written "^K". This will give us a list of movies starting with that letter.
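The filtering step could be sketched like this. The vector name movie_titles and the titles in it are placeholders, since the actual dataset is not shown in this transcript.

```r
library(stringr)

# Placeholder titles (an assumption, not the course dataset)
movie_titles <- c("Karate Kid", "Shrek Forever After", "Knight and Day", "Black Swan")

# Keep only the titles where "K" appears right at the start of the string
starts_with_k <- movie_titles[str_detect(movie_titles, "^K")]

print(starts_with_k)   # "Karate Kid" "Knight and Day"
```

"Black Swan" contains a "k" further in, but the caret ties the match to the first position. The match is also case sensitive.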

8. Special characters in regular expressions

Regular expressions consist largely of special characters. That makes them look pretty weird at first, but it also makes them short and concise. The first four special characters we'll work with are these: The caret, which marks the beginning of a line or string. The dollar sign, which marks the end. The period, which is like a joker, a wild card, that matches any single character, from letters to numbers and white space. And the backslash: when we want to search for an actual period, like at the end of a sentence, we need to "escape" that period. In R we do that with two leading backslashes, because a backslash inside an R string has to be escaped itself.
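The double backslash can be checked directly in R. The following is a small sketch with made-up sentences. It shows that the R string "\\." contains only two characters, a backslash and a period, and that the unescaped period behaves as a wild card.

```r
library(stringr)

escaped_period <- "\\."

writeLines(escaped_period)            # prints \.
pattern_length <- nchar(escaped_period)   # 2: one backslash, one period

# Made-up sentences (an assumption for illustration)
with_period    <- str_detect("The end.", escaped_period)  # TRUE
without_period <- str_detect("The end", escaped_period)   # FALSE

# The unescaped period is a wild card, so it matches any character
wildcard       <- str_detect("The end", ".")              # TRUE
```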

9. For example

Let's look at some examples of these characters. Let's, for example, combine the caret and the dot and apply this pattern to the word "Book". It will match the first character at the beginning of the string. In this case, that is the letter "B". The dot and the dollar sign will instead match the very last character. Here, that's the letter "k". If we tried to match an actual period by adding the two backslashes, we would match nothing, as there is no period in the word "Book". As a test, let's add a period at the end and try that pattern again. There you go. Now it finds that period.
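The examples just described can be reproduced roughly as follows. The variable names are chosen here for illustration. The slide itself may use different calls.

```r
library(stringr)

first_char  <- str_match("Book", "^.")    # "B": any character at the start
last_char   <- str_match("Book", ".$")    # "k": any character at the end
no_period   <- str_match("Book", "\\.")   # NA: there is no literal period
with_period <- str_match("Book.", "\\.")  # ".": now the period is found

print(first_char)
print(last_char)
print(no_period)
print(with_period)
```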

10. Let's practice!

You think that's still all a bit cryptic? Well, then let's practice! By the end of this course, you'll master all these special characters, I promise.
