
Regex metacharacters

1. Regex metacharacters

Now, we will look at some other metacharacters that are very useful.

2. Looking for patterns

Let's first look at two functions of the re module: search and match. As you can see, they have the same syntax and are used to find a match. In the first example, we use both to find a digit appearing four times. Both return an object with the match found. The difference is that match is anchored at the beginning of the string. In the second example, we use them to find a single digit. In this case, search finds a match, but match does not. This is because the first characters of the string do not match the regex.
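To make the difference concrete, here is a minimal sketch. The two sample strings and the variable names are made up for illustration and are not the ones from the slide.

```python
import re

# Hypothetical sample strings, chosen only for illustration.
text_start = "4506 people attend the show"
text_middle = "Yesterday, I saw 3 shows"

# Four digits at the very start of the string: both functions succeed.
search_four = re.search(r"\d{4}", text_start)
match_four = re.match(r"\d{4}", text_start)
print(search_four.group(), match_four.group())

# A single digit in the middle of the string:
# search scans the whole string, match only tries the beginning.
search_one = re.search(r"\d", text_middle)
match_one = re.match(r"\d", text_middle)
print(search_one.group(), match_one)
```

When nothing matches, both functions return None, so it is worth checking the result before calling group on it.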

3. Special characters

The dot metacharacter matches any character. In the example code, we need to match links in the string. We know a link starts with www and ends with com. So we first write this in our regex.

4. Special characters

We don't know how many characters are in between. So we indicate that we want any character, a dot, one or more times, by adding the plus. We can see in the output that we get our match.
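The link pattern could look like the sketch below. The sample sentence is an assumption made for illustration, not the string from the slide.

```python
import re

# Hypothetical sample string containing one link.
text = "My website is www.example.com and I like it"

# "www", then any character one or more times, then "com".
links = re.findall(r"www.+com", text)
print(links)
```

Keep in mind that the plus is greedy: if the string contained a second com further along, the match would stretch up to that last one.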

5. Special characters

The circumflex anchors the regex to the start of a string. In the example, we find the pattern starting with t, followed by h, e, whitespace, two digits and ending with s. The method finds the following two matches. Now, we add the anchor metacharacter. We receive only one match. The one that appears at the beginning of the string.
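Here is a small sketch of the pattern with and without the start anchor. The sample sentence is made up for illustration.

```python
import re

# Hypothetical sample string with the pattern at the start and at the end.
text = "the 80s music was much better that the 90s"

# Without an anchor: every occurrence of "the", whitespace, two digits, "s".
all_matches = re.findall(r"the\s\d{2}s", text)
print(all_matches)

# With the circumflex: only an occurrence at the start of the string.
start_matches = re.findall(r"^the\s\d{2}s", text)
print(start_matches)
```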

6. Special characters

On the contrary, the dollar sign anchors the regex to the end of the string. If we use it in the previous example, we get the match that appears at the end of the string.
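The end anchor can be sketched the same way. Again, the sample sentence is an assumption for illustration.

```python
import re

# Hypothetical sample string with the pattern at the start and at the end.
text = "the 80s music was much better that the 90s"

# The dollar sign keeps only an occurrence that finishes the string.
end_matches = re.findall(r"the\s\d{2}s$", text)
print(end_matches)
```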

7. Special characters

What if I want to use characters like the dollar sign or the dot, which also have other meanings? Let's look at an example. We want to split the string by a dot followed by whitespace. We write the following regex. And we get an output that is not what we expected. Why? Because the regex interprets the dot as any character. To solve this, we need to escape the character by adding a backslash in front of the dot. Now we get the correct output.
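A minimal sketch of the problem and the fix, using a made-up sentence rather than the one from the slide:

```python
import re

# Hypothetical sample string with two sentences.
text = "I love the music of Mr.Go. However, the sound was too loud."

# Unescaped dot: splits on ANY character followed by whitespace,
# so the words lose their last letter and the string breaks into many pieces.
unescaped = re.split(r".\s", text)
print(unescaped)

# Escaped dot: splits only on a literal dot followed by whitespace.
escaped = re.split(r"\.\s", text)
print(escaped)
```

Writing the pattern as a raw string (the r prefix) keeps Python from interpreting the backslash before the regex engine sees it.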

8. OR operator

In the example code, we want to match the word elephant. However, we see that it's written with either a capital E or a lowercase e. In that case, we use the vertical bar. In this way, we indicate that we want to match one variant OR the other, and we obtain both elephant matches.
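As a sketch, the vertical bar could be used like this. The sample sentence is an assumption for illustration.

```python
import re

# Hypothetical sample string with both spellings.
text = "Elephants are the world's largest land animal! I would love to see an elephant one day"

# Match one variant OR the other.
elephants = re.findall(r"Elephant|elephant", text)
print(elephants)
```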

9. OR operator

Square brackets also act as an OR operator. Inside them, we list the alternative characters to match. Look at the example. We want to find a pattern that contains one or more lowercase or uppercase letters followed by a digit. To do so, we can use the square brackets. Inside them, we will use lowercase a dash lowercase z to specify any lowercase letter. Then, uppercase a dash uppercase z to indicate any uppercase letter. Then the plus. Then backslash d. Thus, we get the following matches.
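Put together, the pattern described above might look like this sketch. The sample sentence is made up for illustration.

```python
import re

# Hypothetical sample string with two names that end in a digit.
text = "Yesterday I spent my afternoon with my friends: MaryJohn2 Clary3"

# One or more lowercase or uppercase letters, followed by a single digit.
names = re.findall(r"[a-zA-Z]+\d", text)
print(names)
```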

10. OR operator

In the following string, we want to replace the non-word characters with whitespace. We list the alternative characters inside the square brackets. The engine searches for any one of them. When it finds a match, it replaces the character with a whitespace, giving the following output.
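A short sketch of this replacement with re.sub follows. Both the sample string and the exact set of symbols in the brackets are assumptions for illustration.

```python
import re

# Hypothetical sample string where symbols stand in for spaces.
text = "My&name&is#John Smith. I%live$in#London."

# Any ONE of the listed symbols is replaced by a single space.
cleaned = re.sub(r"[#$%&]", " ", text)
print(cleaned)
```

Inside square brackets, the dollar sign is treated as a literal character, so it does not need to be escaped here.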

11. OR operator

The circumflex placed right after the opening square bracket negates the set, so it matches any character that is not listed. In the example, we add the circumflex to specify we want the links that do not contain any number. And we get the following output.
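Here is a minimal sketch of a negated set. The sample string is made up for illustration.

```python
import re

# Hypothetical sample string with one link that contains digits and one that does not.
text = "Bad site: www.99.com. Good site: www.hola.com"

# "www", then one or more characters that are NOT digits, then "com".
clean_links = re.findall(r"www[^0-9]+com", text)
print(clean_links)
```

Note that the circumflex only negates when it is the first character inside the brackets. Outside the brackets it is still the start-of-string anchor.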

12. Let's practice!

It's your turn to practice metacharacters!