Capturing groups

1. Grouping and capturing

In the last stop of our journey, we are going to talk about some advanced concepts of regex. More specifically, in this video, we'll talk about capturing groups.

2. Group characters

Let's say that we have the following text.

3. Group characters

We want to extract, for each person, how many relationships they have and of which type. So, we want to extract Clary 2 friends, Susan 3 brothers, and John 4 sisters, as you can see in the slide. We know the structure of the sentences, so let's try a first approach. We would write something like the regex in the code: any upper or lowercase letter, whitespace, any word character, whitespace, a number, whitespace, and any word character. Let's see the output. Quite close, but we don't want the word has.
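The slide's code is not part of this transcript, so here is a minimal sketch of what that first attempt might look like. The sample text and the variable names are assumptions, chosen to match the output described above.

```python
import re

# Assumed sample text; the slide's actual text is not in the transcript.
text = ("Clary has 2 friends who she spends a lot of time with. "
        "Susan has 3 brothers while John has 4 sisters.")

# Letters, whitespace, word characters, whitespace, digits, whitespace, word characters.
first_attempt = re.findall(r"[A-Za-z]+\s\w+\s\d+\s\w+", text)
print(first_attempt)
# ['Clary has 2 friends', 'Susan has 3 brothers', 'John has 4 sisters']
```

Each match still contains the word has, because nothing in the pattern tells findall which parts to keep.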

4. Capturing groups

What can we do about this? We start simple by trying to extract only the names. We can place parentheses around those characters to group them, as shown in the slide, capture them, and retrieve only that group.

5. Capturing groups

In the code, we have now added parentheses to group our first part of the regex. We can observe in the output that the group was captured. And only the three names were retrieved.
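As a rough sketch of this step, wrapping the first part of the pattern in parentheses makes findall return only that group. The sample text is an assumption, since the slide's code is not in the transcript.

```python
import re

# Assumed sample text; the slide's actual text is not in the transcript.
text = ("Clary has 2 friends who she spends a lot of time with. "
        "Susan has 3 brothers while John has 4 sisters.")

# With exactly one capturing group, findall returns the text of that group only.
names = re.findall(r"([A-Za-z]+)\s\w+\s\d+\s\w+", text)
print(names)
# ['Clary', 'Susan', 'John']
```

The rest of the pattern still has to match; it just is not returned.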

6. Capturing groups

Let's look at the example again. We can place parentheses around the three groups that we want to capture, as shown in the slide. Each group receives a number. The entire expression is always group zero. The first group is number one, the second is number two, and the third is number three. We'll see how to use these numbers later.
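To make the numbering concrete, here is a small sketch that uses re.search and the match object's group method. The sentence is an assumed example, not the slide's text.

```python
import re

# Assumed example sentence, for illustration only.
sentence = "Susan has 3 brothers"

match = re.search(r"([A-Za-z]+)\s\w+\s(\d+)\s(\w+)", sentence)

print(match.group(0))  # Susan has 3 brothers  (the entire match)
print(match.group(1))  # Susan
print(match.group(2))  # 3
print(match.group(3))  # brothers
```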

7. Capturing groups

Let's see this in the code example. We add parentheses to group each of the three parts of the regex. In the output, we get a list of tuples. The first element of each tuple is the text captured by group one, the second by group two, and the last by group three.
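A sketch of the three-group version might look like the following. As before, the sample text is an assumption that stands in for the slide.

```python
import re

# Assumed sample text; the slide's actual text is not in the transcript.
text = ("Clary has 2 friends who she spends a lot of time with. "
        "Susan has 3 brothers while John has 4 sisters.")

# With several capturing groups, findall returns one tuple per match,
# holding group one, group two, and group three in that order.
relationships = re.findall(r"([A-Za-z]+)\s\w+\s(\d+)\s(\w+)", text)
print(relationships)
# [('Clary', '2', 'friends'), ('Susan', '3', 'brothers'), ('John', '4', 'sisters')]
```

Note that the numbers come back as strings, not integers.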

8. Capturing groups

As we already discussed, we can use capturing groups to match a specific subpattern within a pattern. We can then retrieve the groups by number, as we'll learn in the next videos.

9. Capturing groups

But we can also use capturing groups to organize data. As you saw earlier, findall returns the matches as a list, and with several groups each match is a tuple. In the code, we placed parentheses to capture the name of the owner and the number and type of pets each one has. We can access the retrieved information by indexing and slicing, as seen in the code. You learned this in the first videos.
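Here is a minimal sketch of that idea. The sentence about pets and the variable names are assumptions, since the slide's code is not in the transcript.

```python
import re

# Assumed example sentence, for illustration only.
sentence = "Clary has 2 dogs but John has 3 cats"

pets = re.findall(r"([A-Za-z]+)\s\w+\s(\d+)\s(\w+)", sentence)
print(pets)        # [('Clary', '2', 'dogs'), ('John', '3', 'cats')]

# Indexing: first match, then the first group inside it.
print(pets[0][0])  # Clary

# Slicing: number and type of pets for the second owner.
print(pets[1][1:])  # ('3', 'cats')
```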

10. Capturing groups

But capturing groups have one more important feature. Remember that a quantifier applies to the character immediately to its left. So, we can place parentheses to group characters and then apply the quantifier to the entire group. In the code, we have placed parentheses around a group containing a number and any letter. We applied the plus quantifier to specify that we want this group repeated one or more times. And we get the match shown in the output.
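As a sketch of applying a quantifier to a whole group, the snippet below repeats a digit-plus-letter pair. The input string is an assumed example.

```python
import re

# Assumed example string, for illustration only.
sentence = "My user name is 3e4r5fg"

# The plus applies to the whole group: a digit followed by a letter, one or more times.
repeated = re.search(r"(\d[A-Za-z])+", sentence)
print(repeated.group(0))
# 3e4r5f
```

Without the parentheses, the plus would apply only to the letter class, so the pattern would match a single digit followed by one or more letters.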

11. Capturing groups

But be careful. Repeating a capturing group is not the same as capturing a repeated pattern. In the first code, we use findall with a capturing group containing a single number, and we ask for this group to be repeated one or more times. We get 5 and 3 as the output, because a repeated capturing group keeps only the last repetition it matched, so only the final digit of each number is returned. In the second code, the quantifier is inside the parentheses, so the group captures one or more consecutive numbers as a whole. We now get the following output.
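The difference is easiest to see side by side. This is a sketch with an assumed input string, picked so that the first pattern returns 5 and 3 as described.

```python
import re

# Assumed example string, for illustration only.
sentence = "My lucky numbers are 8755 and 33"

# Quantifier outside the group: the group is repeated,
# and only its last repetition is kept for each match.
last_digits = re.findall(r"(\d)+", sentence)
print(last_digits)    # ['5', '3']

# Quantifier inside the group: the group captures the whole run of digits.
whole_numbers = re.findall(r"(\d+)", sentence)
print(whole_numbers)  # ['8755', '33']
```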

12. Let's practice!

Let's go ahead and practice capturing groups.