Backreferences
1. Backreferences
We will now discuss how we can backreference capturing groups.
2. Numbered groups
Imagine we come across this text, and we want to extract the highlighted date, but only its numbers. As we learned, we can place parentheses in the regex to capture each of those numbers as a group.
3. Numbered groups
We have also seen that each of these groups receives a number. The whole expression is group zero. The first group is group one, and so on, as shown in the slide.
4. Numbered groups
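The group numbering just described can be sketched as follows. This is only an illustration: the slide's actual text is not part of the transcript, so the sample string and the date pattern here are assumptions.

```python
import re

# Hypothetical sample text (the slide's real string is not in the transcript).
text = "The meeting is scheduled for 12/05/2021 at noon."

# Three pairs of parentheses create groups 1, 2 and 3, one per number in the date.
match = re.search(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)

whole_date = match.group(0)    # group 0 is the entire match: "12/05/2021"
first_number = match.group(1)  # "12"
third_number = match.group(3)  # "2021"
all_numbers = match.groups()   # every numbered group as a tuple

print(whole_date, first_number, third_number, all_numbers)
```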
Let's use dot search to match the pattern to the text. To retrieve the captured groups, we can use the method dot group, specifying the number of the group we want. For example, three. The method retrieves the text matched by group number three, as shown in the output. We can also retrieve group zero. It will output the entire match. Dot group is a method of match objects, so it works on the results of dot search and dot match, but not on the list returned by dot findall.
5. Named groups
We can also give names to our capturing groups. Inside the parentheses, we write question mark, capital P, and the name inside angle brackets, that is (?P<name>...), as shown in the slide.
6. Named groups
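Here is a minimal sketch of the named-group syntax. The sample string and the exact pattern are assumptions made for illustration, not the code from the slide.

```python
import re

# Assumed sample string: a city name followed by a five-digit zip code.
text = "Austin, 78701"

# (?P<city>...) and (?P<zipcode>...) give each capturing group a name.
match = re.search(r"(?P<city>[A-Za-z]+).*?(?P<zipcode>\d{5})", text)

city = match.group("city")        # look the group up by name
zipcode = match.group("zipcode")
by_number = match.group(1)        # named groups still keep their number
as_dict = match.groupdict()       # all named groups as a dictionary

print(city, zipcode, by_number, as_dict)
```

Named groups are still numbered, so both ways of retrieving the match stay available.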
Let's say we have the following string. We want to match the name of the city and the zip code in different groups. We can use capturing groups and assign them the names city and zipcode, as shown in the code. We retrieve the information by using dot group, this time indicating the name of the group. For example, specifying city gives us the output Austin. Specifying zipcode gives us the matching number, as shown.
7. Backreferences
There is another way to backreference groups. In fact, the matched group can be reused inside the same regex, or outside it for substitution. We can do this using a backslash and the number of the group (for example, \1), as you can see in the slide.
8. Backreferences
Let's see an example. We have the following string. We want to find all matches of repeated words. In the code, we specify that we want to capture a sequence of word characters. Then a whitespace.
9. Backreferences
Then we write backslash one. This indicates that we want to match the first captured group again. In other words, it says: match the same sequence of characters that the group captured, once more. And we get the word happy as an output. This was the repeated word in our string.
10. Backreferences
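The repeated-word search described above might look like this. The sentence is an assumed example, since the slide's string does not appear in the transcript.

```python
import re

# Assumed sample sentence containing one repeated word.
sentence = "I wish you a happy happy birthday!"

# (\w+) captures a run of word characters, \s matches one whitespace,
# and \1 requires the exact text captured by group 1 to appear again.
repeated = re.findall(r"(\w+)\s\1", sentence)

print(repeated)  # ['happy']
```

With a single capturing group in the pattern, findall returns the text of that group rather than the whole match, which is why only one copy of the word comes back.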
Now, we will replace the repeated word with one occurrence of the same word. In the code, we use the same regex as before. This time, we use the dot sub method. In the replacement part, we can also reference back to the captured group. We write r, then backslash one inside quotes (r"\1"). This says: replace the entire match with the first captured group. In the output string, we have only one occurrence of the word happy.
11. Backreferences
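The substitution just described can be sketched as below, again with an assumed sample sentence.

```python
import re

# Assumed sample sentence containing one repeated word.
sentence = "I wish you a happy happy birthday!"

# The pattern matches "happy happy"; the replacement r"\1" puts back
# only what group 1 captured, so a single "happy" remains.
cleaned = re.sub(r"(\w+)\s\1", r"\1", sentence)

print(cleaned)  # I wish you a happy birthday!
```

The raw-string prefix r matters here: without it, Python would read "\1" as an escape sequence instead of passing the backslash through to the regex engine.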
We can also use named groups for backreferencing. To do this, we write question mark, capital P, equal sign, and the group name, that is (?P=name). In the code, we want to find a number that appears more than once. We use a capturing group and name it code. Later in the regex, we reference back to this group. And we obtain the number as an output.
12. Backreferences
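A possible version of the named backreference just described is shown here. The sentence and the five-digit pattern are assumptions for illustration.

```python
import re

# Assumed sample sentence in which the same number appears twice.
sentence = "Your new code number is 23434. Please, enter 23434 to open the door."

# (?P<code>\d{5}) captures five digits under the name "code";
# (?P=code) later requires those same five digits to appear again.
codes = re.findall(r"(?P<code>\d{5}).*?(?P=code)", sentence)

print(codes)  # ['23434']
```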
On the other hand, to reference the group back in a replacement, we need to use backslash g and the group name inside angle brackets, that is \g<name>. In the code, we want to replace repeated words with one occurrence of the same word. Inside the regex, we use the previous syntax. In the replacement field, we need to use this new syntax, as seen in the code, to get the following output.
13. Let's practice!
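As a warm-up, the named replacement syntax from the last slide can be sketched like this. The sentence is an assumed example and the group name word is only illustrative.

```python
import re

# Assumed sample sentence containing one repeated word.
sentence = "I wish you a happy happy birthday!"

# Inside the pattern, (?P=word) refers back to the named group.
# Inside the replacement, the same group is written as \g<word>.
cleaned_named = re.sub(r"(?P<word>\w+)\s(?P=word)", r"\g<word>", sentence)

print(cleaned_named)  # I wish you a happy birthday!
```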
Let's go ahead and practice how to backreference groups!