Advanced Regular Expressions for Text Preprocessing

Learn about groupings, anchors, lookarounds, modifiers, and backreferences.

Groupings

We use groupings to create subpatterns within a larger pattern so that we can match, capture, and backreference text. Regex has two types of groupings. Capturing groups, written with plain parentheses (...), save the text matched by the subpattern so that we can reference it later in the pattern or in a replacement string. Non-capturing groups, written as (?:...), group a subpattern without storing its match for later use.
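
To make the distinction concrete, here is a minimal sketch using Python's built-in re module. The sample strings and variable names are made up for illustration. It shows a backreference inside a pattern, the same backreference in a replacement string, and a non-capturing group that groups alternatives without storing them.

```python
import re

sentence = "this is is a test test sentence"

# (\w+) is a capturing group; \1 refers back to whatever it matched,
# so this pattern finds a word that is immediately repeated.
repeated_word = re.compile(r"\b(\w+)\s+\1\b")
doubled = [m.group(1) for m in repeated_word.finditer(sentence)]

# In a replacement string, \1 inserts the captured word once.
deduplicated = repeated_word.sub(r"\1", sentence)

# (?:Mr|Ms|Dr) groups the alternatives but stores nothing,
# so the surname is the only captured group.
titled_name = re.compile(r"(?:Mr|Ms|Dr)\. (\w+)")
surnames = titled_name.findall("Dr. Smith met Ms. Jones")

print(doubled)       # ['is', 'test']
print(deduplicated)  # this is a test sentence
print(surnames)      # ['Smith', 'Jones']
```

Because the title group is non-capturing, findall returns only the surnames. Changing (?:Mr|Ms|Dr) to (Mr|Ms|Dr) would make it return (title, surname) tuples instead.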

In the following code example, we’ll define regular expressions to search for a cat name and a dog name, find matches for colors in the text, and print the results of the capturing and non-capturing groups.
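
The lesson's original listing is not included here, so the block below is only a sketch of what such an example might look like, written in Python with the re module. The sample text, the pet names, and the list of colors are all assumptions chosen for illustration.

```python
import re

# Hypothetical sample text; not taken from the original lesson.
text = "My cat Whiskers is gray and my dog Rex is brown."

# Capturing groups: the parentheses save the name that follows "cat" or "dog".
cat_pattern = re.compile(r"cat (\w+)")
dog_pattern = re.compile(r"dog (\w+)")

# Non-capturing group: the alternatives are grouped but not stored,
# so findall returns the whole match for each color.
color_pattern = re.compile(r"\b(?:gray|brown|black|white)\b")

cat_name = cat_pattern.search(text).group(1)
dog_name = dog_pattern.search(text).group(1)
colors = color_pattern.findall(text)

print("Cat name:", cat_name)  # Cat name: Whiskers
print("Dog name:", dog_name)  # Dog name: Rex
print("Colors:", colors)      # Colors: ['gray', 'brown']
```

The name patterns need capturing groups because we want only the name, not the word "cat" or "dog" in front of it. The color pattern uses a non-capturing group because the parentheses are there only to limit the scope of the | alternation.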
