Python regex `match` function

Python Regex match() function explained with examples.

The Match Function

The match function attempts to match a re pattern to string with optional flags.

Here is the syntax for this function −

re.match(pattern, string, flags=0)

Where, pattern is the regular expression to be matched, string is the string to be searched to match the pattern at the beginning of string and flags, which you can specify different flags using bitwise OR (|).

Match Flags

Modifier Description
re.I Performs case-insensitive matching.
re.L Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B).
re.M Makes $ match the end of a line and makes ^ match the start of any line.
re.S Makes a period (dot) match any character, including a newline.
re.U Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B.
re.X It ignores whitespace (except inside a set [] or when escaped by a backslash and treats unescaped # as a comment marker.

Return values

  • The re.match function returns a match object on success and None upon failure. -
  • Use group(n) or groups() function of match object to get matched expression, e.g., group(n=0) returns entire match (or specific subgroup n)
  • The function groups() returns all matching subgroups in a tuple (empty if there weren’t any).

Example 1

Let’s find the words before and after the word to:

#!/usr/bin/python
import re
line = "Learn to Analyze Data with Scientific Python";
m = re.match( r'(.*) to (.*?) .*', line, re.M|re.I)
if m:
print "m.group() : ", m.group()
print "m.group(1) : ", m.group(1)
print "m.group(2) : ", m.group(2)
else:
print "No match!!"

The first group (.*) identified the string: Learn and the next group (*.?) identified the string: Analyze.

Example 2

groups([default]) returns a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.

#!/usr/bin/python
import re
line = "Learn Data, Python";
m = re.match( r'(\w+) (\w+)', line, re.M|re.I)
if m:
print "m.group() : ", m.groups()
print "m.group (1,2)", m.group(1, 2)
else:
print "No match!!"

Example 3

groupdict([default]) returns a dictionary containing all the named subgroups of the match, keyed by the subgroup name.

#!/usr/bin/python
import re
number = "124.13";
m = re.match( r'(?P<Expotent>\d+)\.(?P<Fraction>\d+)', number)
if m:
print "m.groupdict() : ", m.groupdict()
else:
print "No match!!"

Example 4

Start, end. How can we match the start or end of a string? We can use the “A” and “Z” metacharacters. We precede them with a backslash. We match strings that start with a certain letter, and those that end with another.

import re
values = ["Learn", "Live", "Python"];
for value in values:
# Match the start of a string.
result = re.match("\AL.+", value)
if result:
print("START MATCH [L]:", value)
# Match the end of a string.
result2 = re.match(".+n\Z", value)
if result2:
print("END MATCH [n]:", value)

Example 5

start([group]) and end([group]) return the indices of the start and end of the substring matched by group. See the next lesson for an example.