
Python Lookahead

Explore how to apply positive and negative lookahead assertions in Python regular expressions. Understand how lookahead matches a position without consuming characters, enabling you to find patterns based on what follows specific text. This lesson teaches you practical techniques for precise pattern matching useful in data scraping and text analysis projects.

Positive Lookahead

A Python positive lookahead, written (?=...), succeeds at a position where the pattern inside the lookahead can be matched immediately ahead. It matches only the position: it does not consume any characters or extend the match.

Example

Consider the following string:

begin:learner1:scientific:learner2:scientific:learner3:end

A positive lookahead assertion can help us find all words that are immediately followed by :scientific.

Python
import re
string = "begin:learner1:scientific:learner2:scientific:learner3:end"
# Capture each word that is immediately followed by ":scientific".
print(re.findall(r"(\w+)(?=:scientific)", string))  # ['learner1', 'learner2']

Note that the output contains learner1 and learner2, but not learner3, which is followed by :end rather than :scientific.
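To see that the lookahead checks the following text without consuming it, the sketch below compares the span of a match made with a lookahead against one made without it. The variable names are illustrative only.

```python
import re

text = "begin:learner1:scientific:learner2:scientific:learner3:end"

# With a lookahead, ":scientific" is only checked, so the match ends after the word.
with_lookahead = re.search(r"\w+(?=:scientific)", text)

# Without a lookahead, ":scientific" becomes part of the matched text.
without_lookahead = re.search(r"\w+:scientific", text)

print(with_lookahead.group(), with_lookahead.span())        # learner1 (6, 14)
print(without_lookahead.group(), without_lookahead.span())  # learner1:scientific (6, 25)
```

Both searches start at the same position, but only the second one advances past :scientific.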

Negative Lookahead

A negative lookahead, written (?!...), works like a positive lookahead, except that it succeeds only if the regex inside the lookahead fails to match at that position.

Example

Let's now proceed to an example, where we find the word (learner3) that is followed by end rather than scientific.

Python
import re
string = "begin:learner1:scientific:learner2:scientific:learner3:end"
# Capture each learner word that is NOT immediately followed by ":scientific".
print(re.findall(r"(learner\d+)(?!:scientific)", string))  # ['learner3']

This matched all the learner words not followed by the word scientific! In this string, the only such word is learner3.
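One caveat worth checking with negative lookahead: if a learner number has more than one digit, \d+ can backtrack to a shorter number so that the lookahead succeeds. The sketch below uses a hypothetical string with learner12 to show this, and adds a word boundary (\b) as one possible way to prevent it.

```python
import re

# Hypothetical input with a two-digit learner number.
sample = "learner12:scientific:learner3:end"

# \d+ gives back the "2", so "learner1" is followed by "2:scientific" and passes.
naive = re.findall(r"(learner\d+)(?!:scientific)", sample)

# \b forces the number to end at a word boundary before the lookahead is tested.
strict = re.findall(r"(learner\d+)\b(?!:scientific)", sample)

print(naive)   # ['learner1', 'learner3']
print(strict)  # ['learner3']
```

With single-digit numbers, as in the lesson's string, both patterns give the same result.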