
Solution: Rule-Based Matchmaking

Learn how to use spaCy's rule-based Matcher to define token patterns and extract the matching spans from text. The solution walks through loading a model, defining a pattern, running the matcher over a processed document, and retrieving the matched spans.

Solution

The solution to the previous exercise is given below:

Python 3.5
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_md")
pattern = [{"IS_DIGIT": True, "LENGTH": 4}, {"TEXT": "-"}, {"IS_DIGIT": True, "LENGTH": 2}, {"TEXT": "-"}, {"IS_DIGIT": True, "LENGTH": 2}]
# Create a spaCy Matcher object and add the pattern to it
matcher = Matcher(nlp.vocab)
matcher.add("DATE_PATTERN", [pattern])
doc = nlp("I have a meeting on 2022-02-15 and another meeting on 2022-02-20")
# Match the defined pattern against the processed document and print out the matched spans
matches = matcher(doc)
for match_id, start, end in matches:
    matched_span = doc[start:end]
    print(matched_span.text)
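
As a side note, it can help to see why this five-token pattern lines up with a date such as 2022-02-15. The sketch below is a simplified, pure-Python imitation of the matching idea and is not how spaCy implements the Matcher. The helpers `tokenize`, `token_matches`, and `find_matches` are hypothetical names introduced only for this illustration. The regex tokenizer is a rough stand-in that, like spaCy's English tokenizer on this sentence, separates the digits from the hyphens. Only the three attributes used in the solution (`IS_DIGIT`, `LENGTH`, `TEXT`) are handled.

```python
import re


def tokenize(text):
    # Rough stand-in for a tokenizer (an assumption, not spaCy's rules):
    # runs of digits, runs of ASCII letters, or single punctuation marks.
    return re.findall(r"\d+|[A-Za-z]+|[^\w\s]", text)


def token_matches(token, spec):
    # Check one token string against one pattern dict.
    for key, expected in spec.items():
        if key == "IS_DIGIT":
            actual = token.isdigit()
        elif key == "LENGTH":
            actual = len(token)
        elif key == "TEXT":
            actual = token
        else:
            raise ValueError(f"unsupported attribute: {key}")
        if actual != expected:
            return False
    return True


def find_matches(tokens, pattern):
    # Slide a window of len(pattern) tokens over the text and keep the
    # (start, end) index pairs where every token satisfies its pattern dict.
    n = len(pattern)
    return [
        (start, start + n)
        for start in range(len(tokens) - n + 1)
        if all(
            token_matches(tok, spec)
            for tok, spec in zip(tokens[start:start + n], pattern)
        )
    ]


pattern = [
    {"IS_DIGIT": True, "LENGTH": 4},
    {"TEXT": "-"},
    {"IS_DIGIT": True, "LENGTH": 2},
    {"TEXT": "-"},
    {"IS_DIGIT": True, "LENGTH": 2},
]

text = "I have a meeting on 2022-02-15 and another meeting on 2022-02-20"
tokens = tokenize(text)
spans = ["".join(tokens[start:end]) for start, end in find_matches(tokens, pattern)]
print(spans)  # ['2022-02-15', '2022-02-20']
```

Each dict in the pattern describes exactly one token, so the date has to be split into five tokens (year, hyphen, month, hyphen, day) for the pattern to match. The real Matcher supports many more token attributes as well as quantifier operators, which this sketch leaves out.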

Explanation

Lines 1 and 2: We import spaCy and the Matcher class from spacy.matcher.

Line 4: ...