Solution: Core Operations with spaCy

Explore how to use spaCy for essential NLP tasks by learning to tokenize text, extract lemmas, and segment sentences through custom Python functions that process and analyze language data effectively.

Solution

The solution to the previous exercise is given below:

Python 3.5
import spacy
nlp = spacy.load("en_core_web_md")
def spacy_tokenizer(text):
    doc = nlp(text)
    tokens = [{"text": token.text, "lemma": token.lemma_} for token in doc]
    return tokens
def spacy_sentencer(text):
    doc = nlp(text)
    sentences = [sent.text for sent in doc.sents]
    return sentences
def spacy_analyzer(text):
    tokens = spacy_tokenizer(text)
    sentences = spacy_sentencer(text)
    return {"tokens": tokens, "sentences": sentences}
text = "I went for working in Europe. I worked for 3 years in a software company."
result = spacy_analyzer(text)
print("Tokens:", result["tokens"])
print("Sentences:", result["sentences"])
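
In the solution, spacy_tokenizer and spacy_sentencer each call nlp(text), so spacy_analyzer runs the pipeline twice on the same input. The sketch below is one possible variant, not part of the exercise, that parses the text once and reads both the tokens and the sentences from the same Doc. The names analyze_doc and spacy_analyzer_single_pass are hypothetical, and passing nlp in as a parameter is an assumption made here so the function does not depend on a global model.

```python
def analyze_doc(doc):
    """Build the tokens/sentences dictionary from an already-parsed Doc."""
    tokens = [{"text": token.text, "lemma": token.lemma_} for token in doc]
    sentences = [sent.text for sent in doc.sents]
    return {"tokens": tokens, "sentences": sentences}


def spacy_analyzer_single_pass(nlp, text):
    """Hypothetical variant of spacy_analyzer that runs the pipeline once."""
    doc = nlp(text)  # the only call to the pipeline
    return analyze_doc(doc)


# Intended usage, assuming spaCy and the en_core_web_md model are installed:
#   import spacy
#   nlp = spacy.load("en_core_web_md")
#   result = spacy_analyzer_single_pass(nlp, "I worked for 3 years in a software company.")
#   print("Tokens:", result["tokens"])
#   print("Sentences:", result["sentences"])
```

The returned dictionary has the same keys as the one produced by spacy_analyzer, so the two print statements from the solution work unchanged. The saving is likely to matter most on long texts or when many texts are processed in a loop.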

Solution explanation

Let's take a look at this solution:

  • Lines 1 and 2: We import the spacy library and load the en_core_web_md English model using spacy.load. ...