What is the nltk.stem.api module?

Text normalization, the process of preparing words and documents for further processing, is one of the most fundamental tasks in natural language processing (NLP). Two common text normalization techniques are stemming and lemmatization. nltk.stem is one of the most widely used Python packages for both.

Examples of Stemming and Lemmatization:

The words cars, car’s, CAR, Car, and cars’ are all derived from the root, or stem, word car. The goal of the tools in nltk.stem is to map variants like these to the common form car.

nltk.stem.api module

class nltk.stem.api.StemmerI

Bases: object

StemmerI is the interface that NLTK stemmers implement. It defines the operation of removing morphological affixes from words, a process known as stemming.

@abstractmethod
stem(token)
# the abstract method of this class

The stem method strips the affixes from the token passed to it and returns the stem. Because the method is abstract, each concrete stemmer class supplies its own implementation.

Parameters: token (str) – the string to be stemmed.
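To make the interface concrete, here is a minimal sketch of a custom stemmer built on StemmerI. The SuffixStemmer class and its suffix list are hypothetical and exist only for illustration; they are not part of NLTK. The sketch also assumes that a small stand-in with the same shape is acceptable when NLTK is not installed, so that the example still runs.

```python
from abc import ABC, abstractmethod

try:
    from nltk.stem.api import StemmerI  # the real interface, if NLTK is installed
except ImportError:
    class StemmerI(ABC):  # stand-in with the same shape, for illustration only
        @abstractmethod
        def stem(self, token):
            """Strip affixes from the token and return the stem."""


class SuffixStemmer(StemmerI):
    """Hypothetical toy stemmer: lowercases a token and strips one known suffix."""

    SUFFIXES = ("ing", "ed", "s")  # illustrative only, far from complete

    def stem(self, token):
        word = token.lower()
        for suffix in self.SUFFIXES:
            # keep at least three characters so short words survive intact
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word


stemmer = SuffixStemmer()
stems = [stemmer.stem(w) for w in ["cars", "Car", "jumping", "is"]]
print(stems)
```

Any class that implements stem(token) in this way can be used wherever a StemmerI is expected. Real stemmers such as PorterStemmer apply far more elaborate rules than this toy suffix list.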

Code

For English text, either PorterStemmer or LancasterStemmer can be used; both are widely used stemming algorithms. The key differences between them are:

LancasterStemmer is more aggressive in its approach than PorterStemmer.
PorterStemmer is computationally more intensive.
LancasterStemmer is faster and reduces computation time when dealing with large datasets.

# importing libraries
>>> from nltk.stem import PorterStemmer
>>> from nltk.stem import LancasterStemmer

# instance of PorterStemmer
>>> porter = PorterStemmer()
>>> porter.stem("cats")
'cat'
>>> porter.stem("trouble")
'troubl'

# instance of LancasterStemmer
>>> lancaster = LancasterStemmer()
>>> lancaster.stem("cats")
'cat'
>>> lancaster.stem("trouble")
'troubl'
