Trusted answers to developer questions
Trusted Answers to Developer Questions

Related Tags

nltk

What is the nltk.stem.api module?

Sarvech Qadir

Grokking Modern System Design Interview for Engineers & Managers

Ace your System Design Interview and take your career to the next level. Learn to handle the design of applications like Netflix, Quora, Facebook, Uber, and many more in a 45-min interview. Learn the RESHADED framework for architecting web-scale applications by determining requirements, constraints, and assumptions before diving into a step-by-step design process.

Answers Code

Text normalization (i.e., preparing text, words, and documents) is one of the most fundamental tasks of the Natural Language processing field. These text normalization techniques are called Stemming and Lemmatization. nltk.stem is one of the most widely used libraries in Python for Stemming and Lemmatization.

Examples of Stemming and Lemmatization:

Now, the words cars, car’s, CAR, Car, and cars’ are all derived from the rootor stem word car. With nltk.stem, all of these words will be mapped to car.

widget

nltk.stem.api module

class nltk.stem.api.StemmerI

Bases: object

The class above is used to perform the process of stemming from words.

@abstractmethod 
stem(token) 
// abstract method of this classes 

The method above is used to strip the affixes from the passed parameter token and return the stem of the token.

Parameters token: string – The token refers to the string passed as a parameter that should be stemmed.

Code

In the English language, either PorterStammer or LancasterStammer can be used (both of which are widely used in the stemming algorithm). The key differences between them are:

  1. LancasterStammer is more aggressive in approach than PorterStammer.

  2. PorterStammer is computationally more intensive.

  3. LancasterStammer is faster and reduces computational time when dealing with datasets.

// importing libraries

from nltk.stem import PorterStemmer
from nltk.stem import LancasterStemmer
// instance of PorterStemmer
>> porter = PorterStemmer()
>> porter.stem("cats")
cat
>> porter.stem("trouble")
troubl


// instance of LancasterStemmer
>> lancaster=LancasterStemmer()
>> lancaster.stem("cats")
cat
>> lancaster.stem("trouble")
troubl

RELATED TAGS

nltk

CONTRIBUTOR

Sarvech Qadir
Copyright ©2022 Educative, Inc. All rights reserved

Grokking Modern System Design Interview for Engineers & Managers

Ace your System Design Interview and take your career to the next level. Learn to handle the design of applications like Netflix, Quora, Facebook, Uber, and many more in a 45-min interview. Learn the RESHADED framework for architecting web-scale applications by determining requirements, constraints, and assumptions before diving into a step-by-step design process.

Answers Code
Keep Exploring