What is part-of-speech (PoS) tagging?

Part-of-speech (PoS) tagging is the process of labeling each word in a text with its grammatical category, such as noun, verb, adjective, adverb, preposition, conjunction, pronoun, or interjection.

How it works

Let's try to understand how PoS tagging works through this example:

[Figure: PoS tagging example for the sentence "I work at Educative"]

In this example, "I" is labeled as a personal pronoun (PRP), "work" as a non-third-person singular present verb (VBP), "at" as a preposition (IN), and "Educative" as a singular noun (NN).
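
To make the tag abbreviations concrete, here is a minimal sketch that pairs each word from the example with a readable description. The TAG_DESCRIPTIONS dictionary and the describe() helper are illustrative names, not part of any library, and they cover only the four Penn Treebank tags used in this example.

```python
# Hypothetical lookup table covering only the tags in the example above
# (descriptions follow the Penn Treebank tag set).
TAG_DESCRIPTIONS = {
    "PRP": "personal pronoun",
    "VBP": "verb, non-third-person singular present",
    "IN": "preposition or subordinating conjunction",
    "NN": "noun, singular or mass",
}

# The example sentence as (word, tag) pairs, as described in the text.
tagged_example = [("I", "PRP"), ("work", "VBP"), ("at", "IN"), ("Educative", "NN")]


def describe(tagged_words):
    """Return one readable line per (word, tag) pair."""
    return [
        f"{word}: {tag} ({TAG_DESCRIPTIONS.get(tag, 'unknown tag')})"
        for word, tag in tagged_words
    ]


for line in describe(tagged_example):
    print(line)
```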

Implementation

PoS tagging can be implemented using the nltk library (the Natural Language Toolkit for Python). We need to follow these steps to implement PoS tagging:

Step 1

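We first import the relevant libraries. We can do this using the following code snippet:

import nltk
from nltk import word_tokenize
# One-time setup if the models are missing (resource names vary by NLTK version):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")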

Step 2

Next, we give the text that needs to be labeled as the input, and tokenize it. The word_tokenize() function in nltk tokenizes the text into separate words. We can do this using the following code snippet:

text = "I love reading Educative Answers."
tokens = word_tokenize(text)
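
To get a feel for what tokenization produces, the sketch below approximates it with a regular expression from Python's standard library. This is a simplified stand-in and not the algorithm that word_tokenize() uses, and the simple_tokenize() name is hypothetical. Its point is that punctuation becomes a token of its own.

```python
import re


def simple_tokenize(text):
    """Split text into runs of word characters and single punctuation marks.

    A rough approximation for illustration only; nltk's word_tokenize()
    handles contractions, abbreviations, and many other cases differently.
    """
    return re.findall(r"\w+|[^\w\s]", text)


approx_tokens = simple_tokenize("I love reading Educative Answers.")
print(approx_tokens)
# ['I', 'love', 'reading', 'Educative', 'Answers', '.']
```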

Step 3

In this step, we label the words with tags. This can be done by using the pos_tag() function, which takes the list of tokens and returns a list of (word, tag) pairs. The following snippet demonstrates this step:

print("Parts of Speech: ", nltk.pos_tag(tokens))

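This prints a list of (word, tag) tuples similar to the following. The exact tags depend on the version of the tagger model, and the sentence-final period, which is tokenized and tagged as well, is omitted here:

Parts of Speech:  [('I', 'PRP'), ('love', 'VB'),
('reading', 'VB'), ('Educative', 'NN'), ('Answers', 'NNS')]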
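
Because the result is an ordinary list of (word, tag) tuples, it can be processed with plain Python. The sketch below hardcodes the pairs from the output above rather than calling nltk, and the words_with_prefix() helper is a hypothetical name. It picks out nouns and verbs by matching on the start of each tag.

```python
# The (word, tag) pairs copied from the output above; in practice this
# list would come from nltk.pos_tag(tokens).
tagged = [
    ("I", "PRP"),
    ("love", "VB"),
    ("reading", "VB"),
    ("Educative", "NN"),
    ("Answers", "NNS"),
]


def words_with_prefix(tagged_words, prefix):
    """Return the words whose tag starts with the given prefix."""
    return [word for word, tag in tagged_words if tag.startswith(prefix)]


nouns = words_with_prefix(tagged, "NN")  # matches NN, NNS, NNP, NNPS
verbs = words_with_prefix(tagged, "VB")  # matches VB, VBD, VBG, VBP, ...
print("Nouns:", nouns)
print("Verbs:", verbs)
```

Matching on a prefix rather than an exact tag is a convenience here, since the Penn Treebank tag set encodes number and tense as suffixes on a shared stem.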

Uses of PoS tagging

PoS tagging is used in the following NLP tasks:

Named entity recognition (NER)
Sentiment analysis
Word-sense disambiguation
Question answering

PoS tagging is therefore an integral part of NLP. It is especially useful for telling apart the meanings of a word that depend on its part of speech, such as "book" used as a noun versus "book" used as a verb.
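
As a toy illustration of how a tag can separate two meanings of the same word, the sketch below looks up a sense by word and coarse tag. The SENSES table, its glosses, and the lookup_sense() helper are all made up for this example, and real word-sense disambiguation systems rely on far richer context.

```python
# Hypothetical mini-lexicon: the same word maps to different senses
# depending on whether it is tagged as a noun or a verb.
SENSES = {
    ("book", "NN"): "a written or printed work",
    ("book", "VB"): "to reserve something in advance",
}


def lookup_sense(word, tag):
    """Pick a sense using the word and the first two letters of its tag."""
    coarse_tag = tag[:2]  # e.g. NNS -> NN, VBP -> VB
    return SENSES.get((word.lower(), coarse_tag), "no sense recorded")


# "I read a book" vs. "I book a flight", with tags written by hand.
print(lookup_sense("book", "NN"))
print(lookup_sense("book", "VBP"))
```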
