What is NLP sequencing?
Natural Language Processing (NLP) sequencing takes a sequence of words and converts it into a sequence of numbers. Once the text is in numeric form, we can apply further data processing techniques that operate on numbers rather than raw strings.
Example
The following example explains how NLP sequencing works:
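input_text = ['This is Educative', 'We love Educative', 'This is an Educative Answer']
word_index: {'educative': 1, 'this': 2, 'is': 3, 'we': 4, 'love': 5, 'an': 6, 'answer': 7}
sequences: [[2, 3, 1], [4, 5, 1], [2, 3, 6, 1, 7]]
Each unique word is assigned an integer according to its frequency in the input text: the most frequent word gets 1, and words with equal counts are numbered in order of first appearance. Words are lowercased before counting, as the Keras Tokenizer does by default, so 'Educative' is stored as 'educative'. Each input sentence is then rewritten as the sequence of integers assigned to its words.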
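The frequency rule described above can be sketched in plain Python, without any library. This is a minimal illustration, not how TensorFlow implements it: the helper names `build_word_index` and `texts_to_sequences` are hypothetical, and lowercasing plus whitespace splitting are simplifying assumptions (no punctuation handling).

```python
from collections import Counter


def build_word_index(texts):
    """Map each unique lowercased word to an integer, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    # sorted() is stable and Counter keeps first-seen order,
    # so words with equal counts stay in order of first appearance.
    ranked = sorted(counts, key=lambda word: -counts[word])
    return {word: index for index, word in enumerate(ranked, start=1)}


def texts_to_sequences(texts, word_index):
    """Replace every known word in each text with its integer."""
    return [
        [word_index[word] for word in text.lower().split() if word in word_index]
        for text in texts
    ]


input_text = ['This is Educative', 'We love Educative', 'This is an Educative Answer']
word_index = build_word_index(input_text)
sequences = texts_to_sequences(input_text, word_index)

print(word_index)
print(sequences)
```

Running the sketch on the three example sentences should give the same word index and sequences as listed in the example.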
How to implement NLP sequencing
We can do the sequencing with the Tokenizer class from TensorFlow's Keras API. The following code demonstrates NLP sequencing:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # suppress TensorFlow log messages

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer

input_text = [
    'I love Educative',
    'I am reading an Educative Answer',
    'Educative Answer on NLP sequencing'
]
tokenizer = Tokenizer(num_words=50)  # keep only the most frequent words
tokenizer.fit_on_texts(input_text)  # build the vocabulary from the sentences

word_index = tokenizer.word_index  # dictionary mapping word -> integer
sequences = tokenizer.texts_to_sequences(input_text)  # sentences -> integer lists

print(word_index)
print('\n')
print(sequences)
Explanation
Lines 1–2: We set the TF_CPP_MIN_LOG_LEVEL environment variable to suppress TensorFlow's log messages, including warnings.
Lines 4–6: We import the necessary libraries.
Line 13: We create a Tokenizer object. With num_words = 50, only the 49 most frequent words are kept when sequences are generated.
Line 14: The fit_on_texts() function updates the internal vocabulary based on the list of texts.
Line 16: We retrieve the word index from the word_index attribute, which is a dictionary mapping each word to its integer.
Line 17: The input list input_text is converted into integer sequences by the texts_to_sequences() function.
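To check a set of sequences, it can help to map the integers back to words. The sketch below does this in plain Python by inverting the word index. The dictionary and sequences are hard-coded here as an assumption: they are what the frequency rule would give for the three sentences in the code above, so compare them with the output of your own run. The helper name `decode_sequences` is hypothetical.

```python
# Assumed values: derived by hand from the frequency rule, not captured
# from an actual TensorFlow run. Replace them with your printed output.
word_index = {
    'educative': 1, 'i': 2, 'answer': 3, 'love': 4, 'am': 5,
    'reading': 6, 'an': 7, 'on': 8, 'nlp': 9, 'sequencing': 10,
}
sequences = [[2, 4, 1], [2, 5, 6, 7, 1, 3], [1, 3, 8, 9, 10]]

# Invert the mapping: integer -> word.
index_word = {index: word for word, index in word_index.items()}


def decode_sequences(seqs, index_word):
    """Turn each integer sequence back into a space-separated string."""
    return [
        ' '.join(index_word[i] for i in seq if i in index_word)
        for seq in seqs
    ]


decoded = decode_sequences(sequences, index_word)
print(decoded)
```

The decoded text is lowercased because the original casing is not stored in the word index. The Keras Tokenizer also exposes an index_word attribute and a sequences_to_texts() method that serve the same purpose.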
NLP sequencing is used as a preprocessing step for tasks such as machine translation.