Prefix Replacement
Explore how to use trie data structures to replace words in a sentence with the shortest valid prefixes from a given list. Learn the algorithm for building the trie, performing prefix searches, and applying this approach to modify sentences efficiently, enhancing your understanding of prefix search techniques.
Problem statement
We can append some other words to a prefix to frame new words. For example, the prefix re can be followed by words start and cap to form words restart or recap.
Given a list consisting of multiple prefixes and a sentence, replace all the words in the sentence with the prefix forming it. If a word can be replaced by more than one prefix, replace it with the prefix that has the shortest length.
Return the modified sentence after the replacement.
Example 1
Sample Input
prefixes: ["cat", "sat", "mat"]sentence: "the cattle sat on the mattress"
Sample output
"the cat sat on the mat"
Explanation
For the sentence "the cattle sat on the mattress", the words "cattle", "sat" and "mattress" are replaced by their respective matching prefixes "cat", "sat" and "mat", while the other words remain the same.
Example 2
Sample Input
prefixes: ["we", "west", "week"]sentence = "we love to watch western movies on weekend"
Sample output
"we love to watch we movies on we"
Explanation
For the sentence "we love to watch western movies on weekend", the word "we" is replaced by it’s shortest matching prefix "we". Similarly, the words "western" and "weekend" are also replaced by their shortest matching prefix "we" instead of the prefixes "west" and "week" respectively.
Try it yourself
Try to solve the problem yourself before reading the solution.
Intuition
For every word of the given sentence, we need to check if it has a valid prefix root present in the prefixes list. Using the brute-force approach would involve comparing each word of the sentence with each word of the prefixes list to check if there is a valid prefix root present for the given word. This procedure would incur a high time complexity.
The problem is related to strings and prefix search. Tries are an efficient data structure for prefix search–related problems. We can store the words of the prefixes list in a trie and leverage the prefix searching capability of finding valid prefix roots for all the words of the given sentence.
Algorithm
The summary of the algorithm is given below.
Step 1: Insertion of prefix words in a trie
Insert all the words of the prefixes list into a trie. Set the value of isEndOfWord as true for the trie node associated with the last character of the prefix word.
Step 2: Finding prefix roots and replacing sentence words
Iterate through every word of the given sentence and perform the following steps:
If any prefix of the extracted word is not present in the trie, we add the entire word as it is to the final answer string.
If while traversing the characters of the word in the trie, we encounter a node for which
isEndOfWordistrue, it implies we have found a prefix root for the given word. So we add the current prefix to the final answer string.
Example walkthrough
Let's try to understand this in a better way with the given example sentence below:
“the cattle sat on the mattress”
Searching for the word
thein trie, we can see thattis not present as a child of the root node, so we addtheas it is to the final answer string.Searching for the word
cattlein the trie, we can see thatcis present as a child of the root node. While traversing the trie path, we're able to getcatas a prefix for the wordcattle. So we addcatto the final answer string instead of the wordcattle.Following the same pattern, we replace
cattlewithcat,satwithsat, andmattresswithmat, while the rest of the words in the sentence string remain the same.
Visualization
The following picture explains the construction of the trie using the dictionary words:
Complexity analysis
Variables
Numbers of words in the
sentence=. Number of words in the
prefixeslist =. The average length of a prefix = the average length of a word in the given sentence =
.
Time complexity
Operation-wise time complexity
Operation | Time Complexity |
Insertion of a prefix word in the trie | |
Insertion of all the words of the prefixes list in the trie | |
Searching for the prefix root of a given word in the trie | |
The words of a sentence in the trie |
Explanation
The solution can be divided into two parts: insertion and searching in the trie.
Inserting a single word in the trie takes time equal to the length of the word. In the worst case, no letter of the current word exists in the trie, and we need to create new trie nodes for all letters. This adds a time complexity of
. So the creation of a trie of words takes times proportional to the number of words in the prefixeslist times the average length of the prefix word, which is. While searching for a prefix or a word in the trie, we traverse the trie downward node by node until either the word length or the trie path is exhausted, which adds a time complexity of
. The prefix search operation is performed for all the words of the given sentence. So, the total time complexity of searching the prefixes and making replacements in the sentence becomes proportional to the number of words in thesentencetimes the average length of the prefix, which is.
So, the total time complexity becomes
Space complexity
Operation-wise space complexity
Operation | Space Complexity |
Insertion of a word in the trie | |
Insertion of all words of the dictionary in the trie | |
Storing the sentence reconstructed by replacing words with their prefix roots |
Explanation
The space complexity can be analyzed in two parts: the insertion of prefixes in the trie and reconstructing the output string.
In the worst case, all the letters of all the prefix words can be different. This implies that there will be no common prefix paths, and therefore new trie nodes will be created for all the letters of the word. Inserting a string of length
in the trie adds a space complexity of . We insert all the strings of the prefixeslist in the trie. This adds a space complexity of. We construct an answer string where the words are replaced by their valid prefix roots. This operation adds a space complexity equal to the number of words in the
sentencetimes the average length of the word, which is.
While searching for a prefix in the trie, we traverse the trie downward until either the word length or the trie path is exhausted. No new trie nodes are created in this process, so it does not add space complexity.
So the total space complexity becomes