Search⌘ K
AI Features

Prefix Replacement

Explore how to use trie data structures to replace words in a sentence with the shortest valid prefixes from a given list. Learn the algorithm for building the trie, performing prefix searches, and applying this approach to modify sentences efficiently, enhancing your understanding of prefix search techniques.

Problem statement

We can append some other words to a prefix to frame new words. For example, the prefix re can be followed by words start and cap to form words restart or recap.

Given a list consisting of multiple prefixes and a sentence, replace all the words in the sentence with the prefix forming it. If a word can be replaced by more than one prefix, replace it with the prefix that has the shortest length.

Return the modified sentence after the replacement.

Example 1

Sample Input

prefixes: ["cat", "sat", "mat"]
sentence: "the cattle sat on the mattress"

Sample output

"the cat sat on the mat"

Explanation

For the sentence "the cattle sat on the mattress", the words "cattle", "sat" and "mattress" are replaced by their respective matching prefixes "cat", "sat" and "mat", while the other words remain the same.

Example 2

Sample Input

prefixes: ["we", "west", "week"]
sentence = "we love to watch western movies on weekend"

Sample output

"we love to watch we movies on we"

Explanation

For the sentence "we love to watch western movies on weekend", the word "we" is replaced by it’s shortest matching prefix "we". Similarly, the words "western" and "weekend" are also replaced by their shortest matching prefix "we" instead of the prefixes "west" and "week" respectively.

Try it yourself

Try to solve the problem yourself before reading the solution.

C++
Java
#include <iostream>
#include <vector>
using namespace std;
string replaceWords(vector<string> dictionary, string sentence) {
// your code goes here
return "";
}
Solve prefix replacement in C++

Intuition

For every word of the given sentence, we need to check if it has a valid prefix root present in the prefixes list. Using the brute-force approach would involve comparing each word of the sentence with each word of the prefixes list to check if there is a valid prefix root present for the given word. This procedure would incur a high time complexity.

The problem is related to strings and prefix search. Tries are an efficient data structure for prefix search–related problems. We can store the words of the prefixes list in a trie and leverage the prefix searching capability of finding valid prefix roots for all the words of the given sentence.

Algorithm

The summary of the algorithm is given below.

Step 1: Insertion of prefix words in a trie

Insert all the words of the prefixes list into a trie. Set the value of isEndOfWord as true for the trie node associated with the last character of the prefix word.

Step 2: Finding prefix roots and replacing sentence words

Iterate through every word of the given sentence and perform the following steps:

  • If any prefix of the extracted word is not present in the trie, we add the entire word as it is to the final answer string.

  • If while traversing the characters of the word in the trie, we encounter a node for which isEndOfWord is true, it implies we have found a prefix root for the given word. So we add the current prefix to the final answer string.

Example walkthrough

Let's try to understand this in a better way with the given example sentence below:  

“the cattle sat on the mattress”

  • Searching for the word the in trie, we can see that t is not present as a child of the root node, so we add the as it is to the final answer string.  

  • Searching for the word cattle in the trie, we can see that c is present as a child of the root node. While traversing the trie path, we're able to get cat as a prefix for the word cattle. So we add cat to the final answer string instead of the word cattle.

  • Following the same pattern, we replace cattle with cat, sat with sat, and mattress with mat, while the rest of the words in the sentence string remain the same. 

Visualization

The following picture explains the construction of the trie using the dictionary words:

Complexity analysis

Variables

  • Numbers of words in the sentence = SS.

  • Number of words in the prefixes list = PP.

  • The average length of a prefix = the average length of a word in the given sentence = WW.

Time complexity

Operation-wise time complexity

Operation

Time Complexity

Insertion of a prefix word in the trie

O(P)O(P)

Insertion of all the words of the prefixes list in the trie

O(PW)O(PW)

Searching for the prefix root of a given word in the trie

O(W)O(W)

The words of a sentence in the trie

O(SW)O(SW)

Explanation

The solution can be divided into two parts: insertion and searching in the trie.

  1. Inserting a single word in the trie takes time equal to the length of the word. In the worst case, no letter of the current word exists in the trie, and we need to create new trie nodes for all letters. This adds a time complexity of O(W)O(W). So the creation of a trie of words takes times proportional to the number of words in the prefixes list times the average length of the prefix word, which is O(P×W)O(P \times W).

  2. While searching for a prefix or a word in the trie, we traverse the trie downward node by node until either the word length or the trie path is exhausted, which adds a time complexity of O(W)O(W). The prefix search operation is performed for all the words of the given sentence. So, the total time complexity of searching the prefixes and making replacements in the sentence becomes proportional to the number of words in the sentence times the average length of the prefix, which is O(S×W)O(S \times W).

So, the total time complexity becomes O(PW+SW)O(PW+SW).

Space complexity

Operation-wise space complexity

Operation

Space Complexity

Insertion of a word in the trie

O(W)O(W)

Insertion of all words of the dictionary in the trie

O(PW)O(PW)

Storing the sentence reconstructed by replacing words with their prefix roots

O(SW)O(SW)

Explanation

The space complexity can be analyzed in two parts: the insertion of prefixes in the trie and reconstructing the output string.

  1. In the worst case, all the letters of all the prefix words can be different. This implies that there will be no common prefix paths, and therefore new trie nodes will be created for all the letters of the word. Inserting a string of length WW in the trie adds a space complexity of O(W)O(W). We insert all the PP strings of the prefixes list in the trie. This adds a space complexity of O(PW)O(PW).

  2. We construct an answer string where the words are replaced by their valid prefix roots. This operation adds a space complexity equal to the number of words in the sentence times the average length of the word, which is O(SW)O(SW).

While searching for a prefix in the trie, we traverse the trie downward until either the word length or the trie path is exhausted. No new trie nodes are created in this process, so it does not add space complexity.

So the total space complexity becomes O(PW+SW)O(PW+SW).