...
/Feature #2: Design Search Autocomplete System
Feature #2: Design Search Autocomplete System
Implementing the "Design Search Autocomplete System" feature for our "Search Engine" project.
We'll cover the following...
Solution
To design this system, we will again use the trie data structure. Instead of simply storing the words in the prefix tree, as we did in the WordDictionary, we will now store the query strings. The AutocompleteSystem class will act as a trie that keeps a record of the previous queries and assigns them a rank based on their number of occurrences.
We will implement the AutocompleteSystem class as follows:
-
Constructor: In the constructor, we will feed the historical data into the system and create a trie out of it. We will initialize the
rootnode of trie and call theaddRecord()function to add all the records. -
addRecord() function: This function inserts records in the trie by creating new nodes. Its functionality is similar to theinsertWord()function that we discussed in Feature #1: Store and Fetch Words. Each node of the trie will have:- A
childrendictionary - A Boolean called
isEndto set the end of the query sentence - A new variable called
datathat is optional, but we can use it to store the whole query sentence in the last character of the sentence. This will increase space complexity but can make computation easier. - A
rankvariable to store the number of occurrences
In the code below, you will notice that we are storing the rank as a negative value. There is a valid reason for doing this unintuitive step. Later, you will see that we will be using the
sorted()method to obtain the top three results, and this negative rank will play a significant role. This will be explained in more detail in the explanation for theautoComplete()function. - A
-
search() function: This function checks to see if the first character exists in itschildrenbeginning with the root node. If it exists, we move on to the node with the first character and check itschildrenfor the next character. If the node corresponding to a character is not found, we return[]. If the search string is found, thedfs()helper function is called on the node with the last character of the input query. -
dfs()function: This function is the helper function that returns all the possible queries in the record that match the input. First, it will check if the node hasisEndset toTrue; if it does, the node’srankanddatawill be appended as a tuple in an output listret. Then, DFS exploration will be performed on all children nodes. All the possible queries will be added inretand returned at the end. -
autoComplete()function: This function checks if the input string is not the end of string delimiter"#". If it is not the end of the input string, we append the value to thekeywordmember variable. Then, we call thesearch()function, which returns the list of tuples as discussed above. On the other hand, if the input string is the end of the input, we add the value present inkeywordto the trie by callingaddRecord(). Next, the value of thekeywordvariable is reset to an empty string. Before returning, you can see that we do some computation on theresultlist. The list that we received from thesearch()function was a list of tuples withrankandsentenceas elements. ...