Grokking the Coding Interview Patterns

Solution: Alien Dictionary

Constraints:

$1 \leq$ words.length $\leq 10^3$
$1 \leq$ words[i].length $\leq 20$
All characters in words[i] are English lowercase letters.

Naive approach

The naive approach is to generate every possible ordering of the letters in the alien language and then check each one, character by character, against the dictionary to keep those that satisfy its dependencies. There’d be $O(u!)$ permutations, where $u$ is the number of unique letters in the alien language, and for each permutation, we’d have to check whether it’s consistent with the order of the given words. That requires comparing against the dictionary words repeatedly.

This is very expensive since the number of possible orders grows factorially ($u!$) and only a handful of them are valid. On top of that, there’d be additional effort to compare each one against the dictionary. The time complexity for this approach is at least $O(u!)$, not counting the cost of each check. The space complexity is $O(1)$, since $u$ is at most 26 and only one candidate ordering needs to be held at a time.
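To make the cost concrete, here is a minimal sketch of the naive approach in Python. The function names `is_sorted_under` and `alien_order_brute_force` are illustrative and don't come from the lesson. The sketch returns an empty string when no ordering fits, which is an assumption about the expected output.

```python
from itertools import permutations


def is_sorted_under(words, order):
    """Return True if `words` is sorted when letters are ranked by `order`."""
    rank = {ch: i for i, ch in enumerate(order)}
    for first, second in zip(words, words[1:]):
        for a, b in zip(first, second):
            if a != b:
                # The first differing position decides the order of the pair.
                if rank[a] > rank[b]:
                    return False
                break
        else:
            # No differing position: the shorter word (the prefix) must come first.
            if len(first) > len(second):
                return False
    return True


def alien_order_brute_force(words):
    """Try every permutation of the unique letters (hypothetical helper)."""
    letters = sorted(set("".join(words)))
    for candidate in permutations(letters):
        if is_sorted_under(words, candidate):
            return "".join(candidate)
    return ""  # assumption: an empty string signals that no ordering is valid
```

Even with only 11 unique letters, the loop may have to examine up to $11!$ (about 40 million) candidates, which is why this approach doesn't scale.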

Optimized approach using topological sort

We can solve this problem using the topological sort pattern. Topological sort is used to find a linear ordering of elements that have dependencies on or priority over each other. For example, if $A$ is dependent on $B$ or $B$ has priority over $A$ , then $B$ is listed before $A$ in topological order.

Using the list of words, we identify the relative precedence order of the letters in the words and generate a graph to represent this ordering. We then traverse this graph using breadth-first search to find the letters’ order.

We can essentially map this problem to a graph problem, but before exploring the exact details of the solution, there are a few things that we need to keep in mind:

The letters within a word don’t tell us anything about the relative order. For example, the word “educative” in the list doesn’t tell us that the letter “e” is before the letter “d.”
The input can contain words followed by their prefix, such as “educated” and then “educate.” These cases will never result in a valid alphabet because, in a correctly sorted dictionary, a prefix always comes before any longer word that starts with it. We need to make sure our solution detects these cases correctly.
There can be more than one valid alphabet ordering. It’s fine for our algorithm to return any one of them.
The output dictionary must contain all unique letters within the words list, including those that could be in any position within the ordering. It shouldn’t contain any additional letters that weren’t in the input.
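Two of the points above translate directly into small helpers. The following is a sketch only: the names `has_prefix_violation` and `unique_letters` are hypothetical and aren't taken from the lesson.

```python
def has_prefix_violation(words):
    """Return True if some word is immediately followed by its own proper prefix."""
    for first, second in zip(words, words[1:]):
        if len(first) > len(second) and first.startswith(second):
            return True
    return False


def unique_letters(words):
    """Collect every letter that must appear in the output, and nothing else."""
    return set("".join(words))
```

For example, `has_prefix_violation(["educated", "educate"])` reports a violation, while the reverse order of the same two words doesn't.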

Note: In the following section, we will gradually build the solution. Alternatively, you can skip straight to just the code.

For the graph problem, we can break this particular problem into three parts:

Extract the necessary information to identify the dependency rules from the words. For example, in the words [“patterns”, “interview”], the letter “p” comes before “i.”
With the gathered information, we can put these dependency rules into a directed graph with the letters as nodes and the dependencies (order) as the edges.
Lastly, we can sort the graph nodes topologically to generate the letter ordering (dictionary).

Let’s look at each part in more depth.

Part 1: Identifying the dependencies

Let’s start with example words and observe the initial ordering through simple reasoning:

["mzosr", "mqov", "xxsvq", "xazv", "xazau", "xaqu", "suvzu", "suvxq", "suam", "suax", "rom", "rwx", "rwv"]

As in the English language dictionary, where all the words starting with “a” come at the start followed by the words starting with “b,” “c,” “d,” and so on, we can expect the first letters of each word to be in alphabetical order.

["m", "m", "x", "x", "x", "x", "s", "s", "s", "s", "r", "r", "r"]

Removing the duplicates, we get the following:

["m", "x", "s", "r"]

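Following the intuition explained above, we can assume that the first letters of the words are in alphabetical order, which gives us the relations “m -> x”, “x -> s”, and “s -> r”. Applying the same reasoning at the first position where two adjacent words differ gives the remaining relations, such as “x -> a”.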

Note: Notice that we didn’t mention rules such as “m -> a”. This is fine because we can derive this relation from “m -> x”, “x -> a”.

This is it for the first part. Let’s put the pieces that we have in place.
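The reasoning in this part can be sketched in code as follows. This is a minimal illustration, assuming that relations are stored as `(before, after)` tuples. The helper name `extract_relations` is hypothetical, and the sketch deliberately ignores the prefix case discussed earlier.

```python
def extract_relations(words):
    """Derive one (before, after) pair from each adjacent pair of words."""
    relations = []
    for first, second in zip(words, words[1:]):
        for a, b in zip(first, second):
            if a != b:
                # Only the first differing position carries ordering information.
                if (a, b) not in relations:
                    relations.append((a, b))
                break
    return relations


words = ["mzosr", "mqov", "xxsvq", "xazv", "xazau", "xaqu", "suvzu",
         "suvxq", "suam", "suax", "rom", "rwx", "rwv"]
relations = extract_relations(words)
```

Only adjacent words are compared, so a relation such as “m -> a” is never recorded directly.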

Part 2: Representing the dependencies

We now have a set of relations mentioning the relative order of the pairs of letters:

["z -> q", "m -> x", "x -> a", "x -> v", "x -> s", "z -> x", "v -> a", "s -> r", "o -> w"]

Now the question arises, how can we put these relations together? It might be tempting to start chaining all these together. Let’s look at a few possible chains:

Part 3: Generating the dictionary

As we can see from the graph, four of the letters have no incoming arrows. This means that there are no letters that have to come before any of these four.

Remember: There could be multiple valid dictionaries, and if there are, then it’s fine for us to return any of them.

Therefore, a valid start to the ordering we return would be as follows:

["o", "m", "u", "z"]

We can now remove these letters and edges from the graph because any other letters that required them first will now have this requirement satisfied.

We can place the final two letters in our output list and return the ordering:

["o", "m", "u", "z", "x", "q", "w", "v", "s", "a", "r"]
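The peeling process described above (repeatedly taking letters with no incoming arrows and removing their outgoing edges) can be sketched with a queue. This is one common way to write a breadth-first topological sort, not necessarily the lesson's exact code. The function name `topological_order` and the empty-string result for a cycle are assumptions.

```python
from collections import defaultdict, deque


def topological_order(letters, relations):
    """Order `letters` so that every (before, after) pair in `relations` holds."""
    successors = defaultdict(list)
    in_degree = {letter: 0 for letter in letters}
    for before, after in relations:
        successors[before].append(after)
        in_degree[after] += 1

    # Start with every letter that has no incoming arrows.
    queue = deque(sorted(l for l in in_degree if in_degree[l] == 0))
    order = []
    while queue:
        letter = queue.popleft()
        order.append(letter)
        # Removing `letter` satisfies one requirement of each successor.
        for nxt in successors[letter]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    if len(order) < len(in_degree):
        return ""  # assumption: a cycle means no valid ordering exists
    return "".join(order)
```

Because several letters can be free at the same time, the result may differ from the ordering shown above while still being valid.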

Let’s now review how we can implement this approach.

Identifying the dependencies and representing them in the form of a graph is pretty straightforward. We extract the relations and insert them into an adjacency list:
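A minimal sketch of this step is shown below. It assumes the adjacency list is a dictionary mapping each letter to the set of letters that must come after it, alongside a count of incoming edges per letter. The names `build_graph`, `adj_list`, and `in_degree` are illustrative, and returning `None` for a prefix violation is an assumption.

```python
def build_graph(words):
    """Build (adj_list, in_degree) from the words, or None on a prefix violation."""
    # Every letter gets a node, even if it never appears in a relation.
    adj_list = {ch: set() for word in words for ch in word}
    in_degree = {ch: 0 for ch in adj_list}

    for first, second in zip(words, words[1:]):
        for a, b in zip(first, second):
            if a != b:
                # Record each edge once so in-degree counts stay accurate.
                if b not in adj_list[a]:
                    adj_list[a].add(b)
                    in_degree[b] += 1
                break
        else:
            # No differing letter: a longer word before its prefix is invalid.
            if len(second) < len(first):
                return None
    return adj_list, in_degree
```

Storing successors in a set avoids counting the same relation twice when it is derived from more than one pair of words, as happens with “z -> q” in the example.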
