...

/

Introduction

Introduction

Real-world problems

The following are a few real-world problems that can be solved using the topological sort pattern.

Process scheduling

When a system is booted, the operating system needs to run a number of processes. Some processes have dependencies, which are specified using ordered pairs of the form (a, b); this means that process b must be run before process a. Some processes don’t have any dependencies, meaning they don’t have to wait for any processes to finish. Additionally, there cannot be any circular dependencies between the processes like (a, b)(b, a). In order to successfully start the system, the operating system needs to select an ordering to run the processes. The processes should be ordered in such a way that whenever a process is scheduled all of its dependencies are already met.

The processes can be represented in the form of a graph where the vertices are the processes and the edges represent their dependency relationships.

For example, in the following graph, there are six processes, numbered 1 through 6. Here, arrows are used to represent dependencies. For example, process 3 depends on both processes 2 and 5.

From the above example, we get the order: P6 ➔ P4 ➔ P1 ➔ P5 ➔ P2 ➔ P3. Another possible ordering of the above processes can be P6 ➔ P4 ➔ P5 ➔ P1 ➔ P2 ➔ P3. This order of graph vertices can be obtained through topological sort.

Find dictionary

We use a network protocol that encrypts all application messages using a proprietary scheme. The encryption scheme has a unique property that the sequence of encrypted messages in a session appears to be in sorted order according to a secret dictionary. However, the dictionary is not transmitted for security purposes.

Before the sender starts transmitting actual messages, it sends several encrypted training messages to the receiver. The sender guarantees that the training messages will follow the lexicographic order according to the unknown dictionary.

The receiver must reverse engineer the training messages and reconstruct the dictionary for future communication with the sender. If the order of the messages is invalid, the receiver generates an empty dictionary and asks the sender to retransmit the training messages.

For simplicity’s sake, we can assume that the encrypted contents of the messages only consist of English lowercase letters.

We can essentially map this problem to a graph problem, where we:

  • Extract the necessary information to identify the dependency rules from the messages.
  • With the gathered information, we can put these dependency rules into a directed graph with the letters as nodes and the dependencies (order) as the edges.
  • Lastly, we can sort the graph nodes topologically to generate the letter ordering (dictionary).

Consider the following illustration for first letter ordering:

We can assume the first letters in the messages are in alphabetical order.

Looking at the letters above, we know the relative order of these letters but do not know how these letters fit in with the rest of the letters. To get more information, we’ll need to look further into our English dictionary analogy. The word dirt comes before dorm. This is because we look at the second letter when the first letter is the same. In this case, i comes before o in the alphabet.

We can apply the same logic to our encrypted messages and look at the first two messages, mzsor and mqov. As the first letter is the same in both the messages, we look at the second letter. The first message has z, and the other second one has q. Therefore, we can safely say that z comes before q. We now have two fragments of the letter-order:

["m", "x", "s", "r"], ["z", "q"]

Let’s have a look at all the relations we can extract by comparing adjacent messages:

g cluster array messages[i] mzosr mqov xxsvq xazv xazau xaqu suvzu suvxq rom rwx messages[i+1] mqov xxsvq xazv xazau xaqu suvzu suvxq rom rwx rwv array3 Inferred rule z->q m->x x->a v->a z->q x->s z->x s->r o->w x->v

The above rules can be converted into a graph as follows:

Dry-run templates

The following is the implementation of the solution provided for the Topological Sort (medium) problem:

Press + to interact
Python 3.8
from collections import deque
def topological_sort(vertices, edges):
sortedOrder = []
if vertices <= 0:
return sortedOrder
# a. Initialize the graph
inDegree = {i: 0 for i in range(vertices)} # count of incoming edges
graph = {i: [] for i in range(vertices)} # adjacency list graph
# b. Build the graph
for edge in edges:
parent, child = edge[0], edge[1]
graph[parent].append(child) # put the child into it's parent's list
inDegree[child] += 1 # increment child's inDegree
# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)
# d. For each source, add it to the sortedOrder and subtract one from all of its children's in-degrees
# if a child's in-degree becomes zero, add it to the sources queue
while sources:
vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)
# topological sort is not possible as the graph has a cycle
if len(sortedOrder) != vertices:
return []
return sortedOrder
def main():
print("Topological sort: " +
str(topological_sort(4, [[3, 2], [3, 0], [2, 0], [2, 1]])))
print("Topological sort: " +
str(topological_sort(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]])))
print("Topological sort: " +
str(topological_sort(7, [[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]])))
main()

Sample input #1

Input: Vertices=4, Edges=[3, 2], [3, 0], [2, 0], [2, 1]

Note: The column Outer loops represents the loops in our algorithm and their iteration number. For example, column entry while: 2 represents the second iteration of the while loop.

Since, there are 3 loops, the column Line number specifies the loop we’re considering. The column Inner loop iteration represents the nested for loop (line 30 - line 33).

Line number

Outer loops

Inner loop iteration

graph

parent

child

inDegree[child]

inDegree

sources

vertex

sortedOrder

5-11



{0: [], 1: [], 2: [], 3: []}




{0: 0, 1: 0, 2: 0, 3: 0}



[ ]

14-17

for: 1


{0: [], 1: [], 2: [], 3: [2]}

3

2


{0: 0, 1: 0, 2: 1, 3: 0}



[ ]

14-17

for: 2


{0: [], 1: [], 2: [], 3: [2, 0]}

3

0


{0: 1, 1: 0, 2: 1, 3: 0}



[ ]

14-17

for: 3


{0: [], 1: [], 2: [0], 3: [2, 0]}

2

0


{0: 2, 1: 0, 2: 1, 3: 0}



[ ]

14-17

for: 4


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}

2

1


{0: 2, 1: 1, 2: 1, 3: 0}



[ ]

21-23

for: 1


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 2, 1: 1, 2: 1, 3: 0}

[ ]


[ ]

21-23

for: 2


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 2, 1: 1, 2: 1, 3: 0}

[ ]


[ ]

21-23

for: 3


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 2, 1: 1, 2: 1, 3: 0}

[ ]


[ ]

21-23

for: 4


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 2, 1: 1, 2: 1, 3: 0}

[3]


[ ]

27-33

while: 1

1

{0: [], 1: [], 2: [0, 1], 3: [2, 0]}


2

0

{0: 2, 1: 1, 2: 0, 3: 0}

[2]

3

[3]

27-33

while: 1

2

{0: [], 1: [], 2: [0, 1], 3: [2, 0]}


0

1

{0: 1, 1: 1, 2: 0, 3: 0}

[2]

3

[3]

27-33

while: 2

1

{0: [], 1: [], 2: [0, 1], 3: [2, 0]}


0

0

{0: 0, 1: 1, 2: 0, 3: 0}

[0]

2

[3, 2]

27-33

while: 2

2

{0: [], 1: [], 2: [0, 1], 3: [2, 0]}


1

0

{0: 0, 1: 0, 2: 0, 3: 0}

[0, 1]

2

[3, 2]

27-33

while: 3


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 0, 1: 0, 2: 0, 3: 0}

[1]

0

[3, 2, 0]

27-33

while: 4


{0: [], 1: [], 2: [0, 1], 3: [2, 0]}




{0: 0, 1: 0, 2: 0, 3: 0}

[ ]

1

[3, 2, 0, 1]

Sample input #2

Students may be encouraged to fill the dry-run table with the following input:

Input: Vertices=5, Edges=[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]

Line number

Outer loops

Inner loop iteration

graph

parent

child

inDegree[child]

inDegree

sources

vertex

sortedOrder

5-11



{0: [], 1: [], 2: [], 3: [], 4: []}




{0: 0, 1: 0, 2: 0, 3: 0, 4: 0}



[ ]

14-17

for: 1


{0: [], 1: [], 2: [], 3: [], 4: [2]}

4

2


{0: 0, 1: 0, 2: 1, 3: 0, 4: 0}



[ ]

14-17

for: 2


{0: [], 1: [], 2: [], 3: [], 4: [2, 3]}

4

3


{0: 0, 1: 0, 2: 1, 3: 1, 4: 0}



[ ]

...

...

...

...

...

...

...

...

...

...

...

Step-by-step solution construction

The first step is to initialize the graph with vertices and empty adjacency lists. We also maintain a dictionary inDegree that will store the number of incoming edges for each vertex. However, in this step, we’ll initialize it to 0.

Press + to interact
Python 3.8
from collections import deque
def topological_sort(vertices, edges):
print("Given edges: ", edges, ", Vertices: ", vertices, sep = "")
print("")
sortedOrder = []
if vertices <= 0:
return sortedOrder
# a. Initialize the graph
print("Initializing inDegree")
inDegree = {}
for i in range(vertices):
print("\tLoop iteration: ", i+1, sep = "")
print("\t\tAdding vertex ", i, " to inDegree and initialising it to 0", sep = "")
inDegree[i] = 0
print("\t\tinDegree: ", inDegree, sep = "")
print("")
print("")
print("Initializing graph")
graph = {}
for i in range(vertices):
print("\tLoop iteration: ", i+1, sep = "")
print("\t\tAdding vertex ", i, " to the graph and initialising it with an empty adjacency list", sep = "")
graph[i] = []
print("\t\tgraph: ", graph, sep = "")
print("")
print("")
return sortedOrder
def main():
inputgraphs = [(4, [[3, 2], [3, 0], [2, 0], [2, 1]]),(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]]),(7, [[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]]) ]
for i in inputgraphs:
print("Topological sort: ", topological_sort(i[0], i[1]))
print("-"*100+"\n")
main()

Next, we build the graph. We put the child into its parent’s adjacency list and increment the child’s indegree by 1. This implies that there is an edge going from the parent to the child, that is, parent -> child.

Press + to interact
Python 3.8
from collections import deque
def topological_sort(vertices, edges):
print("Given edges: ", edges, ", Vertices: ", vertices, sep = "")
print("")
sortedOrder = []
if vertices <= 0:
return sortedOrder
# a. Initialize the graph
print("Initializing inDegree")
inDegree = {}
for i in range(vertices):
inDegree[i] = 0
print("\tinDegree: ", inDegree, sep = "")
print("")
print("Initializing graph")
graph = {}
for i in range(vertices):
graph[i] = []
print("\tgraph: ", graph, sep = "")
print("")
print("Building the graph")
k = 1
for edge in edges:
print("\tLoop iteration: ", k, sep ="")
k+=1
parent, child = edge[0], edge[1]
print("\t\tparent: ", parent, ", child: ", child, sep = "")
graph[parent].append(child) # put the child into it's parent's list
print("\t\tAdding child to the parent's list: ", graph, sep = "")
temp = inDegree[child]
inDegree[child] += 1 # increment child's inDegree
print("\t\tIncrementing child's inDegree: ", temp, " + 1 --> ", inDegree[child], sep = "")
print("\t\tinDegree: ", inDegree)
print("")
print("")
return sortedOrder
def main():
inputgraphs = [(4, [[3, 2], [3, 0], [2, 0], [2, 1]]),(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]]),(7, [[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]]) ]
for i in inputgraphs:
print("Topological sort: ", topological_sort(i[0], i[1]))
print("-"*100+"\n")
main()

The next step is to find all the sources in the graph, that is, vertices with inDegree = 0. Since there are no edges going in these vertices, we can assume they occupy the first level of our graph. We’ll store these sources in a dequeue() variable, source.

Press + to interact
Python 3.8
from collections import deque
def topological_sort(vertices, edges):
print("Given edges: ", edges, ", Vertices: ", vertices, sep = "")
print("")
sortedOrder = []
if vertices <= 0:
return sortedOrder
# a. Initialize the graph
print("Initializing inDegree")
inDegree = {}
for i in range(vertices):
inDegree[i] = 0
print("\tinDegree: ", inDegree, sep = "")
print("")
print("Initializing graph")
graph = {}
for i in range(vertices):
graph[i] = []
print("\tgraph: ", graph, sep = "")
print("")
print("Building the graph")
k = 1
for edge in edges:
print("\tLoop iteration: ", k, sep ="")
k+=1
parent, child = edge[0], edge[1]
print("\t\tparent: ", parent, ", child: ", child, sep = "")
graph[parent].append(child) # put the child into it's parent's list
print("\t\tgraph: ", graph, sep = "")
temp = inDegree[child]
inDegree[child] += 1 # increment child's inDegree
print("\t\tinDegree: ", inDegree)
print("")
print("")
print("Finding the sources")
sources = deque()
l = 1
for key in inDegree:
print("\tLoop iteration: ", l, sep = "")
l+=1
if inDegree[key] == 0:
print("\t\tinDegree of vertex ", key, " = ", inDegree[key], " --> vertex is a source", sep = "")
sources.append(key)
print("\t\tsources: ", sources, sep = "")
else:
print("\t\tinDegree of vertex ", key, " = ", inDegree[key], " --> not 0, hence, not a source", sep = "")
print("")
print("")
return sortedOrder
def main():
inputgraphs = [(4, [[3, 2], [3, 0], [2, 0], [2, 1]]),(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]]),(7, [[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]]) ]
for i in inputgraphs:
print("Topological sort: ", topological_sort(i[0], i[1]))
print("-"*100+"\n")
main()

Now that our setup is complete, we’ll get into the actual algorithm to get the topological sorting. The key idea is to re-use the process of identifying sources. If we look at this graph again:

Supposing we were to remove the first set of sources {6} from the graph, the next set of sources would be {4}. If we then removed these from the graph, the next set of sources would be {1, 5}, and so on. Each of these sets is like a layer in the graph that comes into view when the previous layer is removed.

For each source, we’ll add it to an array sortedOrder and, to simulate the idea of removing it from the graph, subtract one from all of its children’s in-degrees. ​​If a child’s in-degree becomes zero, we add it to the sources queue.

The in-degrees are decremented since now the child has one less edge coming into it.

If the length of sortedOrder is not equal to the number of vertices, the graph has a cycle and the program returns an empty array.

Press + to interact
Python 3.8
from collections import deque
import copy
def topological_sort(vertices, edges):
print("Given edges: ", edges, ", Vertices: ", vertices, sep = "")
print("")
sortedOrder = []
if vertices <= 0:
return sortedOrder
# a. Initialize the graph
print("Initializing inDegree")
inDegree = {}
for i in range(vertices):
inDegree[i] = 0
print("\tinDegree: ", inDegree, sep = "")
print("")
print("Initializing graph")
graph = {}
for i in range(vertices):
graph[i] = []
print("\tgraph: ", graph, sep = "")
print("")
print("Building the graph")
k = 1
for edge in edges:
print("\tLoop iteration: ", k, sep ="")
k+=1
parent, child = edge[0], edge[1]
print("\t\tparent: ", parent, ", child: ", child, sep = "")
graph[parent].append(child) # put the child into it's parent's list
print("\t\tgraph: ", graph, sep = "")
temp = inDegree[child]
inDegree[child] += 1 # increment child's inDegree
print("\t\tinDegree: ", inDegree)
print("")
print("")
print("Finding the sources")
sources = deque()
l = 1
for key in inDegree:
print("\tLoop iteration: ", l, sep = "")
l+=1
if inDegree[key] == 0:
print("\t\tinDegree of vertex ", key, " = ", inDegree[key], " --> vertex is a source", sep = "")
sources.append(key)
print("\t\tsources: ", sources, sep = "")
else:
print("\t\tinDegree of vertex ", key, " = ", inDegree[key], " --> not 0, hence, not a source", sep = "")
print("")
print("")
print("Getting the topological ordering")
m = 1
while sources:
print("\tOuter loop iteration: ", m)
m+=1
tempsource = deque()
tempsource = copy.copy(sources)
vertex = sources.popleft()
print("\t\tPopping from sources: ", tempsource, " --> ", sources, sep = "")
print("\t\tvertex: ", vertex, sep = "")
temporder = []
temporder = copy.copy(sortedOrder)
sortedOrder.append(vertex)
print("\t\tAppending to sortedOrder: ", temporder, " --> ", sortedOrder, sep = "")
n = 1
for child in graph[vertex]: # get the node's children to decrement their in-degrees
print("\t\t\tInner loop iteration: ", n)
n+=1
print("\t\t\t\tchild: ", child, sep="")
print("\t\t\t\tDecrementing the indegree of vertex ", child, ": ", inDegree[child], " --> ", inDegree[child] - 1, sep = "")
inDegree[child] -= 1
if inDegree[child] == 0:
print("\t\t\t\tindegree of ", child, " = 0", sep = "")
temps = []
temps = copy.copy(sources)
sources.append(child)
print("\t\t\t\tAppending to sources: ", temps, " --> ", sources, sep = "")
print("")
print("")
# topological sort is not possible as the graph has a cycle
if len(sortedOrder) != vertices:
print("Topological sort is not possible as the graph has a cycle.")
return []
return sortedOrder
def main():
inputgraphs = [(4, [[3, 2], [3, 0], [2, 0], [2, 1]]),(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]]),(7, [[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]]) ]
for i in inputgraphs:
print("Topological sort: ", topological_sort(i[0], i[1]))
print("-"*100+"\n")
main()