...

/

Introduction

Introduction

Real-world problems

The following are a few real-world problems that can be solved using the modified binary search tree pattern.

Order processing milestones

Assume we want to collect stats of the number of orders processed for an e-commerce business. These stats will be presented in a quarterly report using a histogram. To collect data for this presentation, the devs have stored the number of orders for each day rounded down to the nearest significant milestone. For example, when we meet or exceed one million orders, we display “1M+ orders processed.” We continue to display the same message until another milestone is reached. For instance, when nine million orders or more are processed, we display “9M+ orders processed.” Every day, the rounded cumulative number of orders processed is appended to an array.

A modified binary search tree can be used to find the number of days spent jumping from one milestone to the next. We will have an array containing the daily milestones, that resets each quarter, so we start from 0 million.

Let’s say the array is: {0, 1, 1, 2, 2, 2, 3, 4, 4, 4, 5, 5, 6, 7}. This array shows the stats for fourteen days. Now, we will also have a value representing a certain milestone; suppose it is 4. This means that we need to find when the “4M+ orders processed” milestone occurred. The modified binary approach would return {7, 9} which means that we remained at the 4M+ milestone from the 7th day to the 9th day. The first day of the quarter is considered the 0th day.

Kth missing gene

A planet has n genes, numbered 1 to n. Every species on the planet has a subset of these genes in its DNA sequence. For a given species’ DNA, we want to find the kthk^{th} missing gene in a sorted list of genes. The given DNA sequence will be sorted in strictly increasing order.

Since our input is in ascending order, we can use a binary search tree to solve this problem.

Dry-run tables

The following is the implementation of the solution provided for the Ceiling of a Number (medium) problem:

Press + to interact
Python 3.5
def search_ceiling_of_a_number(arr, key):
n = len(arr)
if key > arr[n - 1]: # if the 'key' is bigger than the biggest element
return -1
start, end = 0, n - 1
while start <= end:
mid = start + (end - start) // 2
if key < arr[mid]:
end = mid - 1
elif key > arr[mid]:
start = mid + 1
else: # found the key
return mid
# since the loop is running until 'start <= end', so at the end of the while loop, 'start == end+1'
# we are not able to find the element in the given array, so the next big number will be arr[start]
return start
def main():
print(search_ceiling_of_a_number([4, 6, 10], 6))
print(search_ceiling_of_a_number([1, 3, 8, 10, 15], 12))
print(search_ceiling_of_a_number([4, 6, 10], 17))
print(search_ceiling_of_a_number([4, 6, 10], -1))
main()

Students may be encouraged to run through the provided solution with the following sample inputs:

Sample input #1

Input: [1, 3, 8, 10, 15], key = 12
  • start is updated when our key is greater than arr[mid].
  • end is updated when our key is less than arr[mid].

Line number

Loop iteration

start

end

mid

arr[mid]

6

-

0

4

-

-

7-14

1

3

4

2

8

7-14

2

4

4

3

10

7-14

3

4

3

4

15

18

-

4

-

-

-

Sample input #2

Students may be encouraged to complete the dry-run with this input:

Input: [4, 6, 10], key = -1

Line number

Loop iteration

start

end

mid

arr[mid]

6

-

0

2

-

-

7-14

...

...

...

...

...

Step-by-step solution generation

The first step in our algorithm is to check if our key is bigger than the last (largest) element in our input array. If yes, we return -1.

Since our array is sorted in ascending order, the last element is the largest element in the array.

Press + to interact
Python 3.5
def search_ceiling_of_a_number(arr, key):
n = len(arr)
print("Input array: ", arr, ", key: ", key, sep = "")
print("")
print("Length of the input array: ", n, sep = "")
print("Last element of the input array: ", arr[n-1], sep = "")
if key > arr[n - 1]: # if the 'key' is bigger than the biggest element
print("\t\t", key, " is greater than ", arr[n-1], " ---> returning -1", sep = "")
print("")
return -1
else:
print("\t\t", key, " is not greater than ", arr[n-1], " ---> continue", sep = "")
print("")
def main():
inputArrays = [([4, 6, 10], 6),([1, 3, 8, 10, 15], 12), ([4, 6, 10], 17), ([4, 6, 10], -1)]
for i in inputArrays:
index = search_ceiling_of_a_number(i[0], i[1])
if index != -1 and index is not None:
print("\nThe smallest number in ", i[0], " greater than or equal to ", i[1], " is ", i[0][index],", at index ", index, ".", sep = "")
else:
print("\nThere is no number greater than or equal to", i[1],"in the given array", i[0])
print("-"*100+"\n")
main()

The next step is to find the ceiling of the given key. We’ll iterate over our input array and maintain start, end, and mid indices. Here’s a rundown of what we’ll be doing:

  • Set start as 0, end as the length of the array - 1, and mid as the mid-point between the start and end indices.
  • If key is less than the middle element, update end with mid - 1, since key will be in the first half of the array.
  • If key is greater than the middle element, it implies that the key will be in the second half of the array. Hence, we update start with mid + 1.
  • Repeat the above steps until start <= end.
  • Return mid if the key is found, else return start, which will be pointing to the next number bigger than key.
Press + to interact
Python 3.5
def search_ceiling_of_a_number(arr, key):
n = len(arr)
print("Input array: ", arr, ", key: ", key, sep = "")
print("")
print("Length of the input array: ", n, sep = "")
print("Last element of the input array: ", arr[n-1], sep = "")
if key > arr[n - 1]: # if the 'key' is bigger than the biggest element
print("\t\t", key, " is greater than ", arr[n-1], " ---> returning -1", sep = "")
print("")
return -1
else:
print("\t\t", key, " is not greater than ", arr[n-1], " ---> continue", sep = "")
print("")
start, end = 0, n - 1
print("Start: ", start, ", end: ", end, sep = "")
j = 1
while start <= end:
print("\tLoop iteration: ", j, sep = "")
j+=1
mid = start + (end - start) // 2
print("\t\tmid: (start+end)/2 = ", mid, ", arr[mid]: ", arr[mid], sep = "")
if key < arr[mid]:
print("\t\tkey < arr[mid]: ", key, " < ", arr[mid], " ---> update end to search in first half of range", sep = "")
end = mid - 1
print("\t\tend: ", end, sep = "")
elif key > arr[mid]:
print("\t\tkey > arr[mid]: ", key, " > ", arr[mid], " ---> update start to search in second half of range", sep = "")
start = mid + 1
print("\t\tstart: ", start, sep = "")
else: # found the key
print("\t\tKey found ---> returning mid: ", mid, sep = "")
return mid
# since the loop is running until 'start <= end', so at the end of the while loop, 'start == end+1'
# we are not able to find the element in the given array, so the next big number will be arr[start]
print("key not present in the array ---> the next big element: ", arr[start], sep = "")
print("Returning the index of ", arr[start], ": ", start, sep = "")
return start
def main():
inputArrays = [([4, 6, 10], 6),([1, 3, 8, 10, 15], 12), ([4, 6, 10], 17), ([4, 6, 10], -1)]
for i in inputArrays:
index = search_ceiling_of_a_number(i[0], i[1])
if index != -1:
print("\nThe smallest number in ", i[0], " greater than or equal to ", i[1], " is ", i[0][index],", at index ", index, ".", sep = "")
else:
print("\nThere is no number greater than or equal to", i[1],"in the given array", i[0])
print("-"*100+"\n")
main()