Search⌘ K
AI Features

Analyzing Algorithms

Explore how to rigorously analyze algorithms by proving correctness using induction and assessing their efficiency through running time and resource usage. Understand asymptotic analysis and how it applies to different algorithmic problems to improve your problem-solving skills in Python.

It’s not enough just to write down an algorithm and say, “Behold!” We must also convince our audience (and ourselves!) that the algorithm actually does what it’s supposed to do and that it does so efficiently.

Correctness

In some application settings, it is acceptable for programs to behave correctly most of the time, on all “reasonable” inputs. Not in this course: we require algorithms that are always correct, for all possible inputs. Moreover, we must prove that our algorithms are correct; trusting our instincts, or trying a few test cases, isn’t good enough. Sometimes correctness is truly obvious. On the other hand, “obvious” is all too often a synonym for “wrong.” Most of the algorithms we discuss in this course require real work to be proven correct. In particular, correctness proofs usually involve induction. Induction is an incredibly useful method.

Of course, before we can formally prove that our algorithm does what it’s supposed to do, we have to formally describe what it’s supposed to do.

Running time

The most common way of ranking different algorithms for the same problem is by how quickly they run. Ideally, we want the fastest possible algorithm for any particular problem. In many application settings, it is acceptable for programs to run efficiently most of the time on all “reasonable” inputs. Not in this course—we require algorithms that always run efficiently, even in the worst case.

But how do we measure running time? As a specific example, how long does it take to sing the song BottlesOfBeer(n)BottlesOfBeer(n)? This is obviously a function of the input value nn, but it also depends on how quickly someone can sing. Some singers might take ten seconds to sing a verse; others might take twenty. Technology widens the possibilities even further. Dictating the song over a telegraph using Morse code might take a full minute per verse. Downloading an mp3 over the Web might take a tenth of a second per verse. Duplicating the mp3 in a computer’s main memory might take only a few microseconds per verse.

What’s important here is how the singing time changes as nn grows. Singing BottlesOfBeer(2n)BottlesOfBeer(2n) requires about twice as much time as singing BottlesOfBeer(n)BottlesOfBeer(n), no matter what technology is being used. This is reflected in the asymptotic singing time Θ(n)Θ(n).

We can measure time by counting how many times the algorithm executes a certain instruction or reaches a certain milestone in the “code.” For example, we might notice that the word “beer” is sung three times in every verse of “Bottles of Beer,” so the number of times you sing “beer” is a good indication of the total singing time. For this question, we can give an exact answer: BottlesOfBeer(n)BottlesOfBeer(n) mentions beer exactly 3n+33n + 3 times.

Incidentally, there are lots of songs with quadratic singing time. This one is probably familiar to most English speakers:

Algorithm


NDaysOfChristmas(gifts[2..n]):for inSing “On the ith day of Christmas, my true love gave to mefor ji down to 2Sing “j gifts[j],if i>1Sing “andSing “a partridge in a pear tree.\underline{NDaysOfChristmas(gifts[2 .. n]):} \\ \hspace{0.4cm} for\space i←n \\ \hspace{1cm} Sing\space “On\space the\space ith\space day\space of\space Christmas,\space my\space true\space love\space gave\space to\space me” \\ \hspace{1cm} for\space j ← i\space down\space to\space 2 \\ \hspace{1.7cm} Sing\space “j\space gifts[j],” \\ \hspace{1cm} if\space i > 1 \\ \hspace{1.7cm} Sing\space “and” \\ \hspace{1cm} Sing\space “a\space partridge\space in\space a\space pear\space tree.”


Implementation

Python 3.8
def NDaysOfChristmas(gifts, n):
for i in range(n, 1, -1):
print(f"On the {i}th day of Christmas, my true love gave to me")
for j in range(i, 2, -1):
print(f"{j} {gifts[j]}, ", end="")
if i > 1:
print("and ")
print("a partridge in a pear tree.")
gifts = [
"",
"two turtle doves",
"three french hens",
"four calling birds",
"five golden rings",
"six geese a laying",
"seven swans a swimming",
"eight maids a milking",
"nine ladies dancing",
"ten lords a leaping",
"eleven pipers piping",
"twelve drummers drumming",
]
n = len(gifts) - 1
NDaysOfChristmas(gifts, n)

Explanation

  • Line 2: We set up a loop that will iterate over the range of numbers from n to 2 (inclusive) in reverse order.

  • Lines 4–5: We print the gifts for each day of Christmas. The end="" argument to the print function is used to prevent it from printing a newline character after each gift.

  • Line 25: We set the variable n to the length of the gifts list minus 1, which is the number of days of Christmas to print.

The input to NDaysOfChristmasNDaysOfChristmas is a list of n1n − 1 gifts, represented here as an array. It’s quite easy to show that the singing time is Θ(n2Θ(n^2); in particular, the singer mentions the name of a gift i=1ni=n(n+1)/2\sum_{i=1}^{n} i = n(n + 1)/2 times (counting the partridge in the pear tree). It’s also easy to see that during the first n days of Christmas, “my true love gave to me exactly i=1nj=1ij=n(n+1)(n+2)/6=Θ(n3)\sum_{i=1}^{n} \sum_{j=1}^{i} j = n(n + 1)(n + 2)/6 = Θ(n^3) gifts.”

Other quadratic-time songs include “Old MacDonald Had a Farm,” “There Was an Old Lady Who Swallowed a Fly,” “Hole in the Bottom of the Sea,” “Green Grow the Rushes O,” “The Rattlin’ Bog,” “The Court Of King Caractacus,” “The Barley-Mow,” “If I Were Not Upon the Stage,” “Star Trekkin’,” “Ist das nicht ein Schnitzelbank?”,22“Il Pulcino Pio,” “Minkurinn í hænsnakofanum,” “Echad Mi Yodea,” and “ To κoκoϱˊακι\kappa o\kappa o\varrho \acute{}\alpha \kappa \iota.” For more examples, consult your favorite preschooler.

Algorithm


Alouette(lapart[1..n]):Chantez « Alouette, gentille alouette, alouette, je te plumerai. »pour tout i de 1 aˋ nChantez « Je te plumerai lapart[i]. Je te plumerai lapart[i]. »pour tout j de i aˋ 1〈〈aˋrebours〉〉Chantez « Et lapart[j]! Et lapart[j]! »Chantez « Alouette! Alouette! Aaaaaa... »Chantez « ...alouette, gentille allouette, alouette, je te plumerai. »\underline{Alouette(lapart[1 .. n]):} \\ \hspace{0.4cm} \text{Chantez}\space «\space Alouette,\space gentille\space alouette,\space alouette,\space je\space te\space plumerai.\space » \\ \hspace{0.45cm} \text{pour tout i de 1 à n} \\ \hspace{1cm} \text{Chantez } «\space Je\space te\space plumerai\space lapart[i].\space Je\space te\space plumerai\space lapart[i].\space » \\ \hspace{1cm} \text{pour tout j de i à 1} 〈〈à rebours〉〉 \\ \hspace{1.7cm} \text{Chantez } «\space Et\space lapart[ j] !\space Et\space lapart[ j] !\space » \\ \hspace{1cm} \text{Chantez } «\space Alouette!\space Alouette!\space Aaaaaa. . . \space» \\ \hspace{1cm} \text{Chantez } «\space . . . alouette,\space gentille\space allouette,\space alouette,\space je\space te\space plumerai.\space »


Implementation

Python 3.8
def Alouette(lapart, n):
print('Chantez "Alouette, gentille alouette, alouette, je te plumerai."')
for i in range(n):
print(f"Je te plumerai {lapart[i]}.")
for j in range(i, -1, -1):
print(f"Et {lapart[j]} ! ", end="")
print("\nAlouette! Alouette! Aaaaaa. . . ")
print("Alouette, gentille alouette, alouette, je te plumerai.")
lapart = [
"la tête",
"le bec",
"les yeux",
"le cou",
"les ailes",
"les pattes",
"la queue",
"le dos",
]
n = len(lapart)
Alouette(lapart, n)

Explanation

  • Line 1: We define a function called Alouette that takes in two parameters: lapart, which is a list of strings representing the different parts of a bird’s body, and n, which is the length of the lapart list.

  • Lines 3–7: We loop through the lapart list and print out each line of the song. For each part of the bird’s body in lapart, the code prints out “Je te plumerai [part of the body].” Then, it loops backwards through the lapart list (starting from the current index i and going down to 0) and prints out "Et [part of the body] ! " for each previous part of the bird’s body. Finally, it prints out "Alouette! Alouette! Aaaaaa. . . " at the end of each verse.

Analysis of algorithmic running times

A few songs have even more bizarre singing times. A fairly modern example is “The Telnet Song” by Guy Steele, which actually takes Θ(2n)Θ(2n) time to sing the first nn verses; Steele recommended n=4n = 4. Finally, there are some songs that never end.

Except for “The Telnet Song”, all of these songs are most naturally expressed as a small set of nested loops, so their running singing times can be computed using nested summations. The running time of a recursive algorithm is more easily expressed as a recurrence. For example, the peasant multiplication algorithm can be expressed recursively as follows:

x.y={0 if x=0[x/2].(y+y) if x is even[x/2].(y+y)+y if x is odd x.y = \begin{cases} 0& \text{ if } x=0 \\ [x/2].(y+y)& \text{ if $x$ is even} \\ [x/2].(y+y)+y& \text{ if $x$ is odd } \end{cases}

Let T(x,y)T(x, y) denote the number of parity, addition, and mediation operations required to compute xyx · y. This function satisfies the recursive inequality T(x,y)T(x/2,2y)+2T(x,y)≤T(⌊x/2⌋,2y)+2 with base case T(0,y)=0T(0,y)=0. Techniques described in the next chapter imply the upper bound T(x,y)=O(logx)T(x, y) = O(\log x).

Sometimes, the running time of an algorithm depends on a particular implementation of some underlying data structure of subroutine. For example, the Huntington-Hill apportionment algorithm ApportionCongress runs in O(N+RI+(Rn)E)O(N + RI + (R − n)E) time, where NN denotes the running time of NewPriority- Queue, II denotes the running time of Insert, and EE denotes the running time of ExtractMax. Under the reasonable assumption that R2nR ≥ 2n (on average, each state gets at least two representatives), we can simplify this bound to O(N+R(I+E))O(N + R(I + E)). The precise running time depends on the implementation of the underlying priority queue. The Census Bureau implements the priority queue as an unsorted array, which gives us N=I=Θ(1)N = I = Θ(1) and E=Θ(n)E = Θ(n), so the Census Bureau’s implementation of ApportionCongress runs in O(Rn)O(Rn) time. However, if we implement the priority queue as a binary heap or a heap-ordered array, we have N=Θ(1)N = Θ(1) and I=E=O(logn)I = E = O(\log n), so the overall algorithm runs in O(Rlogn)O(R\log n) time.

Finally, sometimes we are interested in computational resources other than time, such as space, the number of coin flips, the number of cache or page faults, the number of inter-process messages, or “the number of gifts my true love gave to me.” These resources can be analyzed using the same techniques used to analyze running time. For example, lattice multiplication of two nn-digit numbers requires O(n2)O(n^2) space if we write down all the partial products before adding them but only O(n)O(n) space if we add them on the fly.