Analyzing Algorithms
Explore how to rigorously analyze algorithms by proving correctness using induction and assessing their efficiency through running time and resource usage. Understand asymptotic analysis and how it applies to different algorithmic problems to improve your problem-solving skills in Python.
It’s not enough just to write down an algorithm and say, “Behold!” We must also convince our audience (and ourselves!) that the algorithm actually does what it’s supposed to do and that it does so efficiently.
Correctness
In some application settings, it is acceptable for programs to behave correctly most of the time, on all “reasonable” inputs. Not in this course: we require algorithms that are always correct, for all possible inputs. Moreover, we must prove that our algorithms are correct; trusting our instincts, or trying a few test cases, isn’t good enough. Sometimes correctness is truly obvious. On the other hand, “obvious” is all too often a synonym for “wrong.” Most of the algorithms we discuss in this course require real work to be proven correct. In particular, correctness proofs usually involve induction. Induction is an incredibly useful method.
Of course, before we can formally prove that our algorithm does what it’s supposed to do, we have to formally describe what it’s supposed to do.
Running time
The most common way of ranking different algorithms for the same problem is by how quickly they run. Ideally, we want the fastest possible algorithm for any particular problem. In many application settings, it is acceptable for programs to run efficiently most of the time on all “reasonable” inputs. Not in this course—we require algorithms that always run efficiently, even in the worst case.
But how do we measure running time? As a specific example, how long does it take to sing the song BottlesOfBeer($n$)? This is obviously a function of the input value $n$, but it also depends on how quickly someone can sing. Some singers might take ten seconds to sing a verse; others might take twenty. Technology widens the possibilities even further. Dictating the song over a telegraph using Morse code might take a full minute per verse. Downloading an mp3 over the Web might take a tenth of a second per verse. Duplicating the mp3 in a computer’s main memory might take only a few microseconds per verse.
What’s important here is how the singing time changes as $n$ grows. Singing BottlesOfBeer($2n$) requires about twice as much time as singing BottlesOfBeer($n$), no matter what technology is being used. This is reflected in the asymptotic singing time $\Theta(n)$.
We can measure time by counting how many times the algorithm executes a certain instruction or reaches a certain milestone in the “code.” For example, we might notice that the word “beer” is sung three times in every verse of “Bottles of Beer,” so the number of times you sing “beer” is a good indication of the total singing time. For this question, we can give an exact answer: BottlesOfBeer($n$) mentions beer exactly $3n + 3$ times.
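The counting argument above can be checked with a minimal Python sketch. The lyrics and the function names `bottles_of_beer` and `count_beer` are assumptions made for illustration: we assume $n$ ordinary verses followed by one closing verse, each mentioning beer three times.

```python
def bottles_of_beer(n):
    """Return the song as a list of lines: n verses plus a closing verse.

    The exact wording is an assumption; only the number of times
    "beer" appears per verse matters for the count.
    """
    lines = []
    for i in range(n, 0, -1):
        lines.append(f"{i} bottles of beer on the wall, {i} bottles of beer,")
        lines.append(f"Take one down, pass it around, {i - 1} bottles of beer on the wall.")
    # Closing verse: three more mentions of "beer".
    lines.append("No bottles of beer on the wall, no bottles of beer,")
    lines.append(f"Go to the store, buy some more, {n} bottles of beer on the wall.")
    return lines


def count_beer(n):
    """Count how many times the word "beer" is sung for input n."""
    return sum(line.count("beer") for line in bottles_of_beer(n))


if __name__ == "__main__":
    for n in (1, 10, 99):
        print(n, count_beer(n))
```

Doubling $n$ roughly doubles the count, which is the behavior the asymptotic bound captures regardless of how fast any one line is sung.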
Incidentally, there are lots of songs with quadratic singing time. This one is probably familiar to most English speakers:
Explanation
- Line 2: We set up a loop that iterates over the range of numbers from n to 2 (inclusive) in reverse order.
- Lines 4–5: We print the gifts for each day of Christmas. The end="" argument to the print function prevents it from printing a newline character after each gift.
- Line 25: We set the variable n to the length of the gifts list minus 1, which is the number of days of Christmas to print.
The input to NDaysOfChristmas is a list of $n - 1$ gifts, represented here as an array. It’s quite easy to show that the singing time is $\Theta(n^2)$; in particular, the singer mentions the name of a gift $\sum_{i=1}^{n} i = n(n+1)/2$ times (counting the partridge in the pear tree). It’s also easy to see that during the first $n$ days of Christmas, “my true love gave to me” exactly $\sum_{i=1}^{n} \sum_{j=1}^{i} j = n(n+1)(n+2)/6 = \Theta(n^3)$ gifts.
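Both counts can be verified with a short Python sketch. This is not the course's implementation: the function name `n_days_of_christmas` and the lyric wording are hypothetical, and for simplicity we assume the partridge is stored as the first entry of the list, so the list has $n$ entries instead of $n - 1$.

```python
def n_days_of_christmas(gifts):
    """Sing the song for len(gifts) days and tally two quantities.

    Assumption: gifts[j - 1] is the gift first given on day j, so
    gifts[0] is the partridge. Returns (lines, mentions, total_gifts).
    """
    n = len(gifts)
    lines = []
    mentions = 0      # how many times a gift name is sung
    total_gifts = 0   # how many individual gifts are given
    for i in range(1, n + 1):
        lines.append(f"On day {i} of Christmas, my true love gave to me")
        for j in range(i, 0, -1):
            lines.append(f"{j} {gifts[j - 1]}")
            mentions += 1
            total_gifts += j  # day j's gift comes in a batch of j
    return lines, mentions, total_gifts


if __name__ == "__main__":
    gifts = [f"gift-{j}" for j in range(1, 13)]  # placeholder names
    _, mentions, total = n_days_of_christmas(gifts)
    print(mentions, total)  # 78 364
```

For $n = 12$ the two formulas give $12 \cdot 13 / 2 = 78$ mentions and $12 \cdot 13 \cdot 14 / 6 = 364$ gifts, matching the tallies from the nested loops.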
Other quadratic-time songs include “Old MacDonald Had a Farm,” “There Was an Old Lady Who Swallowed a Fly,” “Hole in the Bottom of the Sea,” “Green Grow the Rushes O,” “The Rattlin’ Bog,” “The Court Of King Caractacus,” “The Barley-Mow,” “If I Were Not Upon the Stage,” “Star Trekkin’,” “Ist das nicht ein Schnitzelbank?”, “Il Pulcino Pio,” “Minkurinn í hænsnakofanum,” “Echad Mi Yodea,” and “ To .” For more examples, consult your favorite preschooler.
Explanation
- Line 1: We define a function called Alouette that takes in two parameters: lapart, which is a list of strings representing the different parts of a bird’s body, and n, which is the length of the lapart list.
- Lines 3–7: We loop through the lapart list and print out each line of the song. For each part of the bird’s body in lapart, the code prints out “Je te plumerai [part of the body].” Then, it loops backwards through the lapart list (starting from the current index i and going down to 0) and prints out "Et [part of the body] !" for each part of the bird’s body up to and including the current one. Finally, it prints out "Alouette! Alouette! Aaaaaa..." at the end of each verse.
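The verse structure described above can be sketched as follows. This is a hypothetical reconstruction rather than the course's code: it collects the lines in a list instead of printing them, and it takes the length from the list itself instead of a separate parameter n.

```python
def alouette(lapart):
    """Return the lines of the song for the given list of body parts.

    Sketch based on the explanation above: one "Je te plumerai" line,
    then an "Et ..." line for every part from the current index down
    to 0, then the closing "Alouette!" line of the verse.
    """
    lines = []
    for i in range(len(lapart)):
        lines.append(f"Je te plumerai {lapart[i]}.")
        for j in range(i, -1, -1):  # current part first, then earlier ones
            lines.append(f"Et {lapart[j]} !")
        lines.append("Alouette! Alouette! Aaaaaa...")
    return lines


if __name__ == "__main__":
    for line in alouette(["la tête", "le bec", "le cou"]):
        print(line)
```

Verse i contributes i "Et" lines (counting from 1), so n parts produce $n(n+1)/2$ such lines in total, which is why the singing time is quadratic.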
Analysis of algorithmic running times
A few songs have even more bizarre singing times. A fairly modern example is “The Telnet Song” by Guy Steele, which actually takes $\Theta(2^n)$ time to sing the first $n$ verses; Steele recommended $n = 4$. Finally, there are some songs that never end.
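Except for “The Telnet Song,” all of these songs are most naturally expressed as a small set of nested loops, so their singing times can be computed using nested summations. The running time of a recursive algorithm is more easily expressed as a recurrence. For example, the peasant multiplication algorithm can be expressed recursively as follows:

$$
x \cdot y =
\begin{cases}
0 & \text{if } x = 0 \\
\lfloor x/2 \rfloor \cdot (y + y) & \text{if } x \text{ is even} \\
\lfloor x/2 \rfloor \cdot (y + y) + y & \text{if } x \text{ is odd}
\end{cases}
$$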
Let $T(x, y)$ denote the number of parity, addition, and mediation operations required to compute $x \cdot y$. This function satisfies the recursive inequality $T(x, y) \le T(\lfloor x/2 \rfloor, 2y) + 2$ with base case $T(0, y) = 0$. Techniques described in the next chapter imply the upper bound $T(x, y) = O(\log x)$.
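To see the logarithmic bound concretely, here is a minimal Python sketch of recursive peasant multiplication for non-negative integers. The function name `peasant_multiply` is an assumption, and instead of tallying individual operations it counts recursive calls, which is enough because each call performs only a constant number of parity, addition, and mediation operations.

```python
def peasant_multiply(x, y):
    """Return (x * y, calls) for non-negative integers x and y.

    Each level halves x (mediation), doubles y (an addition), and adds
    y once more when x is odd (parity check plus addition).
    """
    if x == 0:
        return 0, 1
    product, calls = peasant_multiply(x // 2, y + y)
    if x % 2 == 1:
        product += y
    return product, calls + 1


if __name__ == "__main__":
    for x in (1, 7, 123, 10**6):
        product, calls = peasant_multiply(x, 456)
        print(x, product, calls, x.bit_length())
```

The number of calls is always one more than the number of bits in x, so the recursion depth grows like $\log_2 x$ and does not depend on y at all.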
Sometimes, the running time of an algorithm depends on a particular implementation of some underlying data structure or subroutine. For example, the Huntington-Hill apportionment algorithm ApportionCongress runs in $O(N + RI + (R - n)E)$ time, where $N$ denotes the running time of NewPriorityQueue, $I$ denotes the running time of Insert, and $E$ denotes the running time of ExtractMax. Under the reasonable assumption that $R \ge 2n$ (on average, each state gets at least two representatives), we can simplify this bound to $O(N + R(I + E))$. The precise running time depends on the implementation of the underlying priority queue. The Census Bureau implements the priority queue as an unsorted array, which gives us $N = I = \Theta(1)$ and $E = \Theta(n)$, so the Census Bureau’s implementation of ApportionCongress runs in $O(Rn)$ time. However, if we implement the priority queue as a binary heap or a heap-ordered array, we have $N = \Theta(1)$ and $I = E = O(\log n)$, so the overall algorithm runs in $O(R \log n)$ time.
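The effect of the priority queue choice can be illustrated with a rough Python sketch. Everything here is an assumption made for illustration: the function names, the sample populations, and the use of the standard Huntington-Hill priority, population divided by $\sqrt{r(r+1)}$ for a state that currently holds $r$ seats. Neither version is the Census Bureau's actual code.

```python
import heapq
from math import sqrt


def apportion_heap(pop, R):
    """Apportion R seats with a binary heap: Insert and ExtractMax in O(log n)."""
    n = len(pop)
    rep = [1] * n  # every state starts with one seat
    # heapq is a min-heap, so negated priorities make pop() return the max.
    heap = [(-p / sqrt(2), s) for s, p in enumerate(pop)]
    heapq.heapify(heap)
    for _ in range(R - n):
        _, s = heapq.heappop(heap)                       # ExtractMax
        rep[s] += 1
        priority = pop[s] / sqrt(rep[s] * (rep[s] + 1))
        heapq.heappush(heap, (-priority, s))             # Insert
    return rep


def apportion_unsorted(pop, R):
    """Same algorithm with an unsorted list: O(1) Insert, linear-time ExtractMax."""
    n = len(pop)
    rep = [1] * n
    queue = [(p / sqrt(2), s) for s, p in enumerate(pop)]
    for _ in range(R - n):
        k = max(range(len(queue)), key=lambda j: queue[j][0])  # linear scan
        _, s = queue.pop(k)                                    # ExtractMax
        rep[s] += 1
        queue.append((pop[s] / sqrt(rep[s] * (rep[s] + 1)), s))  # Insert
    return rep


if __name__ == "__main__":
    populations = [1000, 2500, 4100, 700]  # made-up example data
    print(apportion_heap(populations, 10))
    print(apportion_unsorted(populations, 10))
```

Both functions perform the same R - n extractions and insertions; only the cost of each queue operation differs, which is exactly where the $O(Rn)$ versus $O(R \log n)$ gap comes from.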
Finally, sometimes we are interested in computational resources other than time, such as space, the number of coin flips, the number of cache or page faults, the number of inter-process messages, or “the number of gifts my true love gave to me.” These resources can be analyzed using the same techniques used to analyze running time. For example, lattice multiplication of two $n$-digit numbers requires $O(n^2)$ space if we write down all the partial products before adding them but only $O(n)$ space if we add them on the fly.
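The space trade-off in lattice multiplication can be sketched in Python as follows. The helper and function names are hypothetical, digits are stored least significant first, and "space" is measured simply as the number of digit cells each variant keeps around.

```python
def to_digits(x):
    """Decimal digits of a non-negative integer, least significant first."""
    return [int(c) for c in reversed(str(x))]


def from_digits(digits):
    """Inverse of to_digits."""
    return int("".join(str(d) for d in reversed(digits)))


def multiply_storing_grid(x, y):
    """Write down every partial product first: one cell per digit pair."""
    xd, yd = to_digits(x), to_digits(y)
    grid = [[a * b for b in yd] for a in xd]
    total = sum(cell * 10 ** (i + j)
                for i, row in enumerate(grid)
                for j, cell in enumerate(row))
    return total, len(xd) * len(yd)


def multiply_on_the_fly(x, y):
    """Add each partial product into a running result as it is computed."""
    xd, yd = to_digits(x), to_digits(y)
    result = [0] * (len(xd) + len(yd))
    for i, a in enumerate(xd):
        carry = 0
        for j, b in enumerate(yd):
            cur = result[i + j] + a * b + carry
            result[i + j] = cur % 10
            carry = cur // 10
        result[i + len(yd)] += carry
    return from_digits(result), len(result)


if __name__ == "__main__":
    print(multiply_storing_grid(934, 314))
    print(multiply_on_the_fly(934, 314))
```

For two $n$-digit inputs the first variant stores $n^2$ cells while the second keeps only $2n$, even though both perform the same digit-by-digit multiplications.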