Array Implementation of Stack
Explore the implementation of stacks using arrays in Python, focusing on the ArrayStack data structure. Understand core operations such as get, set, add, and remove, along with the amortized resizing technique that ensures efficient performance over multiple operations.
ArrayStack overview
The List and Queue interfaces can be implemented so that the underlying data is stored in an array, called the backing array. The following table summarizes the running times of the ArrayStack operations (where n is the number of stored elements):

| Operation | Running time |
| --- | --- |
| get(i), set(i, x) | O(1) |
| add(i, x), remove(i) | O(1 + n - i) |
Advantages and limitations
Data structures that work by storing data in a single array have many advantages and limitations in common:
- Arrays offer constant-time access to any value in the array. This is what allows get(i) and set(i, x) to run in constant time.
- Arrays are not very dynamic. Adding or removing an element near the middle of a list means that a large number of elements in the array need to be shifted to make room for the newly added element or to fill in the gap created by the deleted element. This is why the operations add(i, x) and remove(i) have running times that depend on n and i.
- Arrays cannot expand or shrink. When the number of elements in the data structure exceeds the size of the backing array, a new array needs to be allocated and the data from the old array needs to be copied into the new array. This is an expensive operation.
The third point is important. The running times cited in the table above do not include the cost associated with growing and shrinking the backing array. We will see that, if carefully managed, the cost of growing and shrinking the backing array does not add much to the cost of an average operation. More precisely, if we start with an empty data structure and perform any sequence of m add(i, x) or remove(i) operations, then the total cost of growing and shrinking the backing array, over the entire sequence of m operations, is O(m). Although some individual operations are more expensive, the amortized cost, when amortized over all m operations, is only O(1) per operation.
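To see this claim in action before proving it, here is a small standalone experiment (not the ArrayStack class itself) that simulates a doubling backing array and counts how many element copies resizing causes over m appends; the function name is ours:

```python
# Simulate a doubling backing array and count the element copies caused by
# resizing, to illustrate that the total copying cost over m appends is O(m).
def copies_for_appends(m):
    capacity = 1   # current backing-array size
    n = 0          # elements currently stored
    copies = 0     # elements moved by all resize() calls so far
    for _ in range(m):
        if n == capacity:      # array full: grow to twice the size
            copies += n        # every stored element is copied over
            capacity *= 2
        n += 1
    return copies

for m in (10, 1000, 100_000):
    c = copies_for_appends(m)
    print(m, c, c / m)  # the ratio stays below 2, i.e. O(1) amortized
```

The ratio copies/m never exceeds 2, which matches the O(m) total cost claimed above.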
ArrayStack: fast stack operations using an array
An ArrayStack implements the list interface using an array a, called the backing array. The list element with index i is stored in a[i]. Most of the time, a is larger than strictly necessary, so an integer n is used to keep track of the number of elements actually stored in a. In this way, the list elements are stored in a[0], ..., a[n-1] and, at all times, a.length >= n.
The basics
Accessing and modifying the elements of an ArrayStack using get(i) and set(i, x) is trivial. After performing any necessary bounds-checking, we simply return or set, respectively, a[i].
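The basics described above can be sketched as a small Python class. The fields a and n follow the text; the new_array helper is an assumption standing in for raw array allocation:

```python
# Minimal sketch of the ArrayStack basics.  new_array is a hypothetical
# helper: it allocates a fixed-length Python list filled with None to play
# the role of a raw backing array.
def new_array(length):
    return [None] * max(length, 1)

class ArrayStack:
    def __init__(self):
        self.a = new_array(1)   # backing array
        self.n = 0              # number of elements actually stored

    def size(self):
        return self.n

    def get(self, i):
        if not 0 <= i < self.n:   # bounds check
            raise IndexError(i)
        return self.a[i]

    def set(self, i, x):
        if not 0 <= i < self.n:   # bounds check
            raise IndexError(i)
        y = self.a[i]
        self.a[i] = x
        return y                  # return the value that was replaced
```

Both operations touch a single array slot after the bounds check, which is why they run in constant time.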
The operations of adding and removing elements from an ArrayStack are illustrated in the figure below.

Figure: A sequence of add(i, x) and remove(i) operations on an ArrayStack. Arrows denote elements being copied. Operations that result in a call to resize() are marked with an asterisk.
To implement the add(i, x) operation, we first check if a is already full. If so, we call the method resize() to increase the size of a. How resize() is implemented will be discussed later. For now, it is sufficient to know that, after a call to resize(), we can be sure that a.length > n. With this out of the way, we now shift the elements a[i], ..., a[n-1] right by one position to make room for x, set a[i] equal to x, and increment n.
If we ignore the cost of the potential call to resize(), then the cost of the add(i, x) operation is proportional to the number of elements we have to shift to make room for x. Therefore the cost of this operation (ignoring the cost of resizing a) is O(1 + n - i).
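The steps just described can be sketched as a method; this assumes the fields a and n and the resize() method discussed later in this lesson:

```python
# Sketch of add(i, x); assumes self.a (backing array), self.n (element
# count), and self.resize() as described in the text.
def add(self, i, x):
    if not 0 <= i <= self.n:      # i == n appends at the end
        raise IndexError(i)
    if self.n == len(self.a):     # backing array full: grow it first
        self.resize()
    # Shift a[i], ..., a[n-1] right by one position to open slot i.
    for j in range(self.n, i, -1):
        self.a[j] = self.a[j - 1]
    self.a[i] = x
    self.n += 1
```

The shifting loop runs n - i times, which is where the O(1 + n - i) bound comes from.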
Implementing the remove(i) operation is similar. We shift the elements a[i+1], ..., a[n-1] left by one position (overwriting a[i]) and decrease the value of n. After doing this, we check if n is getting much smaller than a.length by checking if a.length >= 3n. If so, then we call resize() to reduce the size of a.
If we ignore the cost of the resize() method, the cost of a remove(i) operation is proportional to the number of elements we shift, which is O(1 + n - i).
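As with add(i, x), remove(i) can be sketched under the same assumptions about the fields a and n and the resize() method:

```python
# Sketch of remove(i); assumes self.a, self.n, and self.resize() as in the
# text.  Shrinks the backing array when it is less than one-third full.
def remove(self, i):
    if not 0 <= i < self.n:       # bounds check
        raise IndexError(i)
    x = self.a[i]
    # Shift a[i+1], ..., a[n-1] left by one position, overwriting a[i].
    for j in range(i, self.n - 1):
        self.a[j] = self.a[j + 1]
    self.n -= 1
    if len(self.a) >= 3 * self.n:  # array mostly empty: shrink it
        self.resize()
    return x
```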
Growing and shrinking
The resize() method is fairly straightforward; it allocates a new array b whose size is 2n, copies the n elements of a into the first n positions in b, and then sets a to b. Thus, after a call to resize(), a.length = 2n.
Analyzing the actual cost of the resize() operation is easy. It allocates an array b of size 2n and copies the n elements of a into b. This takes O(n) time.
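A sketch of resize() under the same assumptions; allocating at least one slot covers the special case n = 0:

```python
# Sketch of resize(); allocates a new backing array of size 2n (at least 1,
# to cover the special case n == 0) and copies the n stored elements into it.
def resize(self):
    b = [None] * max(1, 2 * self.n)
    for j in range(self.n):   # copy the n stored elements: O(n) time
        b[j] = self.a[j]
    self.a = b
```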
The running time analysis from the previous section ignored the cost of calls to resize(). In this section we analyze this cost using a technique known as amortized analysis. This technique does not try to determine the cost of resizing during each individual add(i, x) and remove(i) operation. Instead, it considers the cost of all calls to resize() during a sequence of m calls to add(i, x) or remove(i). In particular, we will show:
Lemma: If an empty ArrayStack is created and any sequence of m calls to add(i, x) and remove(i) are performed, then the total time spent during all calls to resize() is O(m).
Proof: We will show that any time resize() is called, the number of calls to add or remove since the last call to resize() is at least n/2 - 1. Therefore, if n_i denotes the value of n during the i-th call to resize() and r denotes the number of calls to resize(), then the total number of calls to add(i, x) or remove(i) is at least

(n_1/2 - 1) + (n_2/2 - 1) + ... + (n_r/2 - 1) <= m,

which is equivalent to

n_1 + n_2 + ... + n_r <= 2m + 2r.

On the other hand, the total time spent during all calls to resize() is

O(n_1) + O(n_2) + ... + O(n_r) = O(n_1 + n_2 + ... + n_r) = O(m + r) = O(m),

since r is not more than m. All that remains is to show that the number of calls to add(i, x) or remove(i) between the (i-1)-th and the i-th call to resize() is at least n_i/2 - 1.
There are two cases to consider.
- Case 1: In the first case, resize() is being called by add(i, x) because the backing array a is full, i.e., a.length = n. Consider the previous call to resize(): after that previous call, the size of a was a.length, but the number of elements stored in a was at most a.length/2 = n/2. But now the number of elements stored in a is n = a.length, so there must have been at least n/2 calls to add(i, x) since the previous call to resize().
- Case 2: The second case occurs when resize() is being called by remove(i) because a.length >= 3n. Again, after the previous call to resize() the number of elements stored in a was at least a.length/2 - 1.
Note: The -1 in the above formula accounts for the special case that occurs when n = 0 and a.length = 1.
Now there are n <= a.length/3 elements stored in a. Therefore, the number of remove(i) operations since the last call to resize() is at least

a.length/2 - 1 - a.length/3 = a.length/6 - 1 >= n/2 - 1,

since a.length/6 = (a.length/3)/2 >= n/2.
In either case, the number of calls to add(i, x) or remove(i) that occur between the (i-1)-th call to resize() and the i-th call to resize() is at least n_i/2 - 1, as required to complete the proof.
Analysis of ArrayStack
The following theorem summarizes the performance of an ArrayStack:
Theorem: An ArrayStack implements the List interface. Ignoring the cost of calls to resize(), an ArrayStack supports the operations

- get(i) and set(i, x) in O(1) time per operation; and
- add(i, x) and remove(i) in O(1 + n - i) time per operation.

Furthermore, beginning with an empty ArrayStack and performing any sequence of m add(i, x) and remove(i) operations results in a total of O(m) time spent during all calls to resize().
The ArrayStack is an efficient way to implement a Stack. In particular, we can implement push(x) as add(n, x) and pop() as remove(n - 1), in which case these operations will run in O(1) amortized time.
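The stack interface can be layered on the list operations exactly as described; this sketch assumes an ArrayStack with the add(i, x) and remove(i) methods and the field n from the preceding sections:

```python
# push(x) and pop() as thin wrappers over the list operations, as the text
# describes.  Assumes an ArrayStack with add(i, x), remove(i), and field n.
def push(self, x):
    self.add(self.n, x)              # insert at the end: no elements shift

def pop(self):
    return self.remove(self.n - 1)   # remove the last element: no shifting
```

Because both operations work at index n (or n - 1), the O(1 + n - i) shifting cost vanishes, leaving only the O(1) amortized resizing cost.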
Explanation
Let’s start our explanation from the beginning of main() in the main.py file.
- Line 45: It creates a new ArrayStack of integers, stack, which will be used to store elements.
- Lines 47–49: It adds three elements to stack. The first argument to the add() method is the index at which to add the element, and the second argument is the value of the element.
- Line 50: It prints the current size of the stack using the size() method.
- Lines 52–53: It removes an element from the stack at index 1 (which is the second element) and prints the value of the removed element.
- Lines 55–57: It prints the remaining elements in the stack using a for loop. The get() method is used to access the elements in the stack by their index, and they are printed using the print() function.
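Since main.py itself is not reproduced in this lesson, here is a hypothetical reconstruction of the program the walkthrough describes; the element values are assumptions and the line numbers will not match:

```python
# Hypothetical reconstruction of the main.py walkthrough above; the values
# and the ArrayStack details are assumptions based on this lesson's text.
class ArrayStack:
    def __init__(self):
        self.a = [None]   # backing array
        self.n = 0        # number of stored elements

    def size(self):
        return self.n

    def get(self, i):
        return self.a[i]

    def resize(self):
        b = [None] * max(1, 2 * self.n)
        b[:self.n] = self.a[:self.n]
        self.a = b

    def add(self, i, x):
        if self.n == len(self.a):
            self.resize()
        for j in range(self.n, i, -1):
            self.a[j] = self.a[j - 1]
        self.a[i] = x
        self.n += 1

    def remove(self, i):
        x = self.a[i]
        for j in range(i, self.n - 1):
            self.a[j] = self.a[j + 1]
        self.n -= 1
        if len(self.a) >= 3 * self.n:
            self.resize()
        return x

def main():
    stack = ArrayStack()              # create a new ArrayStack
    for i, v in enumerate((10, 20, 30)):
        stack.add(i, v)               # add three elements by index
    print("size:", stack.size())      # current number of elements
    removed = stack.remove(1)         # remove the second element
    print("removed:", removed)
    for i in range(stack.size()):     # print the remaining elements
        print(stack.get(i))

if __name__ == "__main__":
    main()
```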
FastArrayStack: An optimized ArrayStack
Much of the work done by an ArrayStack involves shifting (by add(i, x) and remove(i)) and copying (by resize()) of data. In the implementations shown above, this was done using for loops. It turns out that many programming environments have specific functions that are very efficient at copying and moving blocks of data. In the C programming language, there are the memcpy(d, s, n) and memmove(d, s, n) functions. In the C++ language there is the std::copy(a0, a1, b) algorithm. In Java there is the System.arraycopy(s, i, d, j, n) method.
These functions are usually highly optimized and may even use special machine instructions that can do this copying much faster than we could by using a for loop. Although using these functions does not asymptotically decrease the running times, it can still be a worthwhile optimization.
In our C++ and Java implementations, the use of fast array copying functions resulted in speedups of a factor between 2 and 3, depending on the types of operations performed. Your mileage may vary.
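In Python, the closest analogue of these block-copy functions is slice assignment, which moves a run of elements inside the interpreter's C code rather than in a Python-level for loop. A sketch of the shifting steps rewritten this way, assuming the same a and n fields and resize() method as before:

```python
# Block moves via slice assignment instead of element-by-element for loops.
# Assumes the same fields a (backing array) and n (element count) and the
# resize() method described earlier.  Both slices have equal length, so the
# backing array's length is unchanged by the assignment.
def add(self, i, x):
    if self.n == len(self.a):
        self.resize()                                  # ensures n + 1 slots
    self.a[i + 1:self.n + 1] = self.a[i:self.n]        # shift right in one step
    self.a[i] = x
    self.n += 1

def remove(self, i):
    x = self.a[i]
    self.a[i:self.n - 1] = self.a[i + 1:self.n]        # shift left in one step
    self.n -= 1
    if len(self.a) >= 3 * self.n:
        self.resize()
    return x
```

As the text notes, this does not change the asymptotic running times, but moving the loop out of interpreted code is typically a worthwhile constant-factor optimization.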