Array Implementation of Stack

Explore the implementation of stacks using arrays in Python, focusing on the ArrayStack data structure. Understand core operations such as get, set, add, and remove, along with the amortized resizing technique that ensures efficient performance over multiple operations.

ArrayStack overview

The List and Queue interfaces can be implemented so that the underlying data is stored in an array, called the backing array. The following table summarizes the running times of operations for these data structures:

Running times of operations for an array-based stack:

Operation               Running time
get(i), set(i, x)       O(1)
add(i, x), remove(i)    O(1 + n − i)

Advantages and limitations

Data structures that work by storing data in a single array have many advantages and limitations in common:

  • Arrays offer constant time access to any value in the array. This is what allows get(i) and set(i, x) to run in constant time.

  • Arrays are not very dynamic. Adding or removing an element near the middle of a list means that a large number of elements in the array need to be shifted to make room for the newly added element or to fill in the gap created by the deleted element. This is why the operations add(i, x) and remove(i) have running times that depend on n and i.

  • Arrays cannot expand or shrink. When the number of elements in the data structure exceeds the size of the backing array, a new array needs to be allocated and the data from the old array needs to be copied into the new array. This is an expensive operation.

The third point is important. The running times cited in the table above do not include the cost associated with growing and shrinking the backing array. We will see that, if carefully managed, the cost of growing and shrinking the backing array does not add much to the cost of an average operation. More precisely, if we start with an empty data structure and perform any sequence of m add(i, x) or remove(i) operations, then the total cost of growing and shrinking the backing array over the entire sequence of m operations is O(m). Although some individual operations are more expensive, the amortized cost, when amortized over all m operations, is only O(1) per operation.
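The doubling argument behind this claim can be checked empirically. The sketch below is a stand-alone simulation (not the ArrayStack code from this lesson): it counts how many elements are copied while growing a backing array by doubling over m appends, and the total stays below 2m.

```python
def count_grow_copies(m):
    """Simulate m appends into a doubling backing array,
    counting the elements copied during each grow."""
    capacity, n, copies = 1, 0, 0
    for _ in range(m):
        if n == capacity:      # backing array is full: grow
            copies += n        # copying n elements costs n
            capacity = 2 * n
        n += 1
    return copies

# Over 1000 appends, total copy cost stays below 2 * 1000.
print(count_grow_copies(1000))
```

Each grow copies n elements but is preceded by n/2 appends since the previous grow, which is exactly the amortization the analysis below makes precise.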

ArrayStack: fast stack operations using an array

An ArrayStack implements the list interface using an array a, called the backing array. The list element with index i is stored in a[i]. At most times, a is larger than strictly necessary, so an integer n is used to keep track of the number of elements actually stored in a. In this way, the list elements are stored in a[0], …, a[n − 1] and, at all times, a.length ≥ n.

Python 3.10.4
class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def _initialize(self):
        self.a = new_array(1)
        self.n = 0

The basics

Accessing and modifying the elements of an ArrayStack using get(i) and set(i, x) is trivial. After performing any necessary bounds-checking, we simply return or set, respectively, a[i].

Python 3.10.4
class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def _initialize(self):
        self.a = new_array(1)
        self.n = 0

    def get(self, i):
        if i < 0 or i >= self.n: raise IndexError()
        return self.a[i]

    def set(self, i, x):
        if i < 0 or i >= self.n: raise IndexError()
        y = self.a[i]
        self.a[i] = x
        return y

The operations of adding and removing elements from an ArrayStack are illustrated in the figure below, which shows a sequence of add(i, x) and remove(i) operations. Arrows denote elements being copied; operations that result in a call to resize() are marked with an asterisk.

To implement the add(i, x) operation, we first check if a is already full. If so, we call the method resize() to increase the size of a. How resize() is implemented will be discussed later. For now, it is sufficient to know that, after a call to resize(), we can be sure that a.length > n. With this out of the way, we now shift the elements a[i], …, a[n − 1] right by one position to make room for x, set a[i] equal to x, and increment n.

Python 3.10.4
class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def add(self, i, x):
        if i < 0 or i > self.n: raise IndexError()
        if self.n == len(self.a): self._resize()
        self.a[i+1:self.n+1] = self.a[i:self.n]
        self.a[i] = x
        self.n += 1

If we ignore the cost of the potential call to resize(), then the cost of the add(i, x) operation is proportional to the number of elements we have to shift to make room for x. Therefore the cost of this operation (ignoring the cost of resizing a) is O(n − i).
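The slice assignment in add performs this right shift in one step. A minimal sketch on a bare Python list standing in for the backing array:

```python
# Backing array with capacity 4 holding n = 3 elements.
a, n = [1, 2, 3, None], 3

# add(i=1, x=99): shift a[1:3] right by one, then write x at index 1.
i, x = 1, 99
a[i + 1:n + 1] = a[i:n]   # a becomes [1, 2, 2, 3]
a[i] = x                  # a becomes [1, 99, 2, 3]
n += 1

print(a)  # [1, 99, 2, 3]
```

Note that the two slices have equal length, so the list's length is unchanged; only the contents move.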

Implementing the remove(i) operation is similar. We shift the elements a[i + 1], …, a[n − 1] left by one position (overwriting a[i]) and decrease the value of n. After doing this, we check whether n is getting much smaller than a.length by checking if a.length ≥ 3n. If so, then we call resize() to reduce the size of a.

Python 3.10.4
class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def remove(self, i):
        if i < 0 or i >= self.n: raise IndexError()
        x = self.a[i]
        self.a[i:self.n-1] = self.a[i+1:self.n]
        self.n -= 1
        if len(self.a) >= 3*self.n: self._resize()
        return x

If we ignore the cost of the resize() method, the cost of a remove(i) operation is proportional to the number of elements we shift, which is O(n − i).
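The same slicing trick performs the left shift in remove. A minimal sketch on a bare list:

```python
# Backing array holding n = 4 elements.
a, n = [1, 99, 2, 3], 4

# remove(i=1): save a[1], shift a[2:4] left by one, decrement n.
i = 1
x = a[i]
a[i:n - 1] = a[i + 1:n]   # a becomes [1, 2, 3, 3]
n -= 1                    # only a[0:n] is in use now

print(x, a[:n])  # 99 [1, 2, 3]
```

The stale copy left at a[n] is harmless: it lies outside the region a[0:n] that the list considers occupied.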

Growing and shrinking

The resize() method is fairly straightforward: it allocates a new array b whose size is 2n, copies the n elements of a into the first n positions of b, and then sets a to b. Thus, after a call to resize(), a.length = 2n.

Python 3.10.4
class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def _resize(self):
        b = new_array(max(1, 2*self.n))
        b[0:self.n] = self.a[0:self.n]
        self.a = b

Analyzing the actual cost of the resize() operation is easy: it allocates an array b of size 2n and copies the n elements of a into b. This takes O(n) time.

The running time analysis from the previous section ignored the cost of calls to resize(). In this section we analyze this cost using a technique known as amortized analysis. This technique does not try to determine the cost of resizing during each individual add(i, x) and remove(i) operation. Instead, it considers the cost of all calls to resize() during a sequence of m calls to add(i, x) or remove(i). In particular, we will show:

Lemma: If an empty ArrayStack is created and any sequence of m ≥ 1 calls to add(i, x) and remove(i) is performed, then the total time spent during all calls to resize() is O(m).

Proof: We will show that any time resize() is called, the number of calls to add or remove since the last call to resize() is at least n/2 − 1. Therefore, if n_i denotes the value of n during the i-th call to resize() and r denotes the number of calls to resize(), then the total number of calls to add(i, x) or remove(i) is at least

\sum_{i=1}^{r} (n_i/2 - 1) \le m

which is equivalent to

\sum_{i=1}^{r} n_i \le 2m + 2r

On the other hand, the total time spent during all calls to resize() is

\sum_{i=1}^{r} O(n_i) \le O(m + r) = O(m)

since r is not more than m. All that remains is to show that the number of calls to add(i, x) or remove(i) between the (i − 1)-th and the i-th call to resize() is at least n_i/2 − 1.

There are two cases to consider.

  • Case 1: In the first case, resize() is being called by add(i, x) because the backing array a is full, i.e., a.length = n = n_i. Consider the previous call to resize(): after that call, the size of a was a.length, but the number of elements stored in a was at most a.length/2 = n_i/2. But now the number of elements stored in a is n_i = a.length, so there must have been at least n_i/2 calls to add(i, x) since the previous call to resize().

  • Case 2: The second case occurs when resize() is being called by remove(i) because a.length ≥ 3n = 3n_i. Again, after the previous call to resize() the number of elements stored in a was at least a.length/2 − 1.

Note: The -1 in the above formula accounts for the special case that occurs when n = 0 and a.length = 1.

Now there are n_i ≤ a.length/3 elements stored in a. Therefore, the number of remove(i) operations since the last call to resize() is at least

\begin{split} R & \ge \text{a.length}/2 - 1 - \text{a.length}/3 \\ & = \text{a.length}/6 - 1 \\ & = (\text{a.length}/3)/2 - 1 \\ & \ge n_i/2 - 1 \end{split}

where the last step uses n_i ≤ a.length/3.

In either case, the number of calls to add(i, x) or remove(i) that occur between the (i − 1)-th call to resize() and the i-th call to resize() is at least n_i/2 − 1, as required to complete the proof.
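The lemma can also be sanity-checked empirically. The sketch below is a self-contained, hypothetical re-implementation of the resizing policy (not the lesson's ArrayStack class); it counts the elements copied by resizes over a random sequence of m operations, and by the bound ∑ n_i ≤ 2m + 2r ≤ 4m the total never exceeds 4m.

```python
import random

class CountingStack:
    """Minimal add/remove with the same resizing policy,
    counting elements copied during each resize."""
    def __init__(self):
        self.a, self.n, self.copied = [None], 0, 0

    def _resize(self):
        b = [None] * max(1, 2 * self.n)
        b[0:self.n] = self.a[0:self.n]
        self.copied += self.n          # a resize copies n elements
        self.a = b

    def add(self, i, x):
        if self.n == len(self.a): self._resize()
        self.a[i + 1:self.n + 1] = self.a[i:self.n]
        self.a[i] = x
        self.n += 1

    def remove(self, i):
        x = self.a[i]
        self.a[i:self.n - 1] = self.a[i + 1:self.n]
        self.n -= 1
        if len(self.a) >= 3 * self.n: self._resize()
        return x

random.seed(0)
s, m = CountingStack(), 10000
for _ in range(m):
    if s.n > 0 and random.random() < 0.4:
        s.remove(random.randrange(s.n))
    else:
        s.add(random.randrange(s.n + 1), 0)
print(s.copied, "elements copied over", m, "operations")
```

Varying the seed or the add/remove mix changes the exact count, but the ratio copied/m stays bounded by a small constant, as the lemma promises.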

Analysis of ArrayStack

The following theorem summarizes the performance of an ArrayStack:

Theorem: An ArrayStack implements the List interface. Ignoring the cost of calls to resize(), an ArrayStack supports the operations

  • get(i) and set(i, x) in O(1) time per operation; and
  • add(i, x) and remove(i) in O(1 + n − i) time per operation.

Furthermore, beginning with an empty ArrayStack and performing any sequence of m add(i, x) and remove(i) operations results in a total of O(m) time spent during all calls to resize().

The ArrayStack is an efficient way to implement a Stack. In particular, we can implement push(x) as add(n, x) and pop() as remove(n − 1), in which case these operations will run in O(1) amortized time.
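Under this convention, push and pop are thin wrappers over add and remove at the end of the list, where no shifting is needed. A minimal, self-contained sketch (a hypothetical SimpleStack using a plain Python list as the backing array, not the lesson's class):

```python
class SimpleStack:
    """Hypothetical minimal stack: push(x) = add(n, x), pop() = remove(n - 1)."""
    def __init__(self):
        self.a, self.n = [None], 0

    def add(self, i, x):
        if self.n == len(self.a):            # backing array full: grow by doubling
            b = [None] * (2 * self.n)
            b[0:self.n] = self.a
            self.a = b
        self.a[i + 1:self.n + 1] = self.a[i:self.n]
        self.a[i] = x
        self.n += 1

    def remove(self, i):
        x = self.a[i]
        self.a[i:self.n - 1] = self.a[i + 1:self.n]
        self.n -= 1
        return x

    def push(self, x):                       # O(1) amortized: i = n, nothing shifts
        self.add(self.n, x)

    def pop(self):                           # O(1) amortized: i = n - 1, nothing shifts
        return self.remove(self.n - 1)

s = SimpleStack()
s.push(1); s.push(2); s.push(3)
print(s.pop(), s.pop())  # 3 2
```

With i = n (or i = n − 1), the slices being shifted are empty, so the O(1 + n − i) bound collapses to O(1) plus the amortized resize cost.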

Python 3.10.4
from utils import new_array
from base import BaseList

class ArrayStack(BaseList):
    def __init__(self, iterable=[]):
        self._initialize()
        self.add_all(iterable)

    def _initialize(self):
        self.a = new_array(1)
        self.n = 0

    # get the value at index i
    def get(self, i):
        if i < 0 or i >= self.n: raise IndexError()
        return self.a[i]

    # set the value at index i, returning the old value
    def set(self, i, x):
        if i < 0 or i >= self.n: raise IndexError()
        y = self.a[i]
        self.a[i] = x
        return y

    # add a value at index i
    def add(self, i, x):
        if i < 0 or i > self.n: raise IndexError()
        if self.n == len(self.a): self._resize()
        self.a[i+1:self.n+1] = self.a[i:self.n]
        self.a[i] = x
        self.n += 1

    # remove and return the value at index i
    def remove(self, i):
        if i < 0 or i >= self.n: raise IndexError()
        x = self.a[i]
        self.a[i:self.n-1] = self.a[i+1:self.n]
        self.n -= 1
        if len(self.a) >= 3*self.n: self._resize()
        return x

    # resize the backing array
    def _resize(self):
        b = new_array(max(1, 2*self.n))
        b[0:self.n] = self.a[0:self.n]
        self.a = b

if __name__ == "__main__":
    # Object creation
    stack = ArrayStack()
    # Adding values
    stack.add(0, 1)
    stack.add(1, 2)
    stack.add(2, 3)
    print("Stack size: ", stack.size())
    # Removing a value
    removed_element = stack.remove(1)
    print("Removed element: ", removed_element)
    # Updated stack
    print("Stack elements:")
    for i in range(stack.size()):
        print(stack.get(i))

Explanation

Let’s walk through the driver code at the bottom of the listing, inside the `if __name__ == "__main__":` block.

  • stack = ArrayStack() creates a new ArrayStack, which will be used to store elements.
  • The three add() calls add elements to stack. The first argument to the add() method is the index at which to add the element, and the second argument is the value of the element.
  • stack.size() returns the current size of the stack, which is then printed.
  • stack.remove(1) removes the element at index 1 (the second element) and returns it; the removed value is printed.
  • Finally, a for loop prints the remaining elements. The get() method is used to access the elements in the stack by their index, and each is printed with the print() function.

FastArrayStack: An optimized ArrayStack

Much of the work done by an ArrayStack involves shifting (by add(i, x) and remove(i)) and copying (by resize()) of data. In the implementations shown above, this was done using for loops. It turns out that many programming environments have specific functions that are very efficient at copying and moving blocks of data. In the C programming language, there are the memcpy(d, s, n) and memmove(d, s, n) functions. In the C++ language there is the std::copy(a0, a1, b) algorithm. In Java there is the System.arraycopy(s, i, d, j, n) method.

These functions are usually highly optimized and may even use special machine instructions that can do this copying much faster than we could by using a for loop. Although using these functions does not asymptotically decrease the running times, it can still be a worthwhile optimization.
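In Python, slice assignment plays the role of memmove: it moves a whole block in one call instead of an element-by-element loop. The sketch below (hypothetical helper names) shows that the two approaches produce the same result, with the loop running back to front so overlapping positions are not overwritten too early:

```python
def shift_right_loop(a, i, n):
    """Shift a[i:n] right by one using an explicit loop.
    Iterates back to front to handle the overlapping ranges safely."""
    for j in range(n, i, -1):
        a[j] = a[j - 1]

def shift_right_slice(a, i, n):
    """Same shift via a single slice assignment (memmove-style)."""
    a[i + 1:n + 1] = a[i:n]

x = [1, 2, 3, 4, None]
y = [1, 2, 3, 4, None]
shift_right_loop(x, 1, 4)
shift_right_slice(y, 1, 4)
print(x == y)  # both produce [1, 2, 2, 3, 4]
```

This is why the listings above use slice assignment rather than explicit loops: CPython performs the block move in C, which is the same kind of optimization memmove provides.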

In our C++ and Java implementations, the use of fast array copying functions resulted in speedups of a factor between 2 and 3, depending on the types of operations performed. Your mileage may vary.