Why Vector (In C++) And Arraylist (In Java) Are So Fast

Oct 27, 2025
Content
The need for growable arrays
1. Implementing a generalized growable array
Explanation
Time complexity analysis
Amortized (average) cost
Why is the insertion cost of O(N) too slow and impractical?
2. Vector implementation of a growable array
The doubling technique
How good is vector insertion (time complexity analysis)?
Cost of N insertions
Amortized (average) cost of vector insertions
How fast is a vector as compared to a growable array?
Why vector in C++ is so fast on modern CPUs
Practical tips to keep vector fast in your code
Limitations of vectors
When a different container beats vector
How ArrayList in Java is different from C++ std::vector
Applications of vectors

In programming, the need for dynamically resizable arrays is ubiquitous. Growable arrays have long been used to address this requirement, but with the advent of vector, a powerful container class provided by the Standard Template Library (STL), and ArrayList (in the Java Collections Framework), developers have gained access to a highly optimized and efficient solution.

In this blog, we will explore the motivation behind using growable arrays named the GrowAbleArray class (for this blog only), delve into its implementation (specifically the insertion at the end part), and analyze its time complexity. Then, our focus will shift to our own implementation of a Vector class, similar to vector; we'll discuss its insertion at the end part, and analyze its time complexity. We will also highlight the distinguishing factors in ArrayList. Finally, we will compare GrowAbleArray and Vector to understand why vectors are preferred over traditional growable arrays, considering their strengths, weaknesses, performance, and memory management strategies in modern programming scenarios.

The need for growable arrays#

Static arrays, while efficient and powerful, come with a limitation—they can't be resized. This constraint arises from the need to define variables and arrays at compile time, where their sizes are predetermined on the stack. Although this approach enables precise memory allocation and efficient runtime execution, it poses challenges when dealing with data that dynamically expands.

Enter dynamic arrays—a solution that empowers us to overcome the limitations of static arrays. By allowing us to allocate memory at runtime (on heap), dynamic arrays provide the flexibility to handle growing data seamlessly. With this powerful concept, we can tackle programming tasks involving massive data manipulations where the size of the data is dynamic.

[Animation: expanding a dynamic array from size three to size four]

In the above animation, you can see how we first created a dynamic array of size three and then expanded it to size four by inserting a new value at the end. The expansion involves allocating new memory of size+1, copying the previous data into it, writing the new value at the end, and then pointing the old pointer at the base of the new memory. Together, these steps give the illusion that a new value was simply appended to the original data.

Let's make a customized generic template class by extending this idea.

1. Implementing a generalized growable array#

Now, let's proceed with the implementation of a growable array. In this example, we are creating a user-defined type of a growable array named GrowableArray. We are also overloading the subscript [] operator to provide similar behavior to a primitive array. Additionally, we will implement the operator<< to facilitate the ease of printing our GrowableArray instances.

C++ implementation
#include <iostream>
using namespace std;
template <typename T>
class GrowableArray
{
    T * A;
    int size;
public:
    GrowableArray()
    {
        size = 0;
        A = nullptr;
    }
    void push_back(T v)
    {
        T * HA = new T [size + 1];
        for(int i = 0; i < size; i++)
        {
            HA[i] = A[i];
        }
        HA[size] = v;
        if(size != 0)
            delete [] A; // deleting a null pointer is actually safe in standard C++; this check is merely defensive
        A = HA;
        HA = nullptr;
        size++;
    }
    T & operator [](int i)
    {
        if(i >= size) throw "Out of Boundary access";
        return A[i]; // safe now: out-of-bound access throws above
    }
    friend ostream & operator <<(ostream & out, const GrowableArray<T> & V)
    { // This requires that type T has operator<< overloaded
        out << "{ ";
        for(int i = 0; i < V.size; i++)
        {
            out << V.A[i] << " ";
        }
        out << "}";
        return out;
    }
    ~GrowableArray()
    {
        delete [] A;
    }
};
int main()
{
    GrowableArray<int> G;
    G.push_back(1);
    G.push_back(2);
    G.push_back(3);
    cout << "G: " << G << endl;
    return 0;
}

Let's discuss what we have done in the above code.

Explanation#

C++ implementation

We have created a user-defined type GrowableArray by using the properties of a dynamic array. In the GrowableArray class, we define two data members:

  • Line 6: A pointer A of template type T to enable the user to make a growable array of any type they like.

  • Line 7: A variable size of type int to hold the track of how much of a memory is allocated on heap.

  • Lines 9–13: We define a constructor GrowableArray() to ensure that all data members start in a valid state.

  • Lines 14–27: We define the method push_back(T v) to handle insertion of elements at the end of the array. This method grows the array by one: it creates a new heap array of size size+1, copies all elements from the previous memory into the new array, and writes the new value into the one extra slot. It then deletes the previously allocated memory and relocates the pointer to the new memory, exactly as demonstrated in the example of expanding a growable array from size three to size four.

Java implementation

Run the code of the GrowableArray.java file.

  • It has a similar implementation: the push_back() function grows the array by one in the same way. In place of operator<<, the Java implementation overrides the toString() method, which is used to print the Java GrowableArray.

Let’s have a look at the time complexity of these growable arrays (implementations of both C++ and Java).

Time complexity analysis#

Let’s assume we have to load $N$ records into the growable array. How much time will it take?

  • If we insert the 1st record, it will take 1 copy operation.

  • If we insert the 2nd record, it will take 2 copy operations (1 for relocating the previous record and 1 for writing the new value).

  • Inserting the 3rd record will take 3 copy operations (2 for relocating the previous records and 1 for the new value).

  • If we insert the $N^{th}$ record, it will take $N$ copy operations ($N-1$ for relocating the previous records and 1 for the new value).

So, that means the cost of inserting $N$ records is:

$$1+2+3+\dots+N = \frac{N(N+1)}{2}$$

Let's understand the solution of this arithmetic series using the given illustration:

[Illustration: the arithmetic series 1 to 100]

Therefore, the total cost, as illustrated above, is approximately $\frac{N^2}{2}+\frac{N}{2}\approx N^2$. In computer science, we denote this as $O(N^2)$, pronounced as “Big O of N squared.” In Big-O notation, we consider only the highest degree term of the polynomial, disregarding the constants associated with it and all smaller degree terms.

Amortized (average) cost#

On average, if we have inserted $N$ values into a growable array, then the average cost per insertion is approximately $O(N^2/N) = O(N)$, meaning on almost every new entry we spend roughly $O(N)$ copy operations. That is too slow.

Why is the insertion cost of O(N) too slow and impractical?#

Let’s imagine we're loading 10 million integers from a binary file containing 4 bytes per integer, i.e., a file size of $10\times 10^{6} \times 4 \approx 40\,MB$. Say we run the above GrowableArray implementation on a machine with a $10\,GHz$ processor. One iteration of the copying loop on lines 17–20 (in GA.h) will take several clock ticks, but let’s assume for the sake of the example that one iteration executes in a single clock tick. Then loading the $40\,MB$ file costs around $(10\times 10^6)^2$ copy operations, so it will take around $\frac{10^{14}}{10^{10}}$ seconds, that is, around $10^4$ seconds (roughly 2.77 hours). Similarly, if the data is around $1\,GB$, then the time will be around $\frac{10^{18}}{10^{10}}$ seconds, which is around 27,777.7 hours, equal to roughly 3.17 years.

Take a pause and read this: “Only $1\,GB$ of data, and the total loading time is 3.17 years.”

This is an unbearable wait. Now imagine if you are working on Facebook, Amazon, or Google records, where you have to work on petabytes of data that will be impossible to manage if just loading (not processing) takes such a huge amount of time.

2. Vector implementation of a growable array#

The implementation of vectors closely resembles that of a growable array, but with a slight difference. The main concern arises during insertion in the growable array, where the need to relocate previously inserted records leads to recopying. This prompts us to question how we can minimize this recopying process.

What if, instead of growing by one each time and recopying everything previously inserted into new space, we increase the size by a larger number, say $T$? Then, for the next $T-1$ insertions, the cost of insertion will be a constant $O(1)$, until the memory fills up completely. This is the main idea behind a vector in C++. Let's look into the details of the vector implementation, how the expansion (regrowing) happens, and what advantage it gives.

The doubling technique#

The concept behind this implementation involves two crucial variables: capacity and size. The capacity represents the actual memory space occupied on the heap, while the size indicates the number of records inserted by the user in the array. The initial capacity can be set to either 1 or any positive size specified by the user.

During insertion, if there is remaining capacity (i.e., size < capacity), the new value can simply be written at index size (and size gets incremented accordingly). However, when size equals capacity, the vector expands not by one but to twice the previous capacity. Copying still occurs during this expansion, but the subsequent (approximately) half-capacity insertions each have a constant cost of O(1).

Before delving into the efficiency analysis of this approach, let’s proceed with its implementation. The only change required is in lines 18–27.

C++
#include <iostream>
using namespace std;
template <typename T>
class Vector
{
    T * A;
    int size;
    int capacity;
public:
    Vector(int _size = 1)
    {
        size = 0;
        capacity = _size;
        A = new T[_size];
    }
    void push_back(T v)
    {
        if(capacity == size)
        {
            capacity *= 2;
            T * HA = new T [capacity];
            for(int i = 0; i < size; i++)
            {
                HA[i] = A[i];
            }
            delete [] A; A = HA; HA = nullptr;
        }
        A[size] = v;
        size++;
    }
    T & operator [](int i)
    {
        return A[i];
    }
    friend ostream & operator <<(ostream & out, const Vector<T> & V)
    {
        out << "{";
        for(int i = 0; i < V.size; i++)
        {
            out << V.A[i] << " ";
        }
        out << "}";
        return out;
    }
    ~Vector() // release the heap buffer to avoid a memory leak
    {
        delete [] A;
    }
};

Instruction: Run the above code in the main.cpp (you may change the values of the loop to test several different values). In our sample test run, it loads 40 million integers in only 6.38 seconds.

Here in the animation, you can see how the insertion happens and how capacity and size manage the insertion operation. The actual cost is associated with the copying on lines 24 and 28 (in Vector.h).

[Animation: inserting values as capacity doubles]

How good is vector insertion (time complexity analysis)?#

As we see in the above slides, due to the exponential increase in capacity (doubling), most of the time the insertion cost shrinks to a copying cost of $O(1)$. There are only rare events where the resizing (regrowing) of the vector happens.

Think for a moment about what those special positions are.

The total cost of $N$ operations looks like this seemingly bizarre pattern:

$$1+2+3+1+5+1+1+1+9+1+\dots$$

Cost of N insertions#

We can use a small trick: every single insertion costs at least 1. By separating that cost of 1 from each operation, the new summation becomes:

    $$\implies N + (1-1)+(2-1)+(3-1)+(1-1)+(5-1)+(1-1)+(1-1)+(1-1)+(9-1)+(1-1)+\dots+(N-1)$$

    $$\implies N + 0+1+2+0+4+0+0+0+8+0+\dots+(N-1)$$

    $$\implies N + 1+2+4+8+\dots+(N-1)$$

The final expression contains a geometric series ($1+2+4+8+\dots+(N-1)$). That summation is bounded by $2N$; therefore, the total cost will be approximately $N+2N = 3N$.

Why is $1+2+4+8+\dots+N \leq 2N$? The proof we are giving is considered a “proof without words.” In mathematics, a proof without words is an illustration of an identity or mathematical statement that can be demonstrated as self-evident by a diagram without any accompanying explanatory text. Such proofs can be considered more elegant than formal or mathematically rigorous proofs due to their self-evident nature.

As the summation is commutative, we can rewrite the summation as:

$$N+\frac{N}{2}+\frac{N}{4}+\frac{N}{8}+\dots+8+4+2+1$$

We can take $N$ common: $N(1+\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{16}+\dots)$

As shown in the diagram below: $1+\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{16}+\dots \leq 2$.

[Diagram: the sum of an infinite geometric series starting from 1 with common ratio 1/2]


Therefore, the total complexity is just $3N$, which is $O(N)$.

Amortized (average) cost of vector insertions#

On average, if we have inserted $N$ values into a vector, then the average cost per insertion is approximately $O(3N/N) = O(1)$, meaning on almost every new entry we spend a constant $O(1)$ amount of copying. That is an amazing improvement.

How fast is a vector as compared to a growable array?#

Now, let’s imagine that we would like to load $1\,GB$ of data, assuming that one clock tick of a $10\,GHz$ processor executes one copy operation (the same assumption we made in the growable array scenario). The total number of copies will just be about $10^9 \times 3$, which is less than the processor’s $10^{10}$ ticks per second; therefore, the loading process will take approximately $\frac{10^9\times 3}{10^{10}}$ seconds, a fraction of a second. This is a really big improvement compared to the previous 3.17 years of loading time.

This is the clear difference between an algorithm taking $O(N)$ time and one taking $O(N^2)$; the difference becomes extremely important when handling large data.

Why vector in C++ is so fast on modern CPUs#

If you’ve ever benchmarked real code and wondered why vector in C++ is so fast compared to other containers, the answer goes beyond amortized O(1) append. Vector’s design lines up perfectly with how modern CPUs fetch and process memory:

  • Contiguous storage = cache friendliness. Elements live in one contiguous buffer. Linear scans and index-based access become streaming reads with predictable strides. Hardware prefetchers detect these strides and pull the next cache lines early. Fewer cache and TLB misses often matter more than Big-O on today’s hardware, which is why vector iterations routinely beat linked structures even when theoretical complexity is similar.

  • Fewer allocations and less indirection. A vector object holds just a few pointers (begin/end/capacity). The data lives in one heap allocation, not one allocation per node. That means far fewer calls into the allocator, fewer page faults, and fewer pointer dereferences. In contrast, node-based containers (like a list) pay an allocation and a pointer chase per element.

  • Move semantics and emplace. Since C++11, push_back and emplace_back can construct elements in place or move them during reallocation instead of copying. For types with cheap moves, growth becomes much faster. For trivially copyable types, vector can use bulk memcpy/memmove, which many standard libraries route to highly optimized routines (often leveraging SIMD).

  • Simple indexing & branch prediction. Random access is just pointer arithmetic: *(begin + i). There’s no bounds check in operator[] (unless you choose at, which checks). Hot loops over vectors are easy for compilers to auto-vectorize and for branch predictors to “understand.” The end result is tight, predictable machine code.

  • Growth policy fits allocators well. Doubling (or ~1.5x) growth means allocator calls are rare. When they do happen, the allocator gets large requests that are efficient to satisfy, and then vector sticks with that memory for a long time.

Practical tips to keep vector fast in your code#

Here are concrete habits that preserve vector’s speed in production:

  • Use reserve when you can estimate the size. Pre-grow capacity to avoid intermediate reallocations. If you’ll push ~100k items, reserve(100000) removes the growth overhead entirely.

  • Prefer emplace_back over push_back(T{…}). Construct in place to skip temporary objects. If you already have an object, push_back(std::move(obj)) lets growth use moves.

  • Erase efficiently with the erase-remove idiom. For bulk removal based on a predicate, v.erase(std::remove_if(v.begin(), v.end(), pred), v.end()); is cache-friendly and linear.

  • Avoid frequent front/middle insertions. Inserts in the middle still shift elements. If you need frequent front pushes, consider deque. If you need stable references through inserts/erases, consider a list or a deque, depending on access patterns.

  • Be cautious with shrink_to_fit. It’s non-binding and can cause a reallocation; only call it when you truly want to give memory back and don’t plan to grow again soon.

  • Don’t be misled by micro-benchmarks. Many micro-benchmarks “prove” lists are faster because they manufacture worst-case vector behavior. In typical data-parallel loops or scans, vector wins due to locality.

  • Watch iterator invalidation. Reallocation invalidates all iterators/pointers/references; insertion and erasure invalidate everything after the affected point. Structure code so critical handles aren’t kept across operations that might grow or shift.

Limitations of vectors#

Vectors are a popular data structure and have many advantages, but they also come with some limitations:

  1. Contiguous memory: Vectors store elements in contiguous memory locations. While this provides faster access to elements, it also means that inserting or deleting elements from the middle of the vector requires shifting all subsequent elements, which can be inefficient for large vectors.

  2. Capacity overhead: Although vectors are dynamic, they reserve a capacity that grows as elements are added. When the capacity exceeds the required size, the surplus is wasted memory.

  3. Reallocation overhead: As a vector grows and exceeds its capacity, it must reallocate memory. Although this happens rarely, the specific insertion that triggers it can be expensive.

  4. Limited performance for front insertions: Adding elements to the front of a vector requires shifting all existing elements, resulting in a time complexity of $O(N)$, where $N$ is the number of elements in the vector.

  5. Cost of complex objects: For objects with complex copy constructors or large sizes, the resizing and copying operations can be resource-intensive.

While vectors are suitable for many use cases, it is essential to consider these limitations when deciding on the appropriate data structure for specific scenarios. In situations where real-time performance and frequent resizing are critical concerns, alternative data structures like linked lists, doubly-ended queues, or trees may be more appropriate choices.

When a different container beats vector#

Vector isn’t a silver bullet. Prefer other containers when a pattern defeats its strengths:

  • Frequent front insert/erase: deque stores blocks and can push/pop at both ends in O(1) without shifting central elements.

  • Stable pointers/references required across insert/erase: list preserves addresses of elements, at the cost of much poorer cache locality and more allocations.

  • Massive, unpredictable middle insertions: A balanced tree (std::map/std::set) or deque may be a better fit depending on iteration and lookup patterns.

  • Ring buffers / producer-consumer queues: Use deque or a dedicated ring buffer structure to avoid shifting.

  • Very large objects with expensive moves: Consider storing by value in vector of small handles (like unique_ptr<T>) to keep the contiguous array small and relocations cheap.

Choosing vector is about aligning the container with the dominant operation. If your code mostly iterates and appends, vector is hard to beat. If your code mostly inserts in the middle or needs stable addresses, acknowledge the trade-offs and pick accordingly.

How ArrayList in Java is different from C++ std::vector#

The Java ArrayList class, part of the Java Collections Framework, offers an implementation similar to C++’s std::vector. It distinguishes itself with a growth factor of 1.5 when resizing its internal array to accommodate more elements. This 1.5 factor balances memory usage and performance, resulting in less memory wastage. However, this choice leads to more frequent reallocation and copying, causing a slight overhead during insertions compared to a factor of 2. Despite the trade-off, ArrayList remains a flexible and efficient dynamic array implementation, favored for managing lists of elements with dynamic sizes.

Java
import java.util.Arrays;

public class CustomArrayList<T> {
    private Object[] A;
    private int size;
    private int capacity;

    public CustomArrayList(int initialCapacity) {
        size = 0;
        capacity = initialCapacity;
        A = new Object[initialCapacity];
    }

    public CustomArrayList() {
        this(1);
    }

    public void push_back(T v) {
        if (capacity == size) {
            capacity = (int) Math.ceil(capacity * 1.5); // grow by a factor of 1.5
            Object[] HA = new Object[capacity];
            for (int i = 0; i < size; i++) {
                HA[i] = A[i];
            }
            A = HA;
        }
        A[size] = v;
        size++;
    }

    @SuppressWarnings("unchecked")
    public T get(int i) {
        if (i >= size)
            throw new IndexOutOfBoundsException("Out of Boundary access");
        return (T) A[i];
    }

    @Override
    public String toString() {
        // Print only the first `size` slots; the rest of the backing array is unused
        return Arrays.toString(Arrays.copyOf(A, size));
    }

    public static void main(String[] args) {
        CustomArrayList<Integer> G = new CustomArrayList<>(10);
        for (int i = 1;
             i <= 8 * (1 << 20); // works with 8 million insertions here
             // i <= 40 * (1 << 20); // this increases allocated memory beyond what the
             // Educative platform allows, but you can test it on your own machine:
             // it will take more time.
             i++) {
            G.push_back(i);
        }
        System.out.println("G is created... ");
    }
}

Applications of vectors#

Vectors find their utility in various scenarios, especially when insertion at the end and random access are the crucial requirements. They excel in situations where dynamic stacks and queues need to be implemented. By using a vector in the background, we ensure that memory allocation can adapt to dynamic scenarios, providing the flexibility to grow or shrink as needed.

Notably, we have focused on the insertion at the end operation and the time complexity analysis in this blog. However, vector in C++ offers several other functions like pop_back() and resize() that can further enhance its versatility. Additionally, there are numerous other applications of vectors beyond what we have covered here. If you are interested in exploring these functions and applications in depth, we recommend checking out our comprehensive Data Structures Course and Grokking Coding Interviews Patterns in C++. There, you can delve into more real-world use cases of vector / ArrayList and their applications.


Written By:
Sarfraz Raza