Working with ArrayList, HashSet, and HashMap
Explore how to use the Java Collections Framework's core classes, ArrayList, HashSet, and HashMap, to handle dynamic data storage, enforce uniqueness, and manage key-value associations. Understand when to choose each collection type and how their operations affect performance and code clarity.
Arrays are the foundational building blocks of data storage in Java, but they are rigid. You must define their size upfront, and that size cannot change. In the real world, data is dynamic: users sign up, shopping carts grow, and log files expand. We rarely know exactly how many items we will need to store when we write the code.
To manage varying data storage and retrieval needs, we use the Java Collections Framework. The framework offers many specialized tools, but for most use cases, you’ll rely on three core implementations: ArrayList for dynamic arrays, HashSet for unique collections, and HashMap for key-value pairs. Choosing the right implementation affects both the complexity of your code and its runtime performance.
The resizable array (ArrayList)
The ArrayList is the most widely used collection in Java. You can think of it as a smart array that automatically resizes itself as you add or remove elements. Like a standard array, it maintains a specific order (insertion order) and allows you to access elements instantly if you know their position (index).
To use an ArrayList, we must specify the type of data it will hold using generics (the <Type> syntax). This enforces type safety: the compiler rejects any attempt to store an Integer in a list declared for String values.
In the example below, we manage a dynamic list of todo items. Notice that we can add duplicates, and the list preserves the order in which we added them. Beyond just adding items, ArrayList provides methods to modify items at specific positions (set), check the list size (size), and clear all data (clear).
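A minimal sketch of such a todo list might look like the following. The class name TodoList, the variable todos, and the item strings are assumptions chosen for illustration, and the listing is arranged so that the line references in the notes that follow correspond to it.

```java
import java.util.ArrayList;

class TodoList {
    public static void main(String[] args) {
        // A list that can only hold String values
        ArrayList<String> todos = new ArrayList<>();

        // Append items; the duplicate is kept and order is preserved
        todos.add("Buy groceries");
        todos.add("Walk the dog");
        todos.add("Buy groceries");

        // Read the first item by its index
        System.out.println("First: " + todos.get(0));

        // Replace the item at index 1 ("Walk the dog")
        todos.set(1, "Feed the cat");

        // Number of stored elements, not the backing array's capacity
        System.out.println("Size: " + todos.size());
        System.out.println(todos);
    }
}
```

Running it should print the first item, the size 3, and the list in insertion order with the replaced element at index 1.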
Line 6: We declare an ArrayList that can only hold String objects. The <> operator on the right side (the diamond operator) infers the type from the left.
Lines 9–11: We use add() to append items. The ArrayList expands automatically to accommodate them.
Line 14: We retrieve an item using get(int index). This is an extremely fast operation (constant time, or O(1)).
Line 17: We replace the element at index 1 using set(). This overwrites the existing value (“Walk the dog”).
Line 20: size() returns the current number of elements (3), not the capacity of the backing array.
Performance note: ArrayList is optimized for reading data by index. However, if you need to find a specific element without knowing its index (e.g., “Does this list contain ‘Feed the cat’?” using contains()), the list must scan elements sequentially from the beginning. Search time therefore grows linearly with the size of the list (O(n)).
Ensuring uniqueness with HashSet
Sometimes, the order of data matters less than its uniqueness. If you are tracking a list of active user IDs or unique words in a document, you do not want duplicates. An ArrayList would require you to manually check for existence before every addition. The HashSet handles this automatically.
A HashSet represents a mathematical set. It enforces uniqueness and, unlike a list, it does not guarantee any specific order. If you print a HashSet, the elements may appear in a completely different order than you inserted them.
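As a rough illustration, the sketch below tracks user IDs in a HashSet. The class name ActiveUsers, the variable userIds, and the ID 102 are assumptions, and the listing is arranged so that the line references in the notes that follow correspond to it.

```java
import java.util.HashSet;

class ActiveUsers {
    public static void main(String[] args) {
        HashSet<Integer> userIds = new HashSet<>();

        userIds.add(101);
        userIds.add(102);
        // 101 is already present, so add() returns false and the set is unchanged
        boolean added = userIds.add(101);
        System.out.println("Added again? " + added);

        System.out.println("Contains 102? " + userIds.contains(102));
        System.out.println("Size: " + userIds.size());
    }
}
```

The size stays at 2 after the third add() call, because the duplicate is rejected rather than stored.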
Line 5: We create a HashSet restricted to Integer objects.
Line 10: We attempt to add 101 again. The add() method returns a boolean: true if the item was new and added, false if it was already present. Here, it returns false.
Line 13: We use contains() to check if a value exists.
Performance note: This is where HashSet shines. Thanks to a mechanism called hashing, contains() runs in roughly constant time on average, whether the set has ten items or ten million. If you only need to check for existence and do not care about order, HashSet is significantly faster than ArrayList for large collections.
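One common way to apply this, sketched here under assumptions, is to copy list data into a HashSet once when many membership checks will follow. The class MembershipDemo and the method countKnown are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

class MembershipDemo {
    // Counts how many queries appear in words, using a HashSet for the lookups.
    static int countKnown(List<String> words, List<String> queries) {
        // One pass to build the set; each contains() afterwards is a hash lookup
        HashSet<String> known = new HashSet<>(words);
        int hits = 0;
        for (String q : queries) {
            if (known.contains(q)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(List.of("apple", "banana", "cherry", "apple"));
        List<String> queries = List.of("banana", "durian", "apple");
        System.out.println(countKnown(words, queries)); // prints 2
    }
}
```

Calling words.contains(q) inside the loop would return the same answers, but each call would scan the list from the start. Note that List.of requires Java 9 or later.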
Key-value associations with HashMap
We often need to associate one piece of data with another, such as looking up a phone number by a name or a product price by its SKU. HashMap allows us to store data as key-value pairs.
In a HashMap:
Keys must be unique (like a HashSet).
Values can be duplicated.
We retrieve values using the key, not an index.
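These rules can be sketched with a small score table. The class name Scores, the names “Bob” and “Dave”, and Bob's score are assumptions, and the listing is arranged so that the line references in the notes that follow correspond to it.

```java
import java.util.HashMap;

class Scores {
    public static void main(String[] args) {
        // Keys are names (String); values are scores (Integer)
        HashMap<String, Integer> scores = new HashMap<>();

        // Store pairs; two keys may share the same value
        scores.put("Alice", 50);
        scores.put("Bob", 75);
        scores.put("Charlie", 50);

        // Reusing an existing key replaces its value
        scores.put("Alice", 60);

        // Look up by key; a missing key yields null
        System.out.println("Alice: " + scores.get("Alice"));
        System.out.println("Dave: " + scores.get("Dave"));

        // Check for a key without fetching its value
        System.out.println("Has Bob? " + scores.containsKey("Bob"));
    }
}
```

Because get() returns null for a missing key, containsKey() is the safer check when null could be confused with a real value.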
Line 6: We declare a HashMap where the key is a String and the value is an Integer.
Lines 9–11: We use put() to store pairs. Note that “Alice” and “Charlie” have the same score (value), which is permitted.
Line 14: We call put() again with the existing key “Alice”. This updates her score from 50 to 60.
Line 17: We retrieve the value associated with “Alice” using get(). This is an efficient, constant-time lookup.
Line 18: If we request a key that does not exist, get() returns null.
Line 21: containsKey() efficiently checks if a key exists without retrieving the value.
Removing individual elements
Each collection handles removal differently. ArrayList is unique because it allows removal by either index or object, while HashSet and HashMap rely on the object or key.
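ArrayList: It has two remove methods. remove(int index) removes the item at that specific position, and remove(Object o) removes the first occurrence of that specific object. For an ArrayList<Integer>, a call such as remove(1) selects the index version; pass Integer.valueOf(1) to remove the value 1 instead.
HashSet: remove(Object o) finds and removes the item.
HashMap: remove(Object key) removes the key and its value.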
In this example, we will remove items from each collection and verify the removal by printing the updated collection or checking for existence.
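A minimal sketch of these removals follows. The variable names languages, nums, and config and the removed values come from the notes that follow; the class name RemovalDemo and the remaining entries (“Go”, 42, “Mode”, and the config values) are assumptions. The listing is arranged so that the line references in those notes correspond to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;

class RemovalDemo {
    public static void main(String[] args) {
        ArrayList<String> languages = new ArrayList<>();
        languages.add("Java");
        languages.add("Python");
        languages.add("Go");

        HashSet<Integer> nums = new HashSet<>();
        nums.add(42);
        nums.add(99);

        HashMap<String, String> config = new HashMap<>();
        config.put("Version", "1.0");
        config.put("Mode", "dark");

        // ArrayList: remove by index, then by object
        // (after the first call the list is [Java, Go])
        languages.remove(1);
        languages.remove("Java");
        // HashSet: remove by value
        nums.remove(99);
        // HashMap: remove by key
        config.remove("Version");

        // Verify the removals
        System.out.println("languages = " + languages);
        System.out.println("nums has 99? " + nums.contains(99));
        System.out.println("config has Version? " + config.containsKey("Version"));
    }
}
```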
Line 22: languages.remove(1) removes the element at index 1 (“Python”).
Line 23: languages.remove("Java") searches for the string “Java” and removes the first match.
Line 25: nums.remove(99) finds the value 99 in the set and removes it.
Line 27: config.remove("Version") removes the entry where the key is “Version”.
Lines 30–32: We print the collections and use contains and containsKey to confirm the items are gone.
Bulk operations
Sometimes you need to manipulate the entire collection at once rather than one item at a time. The Collections Framework provides standard methods for this:
addAll(Collection c): Adds all elements from another collection.
clear(): Removes all elements, resetting the size to 0.
isEmpty(): Returns true if the collection has no elements.
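The sketch below exercises these three methods on a HashSet. The class name BulkDemo and the set's initial contents (1 and 2) are assumptions, List.of requires Java 9 or later, and the listing is arranged so that the line references in the notes that follow correspond to it.

```java
import java.util.HashSet;
import java.util.List;

class BulkDemo {
    public static void main(String[] args) {
        HashSet<Integer> nums = new HashSet<>();
        nums.add(1);
        nums.add(2);

        List<Integer> newNums = List.of(2, 3, 4);
        nums.addAll(newNums);
        // 2 was already present, so the set now holds 1, 2, 3, 4
        System.out.println("Size after addAll: " + nums.size());

        // Remove everything in one call
        nums.clear();

        // The set is empty again
        System.out.println("Empty? " + nums.isEmpty());
    }
}
```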
Line 10: We create a temporary list of integers using List.of.
Line 11: nums.addAll(newNums) adds 2, 3, and 4 to the HashSet in a single operation. Duplicate values are ignored.
Line 16: clear() removes every element from the set.
Line 19: isEmpty() confirms the set size is now 0.
Choosing the right collection
Selecting the correct collection is a trade-off between how you need to store the data (ordered vs. unique) and how you need to access it (by index vs. by search).
Feature | ArrayList | HashSet | HashMap |
Primary Use | Ordered list | Unique groups | Key-value lookups |
Access Style | By index (0, 1, ...) | By object | By key |
Duplicates? | Allowed | Not allowed | Allowed for values only (keys are unique) |
Ordered? | Yes (insertion order) | No guarantee | No guarantee |
Performance | Fast access by index | Fast existence check | Fast lookup by key |
Use an ArrayList when position matters or when you just need a simple container. Use HashSet when you must prevent duplicates. Use HashMap when you need to look up information based on a unique identifier.
ArrayList, HashSet, and HashMap are the most commonly used general-purpose collection implementations in Java. The Collections Framework includes other implementations, such as LinkedList and TreeMap, but these three are typically the default choice because they offer predictable performance and straightforward APIs for most scenarios. Understanding the time complexity and constraints of their add, get, and remove operations is essential for choosing the right collection and writing efficient Java code.