Collisions in Hash Tables
Explore why collisions in hash tables happen, the challenges they present, and the critical importance of handling them. Understand how collisions cause data loss and degrade performance, and why no hash function can fully prevent them in general applications.
We'll cover the following...
A hash table computes an index from a key and stores the value at that position in the backing slice. This approach fails when two different keys produce the same index. This situation is called a collision. This lesson explains its causes, its inevitability, and the importance of handling it correctly.
What is a collision?
A collision occurs when two different keys produce the same index after the hash function and modulo operation are applied.
For example, suppose your hash table has size 7. The key "listen" produces a hash value that maps to index 4. The key "silent" also produces a hash value that maps to index 4. Both keys are different, but they compete for the same slot. This is a collision.
Notice that "listen" and "silent" are anagrams of each other. If your hash function simply sums the character codes, both strings will always produce the same sum and therefore the same index. This is an example of a weakness in a naive hash function, but even sophisticated hash functions cannot eliminate collisions entirely.
Why collisions are unavoidable
Collisions ...