What is extendible hashing?
Extendible hashing is a dynamic approach to managing data: instead of committing to a fixed-size hash table, the structure grows and shrinks with the data. Flexibility is the crucial factor here, because the number of hash-value bits used to locate a record changes dynamically as data is inserted.
Algorithm
Directories and buckets are two key terms in this algorithm. Buckets are the holders of hashed data, while directories are the holders of pointers pointing towards these buckets. Each directory has a unique ID.
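As a concrete sketch of this layout (the names below are illustrative, not part of the algorithm): with a global depth of 2, the directory has 2² = 4 entries whose IDs are the bit strings 00, 01, 10, and 11, and several entries may point to the same bucket.

```python
# A minimal sketch of the directory/bucket layout (illustrative names).
bucket_a = []  # a bucket holds hashed data
bucket_b = []

# Directories hold pointers to buckets; two entries can share a bucket
# until that bucket overflows and is split.
directory = {
    "00": bucket_a,
    "01": bucket_b,
    "10": bucket_a,
    "11": bucket_b,
}
```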
The following points explain how the algorithm works:
- Initialize the bucket depths (the number of times each bucket has overflowed) and the global depth (the maximum of the bucket depths) of the directories.
- Convert the data into its binary representation.
- Consider the "global depth" number of least significant bits (LSBs, i.e., the rightmost bits of the binary number) of the data.
- Map the data according to the ID of the matching directory.
- Check for the following conditions if a bucket overflows (if the number of elements in a bucket exceeds the set limit):
- Global depth == bucket depth: Split the bucket into two, increment both the global depth and the bucket depth, and double the directory so that its IDs gain one more bit. Re-hash the elements that were present in the split bucket.
- Global depth > bucket depth: Split the bucket into two and increment the bucket depth only. Re-hash the elements that were present in the split bucket.
- Repeat the steps above for each element.
Working through these steps makes it evident why this method is considered so flexible and dynamic.
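To make the steps concrete, here is a short, runnable Python sketch of the algorithm. The class and method names are our own choices, not a standard API; directory IDs are plain integers whose bits are the global-depth LSBs of a key.

```python
# A runnable sketch of the algorithm above. All names are illustrative.

class Bucket:
    def __init__(self, depth, limit):
        self.depth = depth    # bucket depth (incremented on each split)
        self.limit = limit    # max elements before the bucket overflows
        self.items = []

class ExtendibleHashTable:
    def __init__(self, limit):
        self.global_depth = 1
        self.limit = limit
        # Start with two directory entries (IDs 0 and 1), one bucket each.
        self.directory = [Bucket(1, limit), Bucket(1, limit)]

    def _dir_id(self, key):
        # The "global depth" number of least significant bits of the key.
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._dir_id(key)]
        bucket.items.append(key)
        if len(bucket.items) > self.limit:  # overflow
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Global depth == bucket depth: double the directory and
            # increment the global depth.
            self.directory += self.directory[:]
            self.global_depth += 1
        # Split the bucket into two and increment the bucket depth.
        bucket.depth += 1
        new_bucket = Bucket(bucket.depth, self.limit)
        # Entries whose newly significant bit is 1 point at the new bucket.
        for i, b in enumerate(self.directory):
            if b is bucket and (i >> (bucket.depth - 1)) & 1:
                self.directory[i] = new_bucket
        # Re-hash the elements that were present in the split bucket.
        items, bucket.items = bucket.items, []
        for key in items:
            self.directory[self._dir_id(key)].items.append(key)
        # If everything landed on one side, split again.
        for b in (bucket, new_bucket):
            if len(b.items) > self.limit:
                self._split(b)
```

The split logic mirrors the two conditions above: when the bucket depth has caught up with the global depth, the directory doubles before the bucket splits; otherwise only the bucket splits and the affected directory entries are re-pointed.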
Example
Let's take the following example to see how this hashing method works, where:
- Data = {28,4,19,1,22,16,12,0,5,7}
- Bucket limit = 3
Convert the data into binary representation:
- 28 = 11100
- 4 = 00100
- 19 = 10011
- 1 = 00001
- 22 = 10110
- 16 = 10000
- 12 = 01100
- 0 = 00000
- 5 = 00101
- 7 = 00111
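These conversions are easy to verify in code, for example (a quick sketch; the 5-bit width is chosen just to match the list above, and a global depth of 2 is assumed for the directory IDs):

```python
data = [28, 4, 19, 1, 22, 16, 12, 0, 5, 7]

for key in data:
    bits = format(key, "05b")  # 5-bit binary representation
    lsbs = bits[-2:]           # the 2 LSBs, assuming a global depth of 2
    print(f"{key:2d} = {bits}  ->  directory ID {lsbs}")
```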
From here, the remaining steps are mechanical: each element is mapped to a directory by its global-depth LSBs, and any bucket that grows past the limit of 3 is split according to the two overflow conditions above.
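Reusing the hypothetical ExtendibleHashTable sketch from the Algorithm section, the whole example can be replayed like this:

```python
table = ExtendibleHashTable(limit=3)  # bucket limit = 3
for key in [28, 4, 19, 1, 22, 16, 12, 0, 5, 7]:
    table.insert(key)

print("global depth:", table.global_depth)

# Print each distinct bucket once, with the directory ID pointing to it.
printed = set()
for i, bucket in enumerate(table.directory):
    if id(bucket) not in printed:
        printed.add(id(bucket))
        print(format(i, f"0{table.global_depth}b"), bucket.items)
```

Running this, every bucket ends up within the limit of three elements, with splits (and directory doublings) triggered only where overflows actually occurred.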
Analysis
| Advantages | Disadvantages |
| --- | --- |
| Less costly data retrieval. | Memory wastage due to certain buckets containing more data than the others. |
| The dynamic approach avoids data loss. | Complicated coding. |