Log-Structured Merge-Tree
Understand how the Log-Structured Merge Tree (LSM) supports write-heavy database workloads by using components like Memtable, SSTable, WAL, and Bloom filters. Learn how LSM handles data writes, reads, compaction, and deletions to maintain performance and durability in distributed storage systems.
We'll cover the following...
Introduction
The Log Structured Merge Tree (LSM) is a disk-resident data structure to persist key-value pairs optimized for write-heavy query patterns.
LSM includes an append-only storage structure as the primary data structure to handle high write volumes. These storage structures are periodically merged and compacted in the background to eliminate duplicate and deleted records. LSM is a prominent data structure used in multiple NoSQL datastores such as Cassandra and Hbase.
Data structure
LSM includes four main data structures:
Memtable
SSTable
WAL
Bloom filter
Memtable
Memtable is an in-memory data structure that acts as the frontier for all the client requests handling reads and writes. Memtable implements a balanced binary search tree such as AVL tree, guaranteeing that the time complexity of insert, delete, update, and read requests be