Data Storage
Explore Kafka's data storage mechanisms, including segment-based file management and configurable retention settings by time or size. Understand Kafka's message format, compression benefits, indexing for quick offset retrieval, and how compaction retains only the latest message per key. Discover tombstone messages used for permanent key deletions and the behavior of active segments within Kafka partitions.
Data retention
Kafka doesn’t hold data in perpetuity. The admin can configure Kafka to delete the messages for a topic in two ways:
- Specify a retention time after which messages are deleted.
- Specify the data size to be reached before messages are deleted.
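As a rough illustration of the two options above, the sketch below models a retention pass in Python. It is a simplified model under stated assumptions, not Kafka's actual implementation: the `Segment` class and the `segments_to_delete` function are hypothetical names invented for this example, and they stand in for the on-disk files (segments) that the lesson introduces next. The parameter names mirror Kafka's topic-level settings `retention.ms` and `retention.bytes`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for one file of stored messages in a partition."""
    name: str
    size_bytes: int
    newest_timestamp_ms: int  # timestamp of the newest message in the file


def segments_to_delete(
    segments: List[Segment],
    now_ms: int,
    retention_ms: Optional[int] = None,
    retention_bytes: Optional[int] = None,
) -> List[str]:
    """Return the names of segments a retention pass would remove.

    `segments` must be ordered oldest to newest. Consumer progress is
    deliberately not an input: deletion does not wait for readers.
    """
    remaining = list(segments)
    doomed: List[str] = []

    # Time-based retention: drop the oldest segments whose newest
    # message is older than the allowed age.
    if retention_ms is not None:
        while remaining and now_ms - remaining[0].newest_timestamp_ms > retention_ms:
            doomed.append(remaining.pop(0).name)

    # Size-based retention: drop the oldest segments while the data
    # left behind would still be at least `retention_bytes`.
    if retention_bytes is not None:
        total = sum(s.size_bytes for s in remaining)
        while remaining and total - remaining[0].size_bytes >= retention_bytes:
            total -= remaining[0].size_bytes
            doomed.append(remaining.pop(0).name)

    return doomed


log = [
    Segment("a", 100, 0),
    Segment("b", 100, 1000),
    Segment("c", 100, 2000),
]
print(segments_to_delete(log, now_ms=2500, retention_ms=2000))
print(segments_to_delete(log, now_ms=2500, retention_bytes=150))
```

In this model both calls select only segment `a`: it is the only one older than 2,000 ms, and removing it brings the partition from 300 bytes down to 200, after which removing another segment would leave less than the 150-byte limit.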
In either scenario, Kafka does not wait for consumers to read messages; it deletes them as soon as the deletion criterion is met, whether or not they have been consumed. Data for a partition isn’t stored as one contiguous file. Rather, it is split into multiple files called segments. Each segment can be at most 1 GB in size or contain a ...