Transactions, Storage Layout, and Other Guarantees
Let’s look at how transactions work in Kafka, how Kafka physically stores its data, and what guarantees it provides.
Transactional client
Kafka provides a transactional client that allows a producer to write messages to multiple partitions (of one or more topics) atomically, so that either all of the messages are committed or none are.
A transactional client also makes it possible to commit consumer offsets for a source topic and produce messages to a destination topic atomically. This makes it possible to provide exactly-once guarantees for an end-to-end pipeline. It is achieved through a two-phase commit protocol in which the brokers of the cluster play the role of the transaction coordinator. The coordinator is highly available because it relies on the same underlying mechanisms for partitioning, leader election, and fault-tolerant replication.
The coordinator stores the status of a transaction in a separate log. The messages contained in a transaction are stored in their own partitions as usual.
When a transaction is committed, the coordinator is responsible for writing a commit marker to the partitions that contain messages of the transaction and to the partitions that store the consumer offsets.
Consumers can also specify the isolation level they want to read under: read_committed or read_uncommitted. With read_committed, messages that are part of a transaction become readable from a partition only after a commit marker has been written for the associated transaction. With read_uncommitted, they are readable as soon as they have been appended. This interaction is summarised in the following illustration:
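The same interaction can also be sketched as a toy in-memory model. The sketch below is not the Kafka client API; the class names (Record, Partition, Coordinator), the status strings in the transaction log, and the method names are all assumptions made for illustration. It only shows the idea described above: the coordinator keeps transaction status in its own log, data messages go to their partitions as usual, and a marker written at the end of the transaction decides what a read_committed reader can see. It deliberately ignores details of a real broker, such as replication and offset ordering across open transactions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    value: Optional[str]           # payload; None for a control marker
    txn_id: Optional[str] = None   # None for a non-transactional message
    marker: Optional[str] = None   # "COMMIT" or "ABORT" for a control marker


class Partition:
    """Append-only log of one partition (toy model, kept in memory)."""

    def __init__(self):
        self.log = []

    def read(self, isolation_level="read_uncommitted"):
        data = [r for r in self.log if r.marker is None]
        if isolation_level == "read_uncommitted":
            return [r.value for r in data]
        # read_committed: transactional messages need a commit marker.
        committed = {r.txn_id for r in self.log if r.marker == "COMMIT"}
        return [r.value for r in data
                if r.txn_id is None or r.txn_id in committed]


class Coordinator:
    """Hypothetical stand-in for the broker-side transaction coordinator."""

    def __init__(self):
        self.txn_log = []    # separate log holding transaction status
        self.touched = {}    # txn_id -> partitions written in that transaction

    def begin(self, txn_id):
        self.txn_log.append((txn_id, "ONGOING"))
        self.touched[txn_id] = []

    def send(self, txn_id, partition, value):
        if partition not in self.touched[txn_id]:
            self.touched[txn_id].append(partition)
        partition.log.append(Record(value, txn_id))

    def end(self, txn_id, marker):
        # Phase 1: record the decision in the transaction log.
        self.txn_log.append((txn_id, "PREPARE_" + marker))
        # Phase 2: write the marker to every partition the transaction touched.
        for partition in self.touched.pop(txn_id):
            partition.log.append(Record(None, txn_id, marker))
        self.txn_log.append((txn_id, "COMPLETE_" + marker))


destination, offsets = Partition(), Partition()
coordinator = Coordinator()
coordinator.begin("t1")
coordinator.send("t1", destination, "transformed-message")
coordinator.send("t1", offsets, "source-offset=42")
print(destination.read("read_committed"))    # nothing visible yet
coordinator.end("t1", "COMMIT")
print(destination.read("read_committed"))    # visible after the marker
```

Writing the output message and the consumer offset inside the same transaction is what makes the consume-and-produce step atomic in this model: a read_committed reader sees both or neither.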
Physical storage of Kafka
The physical storage layout of Kafka is simple, as shown in the following illustration. Every log partition is implemented as a set of segment files of approximately the same size (e.g., 1 GB).
Every time a producer publishes a message to a partition, the broker appends the message to the last segment file. For better performance, segment files are flushed to disk only after a configurable number of messages have been published or a configurable amount of time has elapsed.
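The append-and-flush behaviour described above can be sketched with a small in-memory model. This is an illustrative sketch, not Kafka's implementation: the class SegmentedLog and its parameters segment_bytes, flush_messages, and flush_seconds are hypothetical names (they are loosely analogous to broker settings such as log.segment.bytes, log.flush.interval.messages, and log.flush.interval.ms), byte arrays stand in for segment files, and the time threshold is only checked when a message arrives.

```python
import time


class SegmentedLog:
    """Toy model of one partition stored as a list of segments."""

    def __init__(self, segment_bytes=1024, flush_messages=3,
                 flush_seconds=1.0, clock=time.monotonic):
        self.segment_bytes = segment_bytes    # target size of a segment
        self.flush_messages = flush_messages  # flush after this many messages
        self.flush_seconds = flush_seconds    # ...or after this much time
        self.clock = clock
        self.segments = [bytearray()]         # stand-ins for segment files
        self.next_offset = 0
        self.unflushed = 0
        self.flush_count = 0
        self.last_flush = clock()

    def append(self, message):
        active = self.segments[-1]
        # Roll to a new segment when the active one would grow past its target.
        if active and len(active) + len(message) > self.segment_bytes:
            self.flush()
            active = bytearray()
            self.segments.append(active)
        active.extend(message)                # always append to the last segment
        offset = self.next_offset
        self.next_offset += 1
        self.unflushed += 1
        too_many = self.unflushed >= self.flush_messages
        too_old = self.clock() - self.last_flush >= self.flush_seconds
        if too_many or too_old:
            self.flush()
        return offset

    def flush(self):
        # A real broker would force the active segment file to disk here.
        if self.unflushed:
            self.flush_count += 1
        self.unflushed = 0
        self.last_flush = self.clock()


log = SegmentedLog(segment_bytes=16, flush_messages=2)
for payload in (b"first", b"second", b"third", b"fourth"):
    log.append(payload)
print(len(log.segments), log.flush_count)
```

Batching flushes this way trades durability of the most recent messages on a single machine for throughput, since a message that has been appended but not yet flushed exists only in memory in this model.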
Note: ...