Producer Configurations

This lesson covers important configuration parameters that affect the behavior of the Kafka producer.

There are several knobs and levers that affect the behavior and performance of producers. All of them are described in the Apache Kafka documentation and come with sensible defaults. In this lesson, we’ll discuss the most important ones, which can significantly impact memory usage, reliable delivery, and performance.
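As a rough sketch of how such settings are usually supplied, the snippet below builds a configuration dictionary keyed by names from the Apache Kafka producer configuration reference (acks, buffer.memory, linger.ms). The helper producer_config is hypothetical and not part of any Kafka client, and the broker address and numeric values are placeholders chosen for illustration only. Any key left out is simply not sent to the client, which then falls back to its own default.

```python
def producer_config(bootstrap_servers, **overrides):
    """Build a producer config dict; keys left out keep the client's defaults.

    Hypothetical helper for illustration, not part of any Kafka library.
    """
    config = {"bootstrap.servers": bootstrap_servers}
    # Keyword arguments use underscores; Kafka's config names use dots.
    for name, value in overrides.items():
        config[name.replace("_", ".")] = value
    return config


config = producer_config(
    "localhost:9092",        # placeholder broker address
    acks="all",              # affects reliable delivery
    buffer_memory=33554432,  # affects memory usage (bytes), illustrative value
    linger_ms=5,             # affects batching and throughput, illustrative value
)

for key in sorted(config):
    print(f"{key} = {config[key]}")
```

A dictionary like this could then be handed to a client that accepts dotted configuration names, assuming that client recognizes each key. Keeping the overrides in one place makes it easy to see which defaults a pipeline has deliberately changed.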

acks

The acks parameter specifies the number of replicas of a partition that must receive a message before the producer can consider the write successful. We’ll study this concept in greater depth in later chapters. The acks parameter accepts three settings:

acks=0: The producer doesn’t wait to hear back from the Kafka cluster and assumes each message has been sent successfully. Messages can be lost without the producer ever finding out, but this setting achieves the highest throughput.
acks=1: The producer receives a confirmation once the leader replica receives the message. If the leader crashes and a new leader has not yet been elected, an error is returned to the producer, which can retry sending the message. However, the message can still be lost if the leader crashes and a replica that has not received the message is elected as the new leader (for instance, through an unclean leader election). In this setting, throughput depends on whether messages are sent synchronously or asynchronously. In the latter case, throughput is capped by the number of in-flight messages (messages that have been sent but for which the producer has not yet received a reply).
acks=all: This setting

...