fetch.min.bytes

This property specifies the minimum amount of data the broker should send back to a consumer. The broker waits for messages to accumulate until their aggregate size exceeds fetch.min.bytes before sending them to the consumer. Setting a higher value reduces the back-and-forth between the consumer and the broker. The value can be set high when there are a large number of consumers or when the consumers run CPU-intensive processing on the received data.

fetch.max.wait.ms

The configuration fetch.max.wait.ms specifies how long the broker should wait when fewer than fetch.min.bytes of data are available. The broker responds as soon as either condition is met: enough data accumulates, or the wait time elapses.
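As a sketch, these two settings are usually tuned together when building the consumer's configuration. The property keys are the real Kafka configuration names; the class name and the values are illustrative, not defaults:

```java
import java.util.Properties;

public class FetchTuning {
    public static Properties consumerProps() {
        Properties props = new Properties();
        // Ask the broker to accumulate at least 1 KB before responding...
        props.put("fetch.min.bytes", "1024");
        // ...but never make the consumer wait longer than 500 ms for it.
        props.put("fetch.max.wait.ms", "500");
        return props;
    }
}
```

With these values, a fetch returns after at most 500 ms even if fewer than 1024 bytes have piled up.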

max.partition.fetch.bytes

The configuration max.partition.fetch.bytes specifies the maximum number of bytes the broker returns per partition. This value has to be higher than the largest message the broker will accept; otherwise, the consumer can hang trying to read a message that is greater than max.partition.fetch.bytes but within acceptable limits for the broker. Setting too high a value may return more records than the consumer can process in a timely manner. This can be a problem because the poll() call is also responsible for sending out heartbeats, so a delay in the next invocation of poll() may result in a session timeout followed by a rebalance (covered later). The situation can be mitigated by either choosing a lower value for this configuration or setting a high enough session timeout.

By default, max.partition.fetch.bytes is set to 1MB. If we have 20 partitions and 4 consumers, each consumer will read from 5 partitions. This implies that each consumer must have at least 5MB of memory available for ConsumerRecords. In practice, each consumer needs more memory than that, because other consumers in the group can fail; a failure triggers a rebalance, and the remaining consumers divide the failed consumer's partitions amongst themselves.
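The sizing arithmetic above can be sketched directly. The numbers (20 partitions, 4 consumers, the 1 MB default) come from the example in the text; the class and method names are ours:

```java
public class FetchMemoryEstimate {
    // Default value of max.partition.fetch.bytes: 1 MB.
    static final long MAX_PARTITION_FETCH_BYTES = 1024 * 1024;

    // Minimum memory a consumer should reserve for ConsumerRecords,
    // using the worst case: some consumers own the ceiling of the division.
    static long minBytesPerConsumer(int totalPartitions, int consumers) {
        int partitionsPerConsumer = (totalPartitions + consumers - 1) / consumers;
        return partitionsPerConsumer * MAX_PARTITION_FETCH_BYTES;
    }

    public static void main(String[] args) {
        // 4 live consumers: 5 partitions each -> 5 MB minimum.
        System.out.println(minBytesPerConsumer(20, 4) / (1024 * 1024));
        // One consumer fails: 3 consumers split 20 partitions, so the
        // busiest consumers now own 7 partitions -> 7 MB minimum.
        System.out.println(minBytesPerConsumer(20, 3) / (1024 * 1024));
    }
}
```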

session.timeout.ms

This configuration setting allows the consumer to be out of contact with the broker for session.timeout.ms milliseconds and still be considered alive. If session.timeout.ms milliseconds pass without the broker receiving a heartbeat from the consumer, a rebalance is triggered and the partitions assigned to the out-of-contact consumer are redistributed among the consumers still considered alive. By default this configuration is set to 3 seconds. The setting is closely related to the configuration heartbeat.interval.ms, which determines how frequently the KafkaConsumer's poll() method sends a heartbeat to the broker. Naturally, heartbeat.interval.ms must be lower than session.timeout.ms, and is usually set to one third of its value. Choosing a lower value for session.timeout.ms helps detect a failed consumer faster, but may also result in unnecessary rebalances if a consumer takes longer to complete the poll loop or pauses for garbage collection.
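A small sketch of the one-third convention, deriving the heartbeat interval from the session timeout rather than setting the two independently (class and method names are ours; the keys are the real configuration names):

```java
import java.util.Properties;

public class SessionTuning {
    // Build the liveness-related settings together, heartbeating at one
    // third of the session timeout as is conventional.
    public static Properties liveness(int sessionTimeoutMs) {
        Properties props = new Properties();
        props.put("session.timeout.ms", String.valueOf(sessionTimeoutMs));
        props.put("heartbeat.interval.ms", String.valueOf(sessionTimeoutMs / 3));
        return props;
    }
}
```

For the 3-second default discussed above, this yields a heartbeat every 1000 ms.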

auto.offset.reset

When consumers read from a partition, they also commit an offset to remember the position they last read in the partition. We’ll cover commits and offsets in depth later. When a consumer can’t find a valid offset (which can happen if the consumer is down for long enough that the record pointed to by the last committed offset ages out), it can choose to start reading either from the beginning of the partition or from the newest record. The configuration auto.offset.reset controls this behavior: set it to earliest to read from the beginning, or latest (the default) to read only new records when a valid offset can’t be determined for a partition.
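A minimal sketch that guards against typos in this setting. The three accepted values are real Kafka values ("none" makes the consumer raise an error instead of resetting); the class and method names are ours:

```java
import java.util.Properties;
import java.util.Set;

public class OffsetResetConfig {
    // Kafka accepts exactly these values; "latest" is the default.
    private static final Set<String> VALID = Set.of("earliest", "latest", "none");

    public static Properties withResetPolicy(String policy) {
        if (!VALID.contains(policy)) {
            throw new IllegalArgumentException("invalid auto.offset.reset: " + policy);
        }
        Properties props = new Properties();
        props.put("auto.offset.reset", policy);
        return props;
    }
}
```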

enable.auto.commit

By default, the offsets are committed automatically, but this behavior can be changed by setting enable.auto.commit to false. You may want to control when offsets are committed and how frequently (using the related configuration auto.commit.interval.ms) to minimize duplicates and avoid missing data.
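The two commit modes might be expressed as follows (class and method names are ours; the keys are the real configuration names, and the 5000 ms interval is an illustrative value):

```java
import java.util.Properties;

public class CommitConfig {
    // Let the consumer commit automatically at a fixed interval; the
    // interval only matters when auto-commit is enabled.
    public static Properties autoCommit(long intervalMs) {
        Properties props = new Properties();
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", String.valueOf(intervalMs));
        return props;
    }

    // Disable auto-commit so the application decides when to commit,
    // e.g. by calling the consumer's commitSync() or commitAsync().
    public static Properties manualCommit() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");
        return props;
    }
}
```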

max.poll.records

This configuration specifies the maximum number of records returned by the poll() call to the Kafka consumer. Setting a limit on the number of records helps control the amount of data the consumer will process in a poll cycle.

client.id

Similar to the producer configuration, the setting client.id acts as an identifier for a consumer. This can be any string that identifies the consumer and can be used for metrics, logging, or quotas.
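The two settings above might appear together in a consumer's configuration like this (the client id string and the record cap are illustrative values, not defaults; the class name is ours):

```java
import java.util.Properties;

public class PollAndIdConfig {
    public static Properties props() {
        Properties props = new Properties();
        // Cap how many records a single poll() call may hand back.
        props.put("max.poll.records", "500");
        // Logical name surfaced in broker-side metrics, logs, and quotas.
        props.put("client.id", "orders-service-1");
        return props;
    }
}
```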

receive.buffer.bytes and send.buffer.bytes

These configurations control the receive and send buffer sizes of the TCP sockets used when reading and writing data. If set to -1, the OS defaults are used. These configurations can be set to higher values, especially when consumers communicate with brokers in a different data center, since the network links connecting them typically have higher latency and lower bandwidth.
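A sketch of a cross-datacenter profile; the keys are the real configuration names, while the specific buffer sizes are illustrative assumptions, not recommendations:

```java
import java.util.Properties;

public class SocketBufferConfig {
    public static Properties crossDatacenter() {
        Properties props = new Properties();
        // Larger TCP buffers help on high-latency, lower-bandwidth links;
        // a value of -1 would defer to the operating system defaults.
        props.put("receive.buffer.bytes", String.valueOf(1024 * 1024));
        props.put("send.buffer.bytes", String.valueOf(128 * 1024));
        return props;
    }
}
```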

partition.assignment.strategy

Topic partitions are assigned to consumers. The assignment of partitions to consumers is controlled by a PartitionAssignor. If left to Kafka, there are two built-in assignment strategies:

  • Range: This scheme assigns a consecutive subset of each topic's partitions to the subscribing consumers. Consider two consumers, C1 and C2, and two topics, T1 and T2, where T1 has 3 partitions and T2 has 2 partitions. The algorithm assigns partitions 0 and 1 of T1 to the first consumer, C1, and partition 2 of T1 to the second consumer, C2. For T2, partition 0 is assigned to C1 and partition 1 to C2. Note that the assignment for each topic is done independently of other topics, so if topic T2 also had three partitions, consumer C1 would still receive the first two partitions of each topic in its share.

  • Round Robin: The partitions from all the topics are assigned to consumers sequentially, one by one. Consider the same two consumers and two topics, where T1 has 3 partitions and T2 has 2 partitions. The algorithm assigns partition 0 of T1 to the first consumer, C1, then partition 1 of T1 to C2, partition 2 of T1 to C1, partition 0 of T2 to C2, and finally partition 1 of T2 to C1.

We can pick one of the built-in strategies by configuring the partition.assignment.strategy setting to either org.apache.kafka.clients.consumer.RangeAssignor or org.apache.kafka.clients.consumer.RoundRobinAssignor. To use a custom strategy, set this configuration to the fully qualified name of your class.
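The two strategies described above can be simulated in a few lines. This is a simplified sketch of the assignment logic, not Kafka's actual assignor code, and it reproduces the C1/C2, T1/T2 example from the bullets:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AssignmentDemo {
    // Range: for each topic independently, hand out consecutive partitions,
    // giving earlier consumers the extra partition when the count is uneven.
    static Map<String, List<String>> range(List<String> consumers, Map<String, Integer> topics) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        for (Map.Entry<String, Integer> t : topics.entrySet()) {
            int n = t.getValue(), c = consumers.size();
            int per = n / c, extra = n % c, p = 0;
            for (int i = 0; i < c; i++) {
                int take = per + (i < extra ? 1 : 0);
                for (int k = 0; k < take; k++, p++) {
                    out.get(consumers.get(i)).add(t.getKey() + "-" + p);
                }
            }
        }
        return out;
    }

    // Round robin: lay out every partition of every topic in order and
    // deal them to consumers one at a time.
    static Map<String, List<String>> roundRobin(List<String> consumers, Map<String, Integer> topics) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        int i = 0;
        for (Map.Entry<String, Integer> t : topics.entrySet()) {
            for (int p = 0; p < t.getValue(); p++) {
                out.get(consumers.get(i++ % consumers.size())).add(t.getKey() + "-" + p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new LinkedHashMap<>();
        topics.put("T1", 3);
        topics.put("T2", 2);
        List<String> consumers = List.of("C1", "C2");
        System.out.println(range(consumers, topics));      // C1 gets T1-0, T1-1, T2-0
        System.out.println(roundRobin(consumers, topics)); // C1 gets T1-0, T1-2, T2-1
    }
}
```

Running it shows the difference: range keeps each topic's assignment independent, while round robin balances across topics, which matches the example above.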
