Apache Flink is an open-source stream-processing framework developed by Carbone et al. (P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi, and K. Tzoumas, “Apache Flink™: Stream and Batch Processing in a Single Engine,” IEEE Data Engineering Bulletin, 2015) at the Apache Software Foundation to provide a high-throughput, low-latency data processing engine.

Note: Low latency is the main differentiator between Flink and Spark. Flink processes each incoming record as it arrives, which provides sub-second latency that can go down to single-digit milliseconds. Spark also provides a streaming engine, called Spark Streaming. However, it relies on a form of micro-batch processing: the input data stream is split into batches, and each batch is then processed to generate the final results. The trade-off is added latency, because a record cannot produce output until its batch has been collected and processed.
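
To make the latency trade-off concrete, here is a minimal sketch in plain Python. It is a toy model, not Flink or Spark code: the function names `per_record_latencies` and `micro_batch_latencies` are hypothetical, all times are integer milliseconds, and the model assumes a fixed processing cost per record while ignoring queuing, scheduling, and network delays.

```python
from typing import List


def per_record_latencies(arrivals_ms: List[int], processing_ms: int) -> List[int]:
    """Record-at-a-time model: each record is processed as soon as it arrives.

    Assumption: no queuing, so latency equals the processing cost alone.
    """
    return [processing_ms for _ in arrivals_ms]


def micro_batch_latencies(
    arrivals_ms: List[int], batch_interval_ms: int, processing_ms: int
) -> List[int]:
    """Micro-batch model: a record waits until its batch window closes.

    Assumption: windows are [k * interval, (k + 1) * interval) and the batch
    is processed right when the window ends.
    """
    latencies = []
    for t in arrivals_ms:
        # End of the batch window that contains arrival time t.
        window_end = (t // batch_interval_ms + 1) * batch_interval_ms
        # Time spent waiting for the window to close, plus processing cost.
        latencies.append((window_end - t) + processing_ms)
    return latencies
```

With arrivals at 0, 250, and 999 ms, a 1000 ms batch interval, and a 5 ms processing cost, the record-at-a-time model gives 5 ms for every record, while the micro-batch model gives 1005, 755, and 6 ms. In this simplified model, the wait for the batch window dominates the latency, and shrinking the batch interval reduces it.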
