Introduction to Streaming Model Workflows
Explore how streaming platforms support real-time data pipelines by connecting components and applying machine learning models in production. Gain hands-on experience with Apache Kafka, PySpark streaming, and cloud-managed services such as GCP Pub/Sub and Cloud Dataflow to build scalable streaming model workflows.
Streaming platforms
Many organizations now use streaming platforms to build real-time data pipelines that transform streams of data and move it between components in a cloud environment. These platforms are typically distributed and provide fault tolerance for streaming data. In addition to connecting systems, they can store records and serve as event queues.
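To make the idea of storing records and serving them as an event queue concrete, here is a minimal in-memory sketch in Python. It is a toy illustration, not the API of any real streaming platform: the class `InMemoryLog`, its `publish` and `poll` methods, and the consumer names are all hypothetical, and the sketch has none of the distribution or fault tolerance that real platforms provide.

```python
from collections import defaultdict


class InMemoryLog:
    """Toy append-only log (hypothetical, for illustration only).

    Records are kept after they are read, and each consumer
    tracks its own position (offset) in the log.
    """

    def __init__(self):
        self._records = []                # stored records, in arrival order
        self._offsets = defaultdict(int)  # consumer name -> next index to read

    def publish(self, record):
        """Append a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next unread records for this consumer and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = InMemoryLog()
for user_id in (1, 2, 3):
    log.publish({"user_id": user_id, "event": "login"})

# Two independent consumers read the same stored records at their own pace.
scoring_batch = log.poll("model-scorer", max_records=2)
archive_batch = log.poll("archiver")
```

Because the log keeps records rather than deleting them on read, the hypothetical "model-scorer" and "archiver" consumers each see every event, which is the property that lets one stream feed several downstream components.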
Apache Kafka
One of the most popular streaming platforms is Apache Kafka, an open-source solution for message streaming across public and private clouds. Kafka is a self-hosted solution: scaling it requires provisioning and managing a cluster of machines.