MLlib Batch Pipeline
Explore how to build scalable batch pipelines for predictive modeling using PySpark's MLlib. Learn the process from loading and transforming data to applying classification algorithms and saving model outputs in a cloud data lake.
We'll cover the following...
We'll cover the following...
Now that we’ve covered loading and transforming data with PySpark, we can use the machine learning libraries in PySpark to build a predictive model.
MLlib
The core library for building predictive models in PySpark is called MLlib. This library provides a suite of supervised ...