
GCP Model Pipeline

Explore how to build scalable batch model pipelines on Google Cloud Platform (GCP) using PySpark. Understand workflows that use BigQuery as a data lake, export data to Cloud Storage, and manage outputs with Cloud Datastore. Gain practical knowledge of how to overcome integration challenges and structure pipelines for production environments.


BigQuery and Spark

A common workflow for batch model pipelines is reading input data from a lake, applying a machine learning model, and then writing the results to an application database.

In GCP, BigQuery serves as the data lake, and Cloud Datastore can serve as the application database. We’ll build an ...
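The read–score–write flow described above can be sketched in plain Python before introducing any Spark or GCP specifics. In this hypothetical sketch, `read_inputs`, `apply_model`, and `write_outputs` are stand-ins for the three stages: in the real pipeline, the read would pull from BigQuery via Spark and the write would go to Cloud Datastore.

```python
def read_inputs():
    # Stand-in for reading input records from the data lake (e.g., BigQuery).
    return [{"user_id": 1, "x": 0.4}, {"user_id": 2, "x": 0.9}]

def apply_model(rows):
    # Stand-in for applying a trained model; here a simple threshold rule.
    return [{"user_id": r["user_id"], "score": 1.0 if r["x"] > 0.5 else 0.0}
            for r in rows]

def write_outputs(rows, sink):
    # Stand-in for writing predictions to the application database
    # (e.g., Cloud Datastore).
    sink.extend(rows)

sink = []
write_outputs(apply_model(read_inputs()), sink)
print(sink)  # → [{'user_id': 1, 'score': 0.0}, {'user_id': 2, 'score': 1.0}]
```

Keeping the three stages as separate functions mirrors how the pipeline is structured later: each stage can be swapped for its Spark-backed equivalent without changing the overall shape of the job.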