BigQuery Export
Learn how to export query results from BigQuery to Google Cloud Storage (GCS) and how to prepare datasets for use with PySpark when building scalable batch model pipelines. We'll sample the data, write the sample to a table, and export it in Avro format for efficient processing.
We'll cover the following...
Exporting the results
The first step is to export the results of a BigQuery query to GCS, which we can do manually using the BigQuery UI.
The export can also be performed directly from Spark, but as I mentioned, configuring the current version of the connector library is quite involved. We’ll use the natality dataset for ...
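The manual export can also be scripted. The following is a minimal sketch, not the lesson's own method: it assumes the `bq` command-line tool from the Google Cloud SDK is installed and authenticated, and it only builds and prints the commands unless `dry_run` is turned off. The destination table `my_dataset.natality_sample`, the bucket path `gs://my-bucket/...`, the `SELECT *` sampling query, and the helper function names are all hypothetical placeholders chosen for illustration.

```python
import shlex
import subprocess

# Public natality table in BigQuery's sample datasets.
SOURCE_TABLE = "bigquery-public-data.samples.natality"
# Hypothetical names: replace with your own dataset, table, and bucket.
DEST_TABLE = "my_dataset.natality_sample"
GCS_URI = "gs://my-bucket/natality/avro/part-*.avro"


def build_query_command(source_table, dest_table, limit):
    """Build a `bq query` call that saves a random sample to a table."""
    # Assumed sampling strategy: shuffle the rows and keep the first `limit`.
    sql = (
        f"SELECT * FROM `{source_table}` "
        f"ORDER BY RAND() LIMIT {int(limit)}"
    )
    return [
        "bq", "query",
        "--use_legacy_sql=false",             # use standard SQL syntax
        f"--destination_table={dest_table}",  # write the results to a table
        "--replace",                          # overwrite the table if it exists
        sql,
    ]


def build_extract_command(table, gcs_uri):
    """Build a `bq extract` call that exports a table to GCS as Avro."""
    # The wildcard in the URI lets BigQuery split the export across files.
    return ["bq", "extract", "--destination_format=AVRO", table, gcs_uri]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run(build_query_command(SOURCE_TABLE, DEST_TABLE, 10000))
    run(build_extract_command(DEST_TABLE, GCS_URI))
```

Keeping the commands as argument lists avoids shell-quoting problems with the backticks in the SQL string. Call `run(..., dry_run=False)` once the dataset and bucket names point at resources you own.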