Python Models

Explore how to leverage Python models in dbt to perform complex data transformations that are difficult with SQL alone. Understand the necessary Google Cloud configurations, enable key APIs, and work hands-on with PySpark DataFrames and pandas integration to extend your dbt capabilities.

To use Python models with BigQuery, you’ll need to enable billing on your project.

You’ll then get credits to use before being charged.

Overview of Python models

dbt lets data engineers and analysts share a common tool and a common language: SQL. However, because data scientists work primarily in Python, for example to build their machine learning models, dbt introduced a new feature: Python models. Python models come with some downsides:

  • They’re slower and more expensive than SQL models.

  • Python offers many libraries and functions. They are powerful, but they come with a steep learning curve.

  • This is still a new and experimental feature. There is not much documentation on Python models, and debugging can be difficult, especially because error messages are not always explicit.

  • Many APIs are used to run Python models with BigQuery: Dataproc, Cloud Storage, VPC networks, etc. A basic ...
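For orientation, a dbt Python model is a .py file that defines a function named model(dbt, session) and returns a DataFrame. The following is only a minimal sketch of that shape, not a recommended implementation. It assumes a hypothetical upstream model called stg_orders with a hypothetical amount column, and it assumes the model runs on Dataproc, where dbt.ref() returns a PySpark DataFrame.

```python
def model(dbt, session):
    # Python models are materialized as tables (or incremental tables), not views.
    dbt.config(materialized="table")

    # "stg_orders" is a hypothetical upstream model used for illustration.
    # On BigQuery with Dataproc, dbt.ref() is assumed to return a PySpark DataFrame.
    orders = dbt.ref("stg_orders")

    # PySpark's DataFrame.filter accepts a SQL expression string.
    # "amount" is a hypothetical column name.
    positive_orders = orders.filter("amount > 0")

    # dbt writes the returned DataFrame to the warehouse as the model's table.
    return positive_orders
```

A filter this simple would be cheaper and faster as a SQL model, which is the first downside listed above. Python models are better reserved for logic that SQL handles poorly.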