Incremental Model with the insert_overwrite Strategy
Learn how to create incremental models with the insert_overwrite strategy.
Introduction to the insert_overwrite
strategy
Apart from the MERGE
strategy, incremental models using the BigQuery adapter can use another strategy: the insert_overwrite
strategy.
The need for another incremental strategy
Models with a MERGE
strategy have to rescan the whole table, which can be time- and cost-consuming for big tables. insert_overwrite
provides a solution to that issue because it does not rescan the whole destination table. Instead, it lets us partition our tables. This way, dbt only scans the relevant partitions, which is faster and more efficient.
The insert_overwrite
strategy is more complex. It’s recommended to use it only if models underperform with the MERGE
strategy.
Partitioned tables
In BigQuery, partitioned tables are a way to manage heavy tables by dividing them into smaller, more manageable segments based on a specific column called the partitioning column. This column’s values determine how the data is physically stored and queried.
Partitioning can significantly improve query performance and reduce costs by allowing BigQuery to scan and process only the relevant partitions instead of the entire table. A common use case of partitioning is time-based data, where we might partition by a timestamp column, enabling more efficient querying for specific time ranges.
To create a partitioned table in dbt, all we have to do is specify a partition_by
configuration in our model file like this:
Get hands-on with 1400+ tech skills courses.