How to configure incremental models in DBT ?
What is an incremental model?
A common need in building models in data warehouse is to build models/tables incrementally and not truncate/load or drop/load every time. This is an effective strategy to build tables with optimized resource consumption. When building Incremental models as tables in your data warehouse — the first time a model is run, the table is built by transforming all rows of source data. On subsequent runs, dbt transforms only the rows in your source data that you tell dbt to filter for, inserting them into the table that has already been built (the target table).
Using an incremental model limits the amount of data that needs to be transformed. improving warehouse performance and reduces compute costs.
Using incremental materialization in DBT?
Built in materializations in dbt can be used to create incremental model. Models are created with select statement as usual and materialization can be defined in a config block.
There are few rules which needs to be considered when defining incremental model
- how to filter the rows on an incremental run.
- the uniqueness constraint of the model (if any). However, it is optional. When key is not defined, entire record is considered as key.
Incremental Model execution
Databases where “merge” statement is supported , it will be used to insert new records and update existing records.
However, databases where “merge” statements are not supported, a merge is implemented using (delete + insert) strategy i.e. first using a delete statement to delete records in the target table that are to be updated, and then an insert statement.
Additional support for creating incremental models ?
An optional incremental_strategy config controls the code that dbt uses to build incremental models. Different approaches may vary by effectiveness depending on the volume of data, the reliability of your unique_key, or the availability of certain features.