This is a lookup table, not a tutorial. For the when-to-use-what framework, see the materializations concepts page. For a worked example, see the incremental guide.
## view

```sql
{{ config(
    materialized='view',
    tags=['staging'],
    docs={'node_color': '#3a6e96'}
) }}
```
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | view |
| tags | [] | list of strings |
| docs.node_color | None | hex color for dbt docs |
| alias | None | override the database-side name |
| schema | profile default | string |
| database (catalog) | profile default | string |
## table

```sql
{{ config(
    materialized='table',
    file_format='delta',
    liquid_clustered_by=['customer_id', 'event_date'],
    tblproperties={'delta.autoOptimize.optimizeWrite': 'true'},
    databricks_compute='heavy'
) }}
```
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | table |
| file_format | delta | delta, parquet, hudi (delta preferred) |
| liquid_clustered_by | None | list of column names |
| auto_liquid_cluster | False | bool; Databricks picks keys |
| partition_by | None | list of columns; use only for hard isolation |
| tblproperties | {} | dict of Delta table properties |
| databricks_compute | default | name from the profiles.yml compute map |
| location_root | None | external storage path; setting it makes the table external |
| include_full_name_in_path | False | bool; include catalog/schema in path |
> **Warning:** Never combine liquid_clustered_by with auto_liquid_cluster; pick one. With both set, current dbt versions raise a config error at parse time, but on older versions that missed the check the result was non-deterministic clustering.
## incremental

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id',
    on_schema_change='append_new_columns',
    incremental_predicates=["DBT_INTERNAL_DEST.updated_at >= current_date - interval 7 days"],
    liquid_clustered_by=['order_id'],
    merge_exclude_columns=['_dbt_copied_at']
) }}
```
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | incremental |
| incremental_strategy | merge | merge, append, insert_overwrite, replace_where, microbatch, delete+insert (1.11+) |
| unique_key | required for merge | column name or list |
| on_schema_change | ignore | ignore, fail, append_new_columns, sync_all_columns |
| incremental_predicates | None | list of SQL predicates to narrow the merge scan |
| partition_by | None | list (required for insert_overwrite) |
| merge_exclude_columns | None | list; columns that should not be updated on merge |
| merge_update_columns | None | list; whitelist of columns to update |
| liquid_clustered_by | None | list of columns |
| tblproperties | {} | dict |
| databricks_compute | default | compute name |
### Strategy compatibility
| Strategy | Requires unique_key | Works with partition_by | Delta-only | Notes |
|---|---|---|---|---|
| merge | yes | yes (optimizes merge) | yes | Most flexible, most expensive |
| append | no | yes | no (works on Parquet/Hudi too) | Cheapest; no uniqueness guarantee |
| insert_overwrite | no | yes (required) | no | Rewrites touched partitions |
| replace_where | no | no | yes | Uses incremental_predicates for scope |
| microbatch | no | yes (by event_time) | yes | 1.9+; keyed on event_time config |
| delete+insert | yes | yes | yes | 1.11+; alternative to merge |
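As a sketch of the insert_overwrite row above: it needs partition_by but no unique_key, and the model body typically narrows the scan with is_incremental(). Model and column names here are illustrative, not from a real project.

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by=['event_date']  -- required for insert_overwrite
) }}

select * from {{ ref('events_raw') }}  -- hypothetical upstream model
{% if is_incremental() %}
  -- only the partitions touched by this window get rewritten
  where event_date >= current_date - interval 3 days
{% endif %}
```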
### on_schema_change behavior
| Value | New column upstream | Column type changed upstream | Column removed upstream |
|---|---|---|---|
| ignore | Silently dropped | Error | Silently dropped |
| fail | Error | Error | Error |
| append_new_columns | Added to target | Error | Silently dropped |
| sync_all_columns | Added | Error (use --full-refresh) | Removed from target |
> **Danger:** ignore is a silent data-loss path. Prefer fail or append_new_columns. If you pick ignore, document why next to the config() block so the next reader understands it was deliberate.
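Pairing the merge config with a model body makes the safe pattern concrete. This is a minimal sketch; orders_raw and updated_at are placeholder names.

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id',
    on_schema_change='append_new_columns'  -- new upstream columns land in the target
) }}

select * from {{ ref('orders_raw') }}
{% if is_incremental() %}
  -- only scan rows newer than the current high-water mark in the target
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```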
## materialized_view

```sql
{{ config(
    materialized='materialized_view',
    refresh_schedule={'cron': '0 * * * *', 'time_zone': 'UTC'},
    tblproperties={'pipelines.autoOptimize.managed': 'true'}
) }}
```
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | materialized_view |
| refresh_schedule | None | dict with cron + time_zone, or None for on-demand |
| tblproperties | {} | dict; pipelines.* keys work |
| databricks_compute | default | compute name |
> **Note:** materialized_view delegates to Delta Live Tables. You give up direct control of the refresh mechanism; Databricks decides how to incrementalize. The upside is no is_incremental(), no unique_key debate, no schema-drift handling.
## streaming_table

```sql
{{ config(
    materialized='streaming_table',
    tblproperties={'pipelines.reset.allowed': 'false'}
) }}

select * from stream({{ ref('events_raw') }})
```
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | streaming_table |
| tblproperties | {} | dict; pipelines.* keys |
| databricks_compute | default | compute name |
The source must be a streaming-compatible Delta table or a registered streaming source; stream() wraps the ref.
## ephemeral

```sql
{{ config(materialized='ephemeral') }}
```
Not a real materialization. dbt inlines the model's SQL as a CTE into every downstream model's compiled SQL.
| Option | Default | Accepted values |
|---|---|---|
| materialized | required | ephemeral |
| tags | [] | list |
It has no storage, no alias, and no schema. It cannot be tested directly; tests fold into the downstream consumer's compiled SQL.
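To illustrate the inlining, here is a hypothetical ephemeral staging model and roughly what a downstream `ref` to it compiles into (model, source, and CTE names are illustrative; the exact CTE prefix depends on the dbt version):

```sql
-- models/stg_orders.sql (ephemeral)
{{ config(materialized='ephemeral') }}
select order_id, lower(status) as status
from {{ source('shop', 'orders') }}

-- a downstream model's `select * from {{ ref('stg_orders') }}`
-- compiles to approximately:
with __dbt__cte__stg_orders as (
    select order_id, lower(status) as status
    from shop.orders
)
select * from __dbt__cte__stg_orders
```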
## Compute routing
Define named compute in profiles.yml:
```yaml
outputs:
  prod:
    compute:
      default: { http_path: /sql/1.0/warehouses/serverless-small }
      heavy: { http_path: /sql/1.0/warehouses/serverless-large }
      python: { http_path: /api/2.1/clusters/all-purpose-cluster-id }
```
Pin a model to named compute:

```sql
{{ config(databricks_compute='heavy') }}
```
Serverless SQL warehouses are the right default for SQL models. All-purpose clusters are for Python models and interactive notebooks, not batch SQL.
## Table properties (tblproperties)
Common ones:
| Property | Effect |
|---|---|
| delta.autoOptimize.optimizeWrite | Combine small writes into larger files |
| delta.autoOptimize.autoCompact | Automatically compact after writes |
| delta.enableChangeDataFeed | Enable CDF for downstream streaming consumers |
| delta.columnMapping.mode | name or id; enables column renames without rewriting data |
| pipelines.autoOptimize.managed | (MV / streaming table) let Databricks manage optimize |
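For example, enabling CDF and auto-compaction on a table model might look like the sketch below. Note that Delta property values are passed as strings, even for booleans.

```sql
{{ config(
    materialized='table',
    tblproperties={
        'delta.enableChangeDataFeed': 'true',     -- CDF for streaming consumers
        'delta.autoOptimize.autoCompact': 'true'  -- compact after writes
    }
) }}
```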
## See also

- Materializations concepts — the decision framework.
- Incremental models guide — worked example of merge.
- Model authoring standards — Causeway's defaults per layer.