Organize models into three layers. Fight the urge to invent a fourth.
models/
    staging/        # one model per source table; light cleanup only
    intermediate/   # reusable joins, business-logic building blocks
    marts/          # facts, dims, semantic-layer-ready tables
Staging
A staging model is one-to-one with a source table. It renames columns, casts types, coerces nulls. It does not join, aggregate, or filter.
- Named stg_<source>__<table>.
- Materialized as view by default. Staging views cost nothing to store, and every downstream reference reads fresh source data through them.
- Lives exactly once per source table. If you find yourself duplicating staging logic, you have overloaded two source tables into one model.
Staging's job is to make every downstream model talk to a clean, consistent shape. Type casts, _loaded_at naming, coalesce on known-null columns — all of it happens here.
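As a rough illustration, a staging model that follows these rules might look like the sketch below. The source name shop, the table orders, and every column name are hypothetical, and the model assumes a matching source is declared in the project's YAML.

```sql
-- models/staging/stg_shop__orders.sql
-- Hypothetical source ('shop', 'orders') and hypothetical column names.
{{ config(materialized='view') }}

select
    -- rename columns to a consistent shape
    id                                   as order_id,
    cust_id                              as customer_id,
    -- cast types once so downstream models never have to
    cast(order_total as decimal(18, 2))  as order_total,
    cast(created as timestamp)           as ordered_at,
    -- coerce a known-null column to an explicit default
    coalesce(status, 'unknown')          as order_status,
    -- standardize the load-timestamp name
    load_ts                              as _loaded_at
from {{ source('shop', 'orders') }}
-- deliberately no joins, aggregations, or filters
```

Everything in the select list is a rename, a cast, or a null coercion, so the view stays one-to-one with its source table.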
Intermediate
An intermediate model composes staging models into reusable building blocks. It exists to be referenced by more than one mart.
- Named int_<verb>__<noun>, for example int_orders__with_customer_region.
- Materialized as ephemeral by default (inlined as a CTE) or view when the logic is reused enough that observability in the DAG matters.
- Never exposed to downstream consumers. If a BI tool is reading an int_ model, promote it to a mart or rename it.
A healthy project has maybe a quarter as many intermediate models as marts. If you have an intermediate per mart, your marts are doing too little.
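A minimal sketch of the int_orders__with_customer_region example named above, assuming two hypothetical staging models (stg_shop__orders and stg_shop__customers) and hypothetical column names:

```sql
-- models/intermediate/int_orders__with_customer_region.sql
-- The staging models and all columns referenced here are hypothetical.
{{ config(materialized='ephemeral') }}

select
    o.order_id,
    o.customer_id,
    o.order_total,
    o.ordered_at,
    -- the reusable piece: every mart that needs region gets it from here
    c.region as customer_region
from {{ ref('stg_shop__orders') }} as o
left join {{ ref('stg_shop__customers') }} as c
    on o.customer_id = c.customer_id
```

Because the model is ephemeral, dbt inlines it as a CTE in each mart that references it. Switching the config to view would make it a queryable node in the warehouse instead.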
Marts
Marts are what downstream tools read. They are facts (fct_<grain>), dimensions (dim_<entity>), or aggregations.
- Materialized as table, incremental, materialized_view, or streaming_table. The layer earns the storage cost because everything downstream reads it.
- Contracted (contract: enforced: true) when consumed externally. Versioned when the contract breaks.
- Liquid-clustered on the columns consumers filter or join by.
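One way an incremental, liquid-clustered fact could be configured is sketched here. It assumes the dbt-databricks adapter's liquid_clustered_by config. The model name fct_orders, the column names, and the choice of clustering columns are hypothetical. The contract itself is declared in the model's YAML properties and is not shown.

```sql
-- models/marts/fct_orders.sql
-- Hypothetical mart; column names and clustering columns are assumptions.
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='order_id',
        liquid_clustered_by=['ordered_at', 'customer_region']
    )
}}

select
    order_id,
    customer_id,
    customer_region,
    order_total,
    ordered_at
from {{ ref('int_orders__with_customer_region') }}
{% if is_incremental() %}
-- on incremental runs, only pick up rows newer than what the mart already holds
where ordered_at > (select max(ordered_at) from {{ this }})
{% endif %}
```

The clustering columns here are the ones a consumer would plausibly filter or join by. In a real project, choose them from observed query patterns.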
The fourth-layer temptation
Teams invent a fourth layer when intermediate feels "not quite right". Common names: metrics/, aggregates/, presentation/, semantic/.
Resist. The three layers compose any pipeline you need:
- Pre-aggregation for dashboards → mart (agg_ prefix if you want, still a mart).
- Reusable business logic → intermediate.
- Multiple variants of a fact → versioned marts (fct_revenue_v1, fct_revenue_v2).
Every extra layer is a new naming convention, a new team debate, and a new onboarding paragraph. The payoff is usually zero.
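To make the first case concrete, a dashboard pre-aggregation can sit in marts/ like any other mart. The sketch below assumes a hypothetical fct_orders mart with ordered_at, customer_region, and order_total columns.

```sql
-- models/marts/agg_daily_revenue.sql
-- Hypothetical pre-aggregation; fct_orders and its columns are assumptions.
{{ config(materialized='table') }}

select
    cast(ordered_at as date) as order_date,
    customer_region,
    sum(order_total)         as revenue,
    count(*)                 as order_count
from {{ ref('fct_orders') }}
group by cast(ordered_at as date), customer_region
```

The agg_ prefix is the only thing that distinguishes it. It needs no new folder, layer, or convention.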
Mapping onto the medallion architecture
Databricks users will recognize bronze / silver / gold from the medallion pattern. dbt's three layers map onto the medallion layers, but they are not the same thing:
| Medallion layer | Medallion description | dbt layer | dbt description |
|---|---|---|---|
| Bronze | Raw ingested data, as-is | (outside dbt) | loaded by Auto Loader, DLT, Fivetran |
| Silver | Cleaned, deduplicated, type-consistent | staging + intermediate | the shape dbt's DAG imposes |
| Gold | Business-ready, aggregated | marts | what consumers read |
Most Causeway teams land bronze outside dbt (it is a data ingestion concern, not a modeling one) and let dbt own silver-to-gold.
Catalog and schema layout
On Unity Catalog, carve environments into catalogs and domains into schemas.
- One catalog per environment: dev, staging, prod.
- Schemas inside each catalog for domains: prod.finance, prod.product, prod.ops.
- The same model promotes cleanly across environments because only the catalog changes. Set catalog and schema via target: in profiles.yml.
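In Databricks SQL terms, the layout above amounts to something like the following one-time setup. This is a sketch that assumes you hold the privileges to create catalogs; fct_revenue is a hypothetical mart used only to show the three-level names.

```sql
-- One catalog per environment.
create catalog if not exists dev;
create catalog if not exists staging;
create catalog if not exists prod;

-- Domain schemas inside a catalog (repeat for each environment).
create schema if not exists dev.finance;
create schema if not exists prod.finance;
create schema if not exists prod.product;
create schema if not exists prod.ops;

-- The same mart keeps its schema and name in every environment;
-- only the catalog differs. fct_revenue is hypothetical.
select count(*) from dev.finance.fct_revenue;
select count(*) from prod.finance.fct_revenue;
```

Since only the first part of the three-level name changes, promoting a model from dev to prod needs no change to the model itself.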
Note
Unity Catalog stitches column-level lineage across dbt runs automatically. A mart that reads two staging views produces two edges in UC lineage without any extra configuration. Resist building a parallel lineage layer.
See also
- Materializations — picking the right one for each layer.
- Model authoring standards — Causeway's naming + structure rules.
- The contract triple — how marts expose themselves to downstream consumers.