Docs · By type
Airflow
Airflow is a supervisor, not an engine
The most common Airflow mistake is treating it as a compute engine. Internalize this one distinction and every other decision follows.
Airflow 3: what changed, what got removed
Airflow 3.0 shipped in April 2025 and reshaped the model. The migration reality, the net-new capabilities, and the things that will break your old DAGs.
Dependencies: direct, Asset, AssetWatcher
Three ways to say 'run B after A', and they are not equivalent. The decision framework for which one to reach for, and when.
dbt
The three dbt project layers
Staging, intermediate, marts — what each layer is for, why you should resist inventing a fourth, and how it maps to the medallion architecture on Databricks.
Materializations: when to use which
dbt-databricks supports five materializations. This is the decision framework for picking the right one per model.
Databricks
Databricks compute: SQL warehouses, jobs, all-purpose
Three physically distinct compute offerings, one right default per workload class, and the decision framework for getting it right the first time.
Lakeflow: Connect, Declarative Pipelines, Jobs
Three products consolidated into one data-engineering plane. Knowing which piece does what prevents architectural flailing.
Unity Catalog: hierarchy, grants, and lineage
The object model that governs every table, view, volume, and function on a Causeway Databricks workspace. What lives where, who can touch what, and why lineage comes for free.
Power BI
Power BI connectivity: ADBC, ODBC, and the native connector
Three ways Power BI can talk to Databricks, one right default, and the traps that trip teams up on AWS deployments and DirectQuery.
Storage modes: Import, DirectQuery, Dual, Direct Lake
Four modes, increasingly overlapping, with a clear decision tree. When Import stops being the right default.
Semantic models: shared, certified, governed
Datasets are called semantic models now and the term matters. The canonical pattern for one model per domain, many thin reports, and how Databricks metric views close the double-definition problem.
VS Code
Edit local, execute remote
The organizing pattern behind a modern VS Code data-creator setup: the editor stays on the laptop, the compute stays in the cloud, and the extensions wire the two together invisibly.
AI agents in VS Code: the 2026 landscape
Copilot, Claude Code, Cursor, Continue, Cline, Amazon Q: what each is good at, how they coexist, and how to avoid paying for four of them.
The VS Code extension ecosystem
How extensions extend, where they live, how to pin and audit them, and why the marketplace is a supply-chain surface that deserves governance.