Databricks Asset Bundles (DAB), rebranded in 2026 as Declarative Automation Bundles, are the deployment unit on Databricks. Every job, pipeline, warehouse, dashboard, and ML model ships via a bundle. Clicking things into the UI is not a supported workflow at Causeway.
This guide walks through a realistic bundle from empty directory to deployed prod, with the decisions you have to make at each step.
## 1. The canonical layout

```
my_project/
  databricks.yml              # top-level bundle + target definitions
  resources/
    jobs/
      ingest.job.yml
      transform.job.yml
    pipelines/
      silver.pipeline.yml
  src/
    mypkg/
      __init__.py
      transforms.py
      io.py
  tests/
    test_transforms.py
  targets/                    # optional per-target overrides
    dev.yml
    staging.yml
    prod.yml
```
Two rules from the outset:

- **Separate bundle YAML from Python.** `databricks.yml` and `resources/*.yml` describe what to deploy; `src/` contains the code. They have different review concerns; keep them in different files.
- **One bundle per deployable unit of work.** A team with two independent products gets two bundles. One bundle can have many jobs; a bundle is not a namespace.
## 2. The bundle spine

Open `databricks.yml`:

```yaml
bundle:
  name: analytics-platform

variables:
  warehouse_id:
    description: SQL warehouse for transforms
  catalog:
    description: Unity Catalog for this env

include:
  - resources/**/*.yml

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://dev.cloud.databricks.com
    variables:
      catalog: dev_analytics
      warehouse_id: abc123

  staging:
    mode: production
    workspace:
      host: https://staging.cloud.databricks.com
    run_as:
      service_principal_name: sp-analytics-staging
    variables:
      catalog: staging_analytics
      warehouse_id: def456

  prod:
    mode: production
    workspace:
      host: https://prod.cloud.databricks.com
    run_as:
      service_principal_name: sp-analytics-prod
    variables:
      catalog: prod_analytics
      warehouse_id: ghi789
```
Three things are load-bearing in that file. Understand all three before shipping.
### `mode: development` vs `mode: production`

`mode: development` prepends `[dev ${workspace.current_user.userName}]` to every resource name. Twenty engineers can deploy the same bundle simultaneously without colliding; each gets namespaced resources. `mode: production` refuses to deploy under a human identity: a `run_as` block with a service principal is required. The resources deploy un-prefixed; breaking production is therefore a deliberate act.
> **Danger:** Never set `mode: production` on the dev target. A single `bundle deploy -t dev` from a laptop would then clobber any shared dev resource. Dev targets are always `mode: development`. Staging and prod are always `mode: production` with service-principal `run_as`.
### `run_as`

The identity the job executes under. For prod, always a service principal with scoped grants, never a human.
### Variables

Declare once under `variables:`, populate per-target, reference as `${var.catalog}` in resource files. One bundle, three target files, zero `sed` in your pipeline.
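For illustration, a resource file that consumes those variables might look like this (the job name and notebook path are placeholders, not part of this project):

```yaml
# resources/jobs/example.job.yml -- illustrative only
resources:
  jobs:
    example:
      name: example_${bundle.target}
      tasks:
        - task_key: refresh
          notebook_task:
            notebook_path: ../src/refresh.py
            base_parameters:
              # resolves per-target: dev_analytics / staging_analytics / prod_analytics
              catalog: ${var.catalog}
              warehouse: ${var.warehouse_id}
```

The same file deploys unchanged to all three targets; only the variable values differ.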
## 3. A job

`resources/jobs/transform.job.yml`:

```yaml
resources:
  jobs:
    transform:
      name: transform_${bundle.target}
      tags:
        environment: ${bundle.target}
        owner: data-engineering
        cost_center: DE-001
      email_notifications:
        on_failure:
          - data-oncall@causeway.dev
      tasks:
        - task_key: run_transforms
          python_wheel_task:
            package_name: mypkg
            entry_point: main
          libraries:
            - whl: ../dist/*.whl
          environment_key: serverless
      environments:
        - environment_key: serverless
          spec:
            client: "1"
            dependencies:
              - duckdb==1.0.0
              - delta-spark==3.2.0
      schedule:
        quartz_cron_expression: "0 0 * * * ?"
        timezone_id: UTC
        pause_status: UNPAUSED
```
Two defaults worth calling out:

- **Serverless jobs compute** via `environment_key: serverless`. Autoscaling and Photon come for free; cold starts are single-digit seconds.
- **Tags carry cost attribution.** `cost_center` ends up on the DBU billing line.
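A `python_wheel_task` with `entry_point: main` invokes a console-script entry point named `main` from the wheel's metadata. A minimal sketch of what `mypkg` might expose; the function body and the `pyproject.toml` mapping here are illustrative assumptions, not the real package:

```python
# src/mypkg/__init__.py -- illustrative sketch of the wheel entry point.
# pyproject.toml would register it:
#   [project.scripts]
#   main = "mypkg:main"
import sys


def run_transforms(argv: list[str]) -> int:
    """Stand-in for the real logic in mypkg.transforms; returns an exit code."""
    print(f"running transforms with args: {argv}")
    return 0


def main() -> None:
    # Databricks passes task parameters to the entry point via sys.argv.
    sys.exit(run_transforms(sys.argv[1:]))
```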
## 4. Validate

```shell
databricks bundle validate
```
Catches:

- YAML syntax errors.
- Missing variable values.
- Undefined references (`${resources.jobs.nope.id}`).
- Permission errors on resources the target workspace cannot create.
- Type checks: every required field present and well-typed.
Validation runs in about three seconds. It is the cheapest way to catch most deploy-time failures before anything touches the workspace.
> **Warning:** `bundle validate` is step one of every CI run. If your CI pipeline deploys without validating first, a YAML-level mistake turns into a half-applied deploy that you then have to unwind. Make the pipeline order validate → test → deploy, without exception.
## 5. Deploy

Development target, scoped to your user:

```shell
databricks bundle deploy --target dev
databricks bundle run transform
```
Production target, run via CI with OIDC federation:
```yaml
# .github/workflows/deploy-prod.yml
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # for OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate --target prod
      - run: databricks bundle deploy --target prod
```
> **Important:** Authenticate CI with workload identity federation (OIDC from GitHub Actions / Azure DevOps to a Databricks service principal). Never put long-lived personal access tokens in CI secrets. PATs get leaked, rotated late, and are scoped to individuals who leave.
## 6. Pin compute and libraries

Hard-code nothing. Use variables:

```yaml
# databricks.yml
variables:
  runtime_version:
    default: 15.4.x-scala2.12
  wheel_version:
    description: version of mypkg to deploy
```

```yaml
# resources/jobs/transform.job.yml
tasks:
  - task_key: run
    new_cluster:
      spark_version: ${var.runtime_version}
    libraries:
      - whl: ../dist/mypkg-${var.wheel_version}-py3-none-any.whl
```
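Because the wheel path is parameterized, CI has to build the artifact before deploying. A sketch of the step ordering; the build tool, the hard-coded version, and the `--var` override are assumptions, not this repo's actual workflow:

```yaml
# Fragment of a CI workflow -- illustrative ordering only
steps:
  - uses: actions/checkout@v4
  - uses: databricks/setup-cli@main
  # Build the wheel first, so dist/ exists when the bundle is deployed.
  - run: pip install build && python -m build --wheel
  - run: databricks bundle validate --target prod
  - run: databricks bundle deploy --target prod --var="wheel_version=1.4.2"
```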
## 7. Bundle state is the workspace
The bundle deploy process is stateful: it tracks which resources the bundle created and updates them in place on subsequent deploys. Two consequences:
> **Danger:** Treat the workspace UI as read-only for bundle-managed resources. If you click-edit a job that a bundle deploys, the next `bundle deploy` will overwrite your edits. This routinely leads to lost hotfixes. When a change is needed urgently, make it in the bundle, merge, deploy. Do not patch the UI.
> **Warning:** Lock prod deploys to CI. A team member running `bundle deploy --target prod` from a laptop with the wrong checkout can clobber production resources. Only the CI service principal should have the grants to deploy to prod targets.
## 8. Rollback

Every prod deploy tags the commit it came from. Rollback is `bundle deploy` at the previous tag:

```shell
git checkout v2026.04.15
databricks bundle validate --target prod
databricks bundle deploy --target prod
```
If a rollback takes longer than five minutes you are doing it wrong; speed here is a first-class concern for incident response.
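The tag names above suggest a calendar-versioning scheme. One way the CI deploy job might derive such a tag; the naming convention itself is an assumption:

```shell
# Derive a calendar-version tag such as v2026.04.15 for this deploy.
tag="v$(date -u +%Y.%m.%d)"
echo "$tag"
# CI would then tag and push, e.g.:
#   git tag -a "$tag" -m "prod deploy $tag" && git push origin "$tag"
```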
## 9. Clean up

```shell
# Inspect what would be destroyed first
databricks bundle destroy --target dev --dry-run

# Tear down a specific target
databricks bundle destroy --target dev
```
## Common mistakes

| Symptom | Root cause |
|---|---|
| `bundle deploy` fails with `PERMISSION_DENIED` | Service principal lacks grants on the target workspace. Grant `USE CATALOG` + `CREATE JOB` on the target catalog/workspace. |
| Two developers' resources collide in dev | `mode: development` is missing; the target deployed un-prefixed. |
| Prod deploy overwrites a manual UI fix | Expected. Make the fix in the bundle; deploy. |
| Variables not resolving | Missing `--target` flag; the default target may not define the variable. |
| CI deploys succeed but jobs fail on first run | Libraries not built before `bundle deploy`. Build the wheel in a step before the deploy step. |
## See also
- Databricks quickstart — your first bundle.
- Compute types — picking the right runtime for each job.
- Production readiness — the full checklist for shipping a bundle to prod.