This is a lookup table, not a tutorial. For the decision framework, see the compute types concepts page.
## Types
| Type | Compute | Cold start | Autoscaling | Photon | 2026 status |
|---|---|---|---|---|---|
| Serverless | Databricks-managed | 2–6s | Per-query (IWM) | Always on | Preferred default |
| Pro | Customer VPC | ~4 min | Per-cluster | Optional | Use when Serverless not available |
| Classic | Customer VPC | 4–8 min | Per-cluster | Optional | Legacy; do not pick for new |
## Sizing (t-shirt)
| Size | Cluster units | Approx concurrent queries | Use case |
|---|---|---|---|
| 2X-Small | 1 | 10 | Development, light ad-hoc |
| X-Small | 1 | 10 | Small team BI |
| Small | 2 | 20 | Medium BI workloads |
| Medium | 4 | 40 | Production dashboards |
| Large | 8 | 80 | Heavy BI, many concurrent users |
| X-Large+ | 16+ | 160+ | Special cases only |
Rule of thumb: start at Medium, then step down one size at a time; stop downsizing when the warehouse runs saturated but queries are not yet queueing.
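The sizing table implies roughly ten concurrent queries per cluster unit. A minimal Python sketch of a size picker under that assumption (function and constant names are illustrative, not a Databricks API):

```python
# Illustrative sketch: pick the smallest t-shirt size whose approximate
# concurrency (~10 queries per cluster unit, per the sizing table above)
# covers the expected load. Sizes and unit counts mirror the table.
SIZES = [
    ("2X-Small", 1),
    ("X-Small", 1),
    ("Small", 2),
    ("Medium", 4),
    ("Large", 8),
    ("X-Large", 16),
]
QUERIES_PER_UNIT = 10

def pick_size(expected_concurrency: int) -> str:
    for name, units in SIZES:
        if units * QUERIES_PER_UNIT >= expected_concurrency:
            return name
    # Beyond X-Large, revisit the workload split instead of sizing up.
    return "X-Large"

print(pick_size(35))  # -> Medium (4 units ~ 40 concurrent queries)
```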
## Scaling knobs

| Knob | Default | Recommended | Purpose |
|---|---|---|---|
| `min_num_clusters` | 1 | 1 | Idle capacity floor; 1 keeps the first query warm |
| `max_num_clusters` | 1 | 2–5× expected peak concurrency / 10 | Headroom for load spikes |
| `auto_stop_mins` | 10 | 5 (serverless) / 10 (pro) | Balance cost vs. cold-start latency |
Peak Queued Queries is the metric to watch. If it climbs above zero under load, raise `max_num_clusters` before bumping t-shirt size.
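The `max_num_clusters` rule of thumb above works out as simple arithmetic; a hedged sketch, assuming (as in the sizing table) that each cluster absorbs roughly ten concurrent queries:

```python
import math

# Sketch of the headroom rule: size max_num_clusters so a 2-5x spike
# over expected peak concurrency still avoids queueing, assuming each
# cluster handles ~10 concurrent queries. Function name is illustrative.
def suggested_max_clusters(peak_concurrency: int, headroom: float = 2.0) -> int:
    queries_per_cluster = 10
    return max(1, math.ceil(headroom * peak_concurrency / queries_per_cluster))

# A dashboard fleet peaking at 20 concurrent queries:
print(suggested_max_clusters(20, 2.0))  # -> 4
print(suggested_max_clusters(20, 5.0))  # -> 10
```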
## Config via CLI

```bash
databricks sql warehouses edit <warehouse-id> --json '{
  "name": "prod-bi-warehouse",
  "size": "MEDIUM",
  "min_num_clusters": 1,
  "max_num_clusters": 5,
  "auto_stop_mins": 5,
  "enable_photon": true,
  "enable_serverless_compute": true,
  "warehouse_type": "PRO",
  "tags": {
    "cost_center": "DE-001",
    "environment": "prod"
  }
}'
```
## Config via Asset Bundle

```yaml
resources:
  warehouses:
    bi:
      name: prod-bi-warehouse
      cluster_size: MEDIUM
      min_num_clusters: 1
      max_num_clusters: 5
      auto_stop_mins: 5
      enable_photon: true
      enable_serverless_compute: true
      tags:
        custom_tags:
          - key: cost_center
            value: DE-001
          - key: environment
            value: prod
```
Prefer bundle-managed warehouses over hand-edited ones. See Asset Bundles.
## One warehouse per workload class
Causeway convention. Teams share warehouses within a workload class; teams do not each have their own warehouse.
| Warehouse | Workload | Typical config |
|---|---|---|
| `wh-bi` | BI tools, dashboards, many concurrent users | Medium, min: 1, max: 5, auto_stop: 5 min |
| `wh-elt` | dbt, scheduled transforms, one consumer | Large, min: 1, max: 1, auto_stop: 10 min |
| `wh-adhoc` | Analyst ad-hoc, exploration | Small, min: 0, max: 2, auto_stop: 2 min |
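The `wh-adhoc` row translates into bundle config like the following sketch, using the same schema as the bundle example earlier (the resource key `adhoc` is illustrative; values mirror the table):

```yaml
# Sketch: the wh-adhoc workload class as a bundle resource.
# Resource key "adhoc" is illustrative; values mirror the table above.
resources:
  warehouses:
    adhoc:
      name: wh-adhoc
      cluster_size: SMALL
      min_num_clusters: 0
      max_num_clusters: 2
      auto_stop_mins: 2
      enable_serverless_compute: true
```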
## Photon

- Serverless: Photon is always on; no override.
- Pro / Classic: opt-in via `enable_photon: true`.
- When to disable: only if queries consistently finish in under two seconds and the Photon startup tax outweighs the runtime gain. Rare.
## Access control

Warehouse access is a UC permission:

```sql
-- Granted via UI or SQL
GRANT CAN_USE ON WAREHOUSE `prod-bi-warehouse` TO `bi-team`;
GRANT CAN_MANAGE ON WAREHOUSE `prod-bi-warehouse` TO `data-platform`;
```

`CAN_USE` lets a user submit queries. `CAN_MANAGE` lets them edit config and grants.
## Query history

Every query against a warehouse lands in `system.query.history`:
```sql
-- Longest-running queries in the last 24 hours
SELECT
  query_id,
  user_name,
  query_text,
  duration / 1000 AS duration_seconds,
  rows_produced,
  status
FROM system.query.history
WHERE warehouse_id = '<warehouse-id>'
  AND start_time > CURRENT_TIMESTAMP() - INTERVAL 24 HOURS
ORDER BY duration DESC
LIMIT 20;
```
```sql
-- Query volume + failure rate by hour
SELECT
  DATE_TRUNC('hour', start_time) AS hour,
  COUNT(*) AS query_count,
  AVG(duration) / 1000 AS avg_duration_sec,
  SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) AS failed_count
FROM system.query.history
WHERE warehouse_id = '<warehouse-id>'
  AND start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
GROUP BY 1
ORDER BY 1 DESC;
```
## Monitoring signals

| Signal | Action |
|---|---|
| Queued queries > 0 sustained | Raise `max_num_clusters` |
| p95 query duration drifting up | Investigate query plans; add partition filters / Z-order |
| Failed query rate > 1% | Read `query_history.error_message`; classify errors |
| Idle time > 30% of runtime | Lower `auto_stop_mins`; consider downsizing |
| Cluster count pegged at `max_num_clusters` | Raise `max_num_clusters` or split workloads across warehouses |
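Two of these signals (failure rate, p95 duration) can also be derived offline from exported query-history rows; a minimal Python sketch, assuming rows shaped like the fields queried earlier (`status`, `duration` in milliseconds) — that row shape is an assumption, not a guaranteed schema:

```python
# Sketch: derive failure rate and p95 duration from query-history rows.
# Rows are dicts with "status" and "duration" (ms), mirroring the fields
# used in the SQL queries above; field names are assumptions.
def failure_rate(rows: list[dict]) -> float:
    if not rows:
        return 0.0
    failed = sum(1 for r in rows if r["status"] == "FAILED")
    return failed / len(rows)

def p95_duration_sec(rows: list[dict]) -> float:
    durations = sorted(r["duration"] for r in rows)
    if not durations:
        return 0.0
    idx = min(len(durations) - 1, int(0.95 * len(durations)))
    return durations[idx] / 1000

rows = [{"status": "FINISHED", "duration": 1200}] * 99 + [
    {"status": "FAILED", "duration": 9000}
]
print(failure_rate(rows))      # -> 0.01 (right at the 1% alert threshold)
print(p95_duration_sec(rows))  # -> 1.2
```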
## Statistics and optimization

Warehouses perform best on tables that have current statistics and clustered storage:

```sql
-- Compute statistics for the planner
ANALYZE TABLE prod.gold.events COMPUTE STATISTICS FOR ALL COLUMNS;

-- Z-order on the filter columns
OPTIMIZE prod.gold.events ZORDER BY (customer_id, event_type);

-- Liquid clustering (preferred on new tables; mutually exclusive with Z-ordering)
ALTER TABLE prod.gold.events CLUSTER BY (customer_id, event_type);
```

## See also
- Compute types — SQL warehouse vs job compute vs all-purpose.
- dbt quickstart — connecting dbt to a warehouse.
- Production readiness — what a warehouse needs to be prod-ready.