This is a lookup table, not a tutorial. For the decision framework, see the compute types concepts page.
## Types
| Type | Compute | Cold start | Autoscaling | Photon | 2026 status |
|---|---|---|---|---|---|
| Serverless | Databricks-managed | 2–6s | Per-query (IWM) | Always on | Preferred default |
| Pro | Customer VPC | ~4 min | Per-cluster | Optional | Use when Serverless not available |
| Classic | Customer VPC | 4–8 min | Per-cluster | Optional | Legacy; do not pick for new |
## Sizing (t-shirt)
| Size | Cluster units | Approx concurrent queries | Use case |
|---|---|---|---|
| 2X-Small | 1 | 10 | Development, light ad-hoc |
| X-Small | 1 | 10 | Small team BI |
| Small | 2 | 20 | Medium BI workloads |
| Medium | 4 | 40 | Production dashboards |
| Large | 8 | 80 | Heavy BI, many concurrent users |
| X-Large+ | 16+ | 160+ | Special cases only |
Rule of thumb: start at Medium, then step down one size at a time; stop downsizing when the warehouse runs saturated but queries are not yet queueing.
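The sizing table implies roughly ten concurrent queries per cluster unit. A minimal Python sketch of a size picker under that assumption (function and constant names are illustrative, not a Databricks API):

```python
# Illustrative sketch: pick the smallest t-shirt size whose approximate
# concurrency (~10 queries per cluster unit, per the sizing table above)
# covers the expected load. Sizes and unit counts mirror the table.
SIZES = [
    ("2X-Small", 1),
    ("X-Small", 1),
    ("Small", 2),
    ("Medium", 4),
    ("Large", 8),
    ("X-Large", 16),
]
QUERIES_PER_UNIT = 10

def pick_size(expected_concurrency: int) -> str:
    for name, units in SIZES:
        if units * QUERIES_PER_UNIT >= expected_concurrency:
            return name
    # Beyond X-Large, revisit the workload split instead of sizing up.
    return "X-Large"

print(pick_size(35))  # -> Medium (4 units ~ 40 concurrent queries)
```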
## Scaling knobs

| Knob | Default | Recommended | Purpose |
|---|---|---|---|
| `min_num_clusters` | 1 | 1 | Idle capacity floor; 1 keeps the first query warm |
| `max_num_clusters` | 1 | 2–5× expected peak concurrency / 10 | Headroom for load spikes |
| `auto_stop_mins` | 10 | 5 (serverless) / 10 (pro) | Balance cost vs. cold-start latency |
Peak Queued Queries is the metric to watch. If it climbs above zero under load, raise `max_num_clusters` before bumping t-shirt size.
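The `max_num_clusters` rule of thumb above works out as simple arithmetic; a hedged sketch, assuming (as in the sizing table) that each cluster absorbs roughly ten concurrent queries:

```python
import math

# Sketch of the headroom rule: size max_num_clusters so a 2-5x spike
# over expected peak concurrency still avoids queueing, assuming each
# cluster handles ~10 concurrent queries. Function name is illustrative.
def suggested_max_clusters(peak_concurrency: int, headroom: float = 2.0) -> int:
    queries_per_cluster = 10
    return max(1, math.ceil(headroom * peak_concurrency / queries_per_cluster))

# A dashboard fleet peaking at 20 concurrent queries:
print(suggested_max_clusters(20, 2.0))  # -> 4
print(suggested_max_clusters(20, 5.0))  # -> 10
```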
## Config via CLI

```bash
databricks sql warehouses edit <warehouse-id> --json '{
  "name": "prod-bi-warehouse",
  "size": "MEDIUM",
  "min_num_clusters": 1,
  "max_num_clusters": 5,
  "auto_stop_mins": 5,
  "enable_photon": true,
  "enable_serverless_compute": true,
  "warehouse_type": "PRO",
  "tags": {
    "cost_center": "DE-001",
    "environment": "prod"
  }
}'
```
## Config via Asset Bundle

```yaml
resources:
  warehouses:
    bi:
      name: prod-bi-warehouse
      cluster_size: MEDIUM
      min_num_clusters: 1
      max_num_clusters: 5
      auto_stop_mins: 5
      enable_photon: true
      enable_serverless_compute: true
      tags:
        custom_tags:
          - key: cost_center
            value: DE-001
          - key: environment
            value: prod
```
Prefer bundle-managed warehouses over hand-edited ones. See Asset Bundles.
## One warehouse per workload class
Causeway convention. Teams share warehouses within a workload class; teams do not each have their own warehouse.
| Warehouse | Workload | Typical config |
|---|---|---|
| `wh-bi` | BI tools, dashboards, many concurrent users | Medium, min: 1, max: 5, auto_stop: 5 min |
| `wh-elt` | dbt, scheduled transforms, one consumer | Large, min: 1, max: 1, auto_stop: 10 min |
| `wh-adhoc` | Analyst ad-hoc, exploration | Small, min: 0, max: 2, auto_stop: 2 min |
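The `wh-adhoc` row translates into bundle config like the following sketch, using the same schema as the bundle example earlier (the resource key `adhoc` is illustrative; values mirror the table):

```yaml
# Sketch: the wh-adhoc workload class as a bundle resource.
# Resource key "adhoc" is illustrative; values mirror the table above.
resources:
  warehouses:
    adhoc:
      name: wh-adhoc
      cluster_size: SMALL
      min_num_clusters: 0
      max_num_clusters: 2
      auto_stop_mins: 2
      enable_serverless_compute: true
```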
## Photon

- Serverless: Photon is always on; no override.
- Pro / Classic: opt-in via `enable_photon: true`.
- When to disable: only if queries consistently finish in under two seconds and the Photon startup tax outweighs the runtime gain. Rare.
## Access control

Warehouse access is a UC permission:

```sql
-- Granted via UI or SQL
GRANT CAN_USE ON WAREHOUSE `prod-bi-warehouse` TO `bi-team`;
GRANT CAN_MANAGE ON WAREHOUSE `prod-bi-warehouse` TO `data-platform`;
```

`CAN_USE` lets a user submit queries. `CAN_MANAGE` lets them edit config and grants.
## Query history

Every query against a warehouse lands in `system.query.history`:
```sql
-- Longest-running queries in the last 24 hours
SELECT
  query_id,
  user_name,
  query_text,
  duration / 1000 AS duration_seconds,
  rows_produced,
  status
FROM system.query.history
WHERE warehouse_id = '<warehouse-id>'
  AND start_time > CURRENT_TIMESTAMP() - INTERVAL 24 HOURS
ORDER BY duration DESC
LIMIT 20;
```
```sql
-- Query volume + failure rate by hour
SELECT
  DATE_TRUNC('hour', start_time) AS hour,
  COUNT(*) AS query_count,
  AVG(duration) / 1000 AS avg_duration_sec,
  SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) AS failed_count
FROM system.query.history
WHERE warehouse_id = '<warehouse-id>'
  AND start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
GROUP BY 1
ORDER BY 1 DESC;
```
## Monitoring signals

| Signal | Action |
|---|---|
| Queued queries > 0 sustained | Raise `max_num_clusters` |
| p95 query duration drifting up | Investigate query plans; add partition filters / Z-order |
| Failed query rate > 1% | Read `query_history.error_message`; classify errors |
| Idle time > 30% of runtime | Lower `auto_stop_mins`; consider downsizing |
| Cluster count pegged at `max_num_clusters` | Raise `max_num_clusters` or split workloads across warehouses |
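Two of these signals (failure rate, p95 duration) can also be derived offline from exported query-history rows; a minimal Python sketch, assuming rows shaped like the fields queried earlier (`status`, `duration` in milliseconds) — that row shape is an assumption, not a guaranteed schema:

```python
# Sketch: derive failure rate and p95 duration from query-history rows.
# Rows are dicts with "status" and "duration" (ms), mirroring the fields
# used in the SQL queries above; field names are assumptions.
def failure_rate(rows: list[dict]) -> float:
    if not rows:
        return 0.0
    failed = sum(1 for r in rows if r["status"] == "FAILED")
    return failed / len(rows)

def p95_duration_sec(rows: list[dict]) -> float:
    durations = sorted(r["duration"] for r in rows)
    if not durations:
        return 0.0
    idx = min(len(durations) - 1, int(0.95 * len(durations)))
    return durations[idx] / 1000

rows = [{"status": "FINISHED", "duration": 1200}] * 99 + [
    {"status": "FAILED", "duration": 9000}
]
print(failure_rate(rows))      # -> 0.01 (right at the 1% alert threshold)
print(p95_duration_sec(rows))  # -> 1.2
```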
## Statistics and optimization

Warehouses perform best on tables that have current statistics and clustered storage:

```sql
-- Compute statistics for the planner
ANALYZE TABLE prod.gold.events COMPUTE STATISTICS FOR ALL COLUMNS;

-- Z-order on the filter columns
OPTIMIZE prod.gold.events ZORDER BY (customer_id, event_type);

-- Liquid clustering (preferred on new tables; mutually exclusive with Z-ordering)
ALTER TABLE prod.gold.events CLUSTER BY (customer_id, event_type);
```

## See also
- Compute types — SQL warehouse vs job compute vs all-purpose.
- dbt quickstart — connecting dbt to a warehouse.
- Production readiness — what a warehouse needs to be prod-ready.