DirectQuery means every visual interaction runs a SQL query against Databricks. Done wrong, it feels like dial-up; done right, it feels close to Import. This guide is the tuning playbook.
Why you are reading this
You ended up on DirectQuery because one of the following applied (see storage modes):
- The fact table is tens of TB and does not fit in capacity memory.
- Users need minute-level freshness.
- Policy keeps data in Databricks.
If none of those is true, go back to Import. DirectQuery tuning is tractable but not free; Import is already fast.
1. Warehouse
DirectQuery performance is floored by the warehouse. Three things must be true:
Serverless SQL warehouse
Serverless only. Pro and Classic cold-start too slowly for the "click a slicer, get a result" UX. See SQL warehouse reference.
Photon on
Photon accelerates the aggregation-heavy SQL Power BI emits by 2-8×. On Serverless it is always on; on Pro, turn it on.
Warm during report hours
Set min_num_clusters: 1 and auto_stop_mins: 5-10. Cold starts are visible to end users; a scale-from-zero warehouse on an 8am dashboard hit means every executive sees 6 seconds of spinner.
Warning
"Auto-stop" is not the enemy; long cold-start is. Serverless scale-to-zero cold-starts in 2-6 seconds, which is acceptable for interactive BI. Pro cold-starts in ~4 minutes, which is not. If you are on Pro, keep the warehouse warm during report hours; if you are on Serverless, aggressive auto-stop is fine.
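Concretely, these settings live in the warehouse definition. A hedged sketch of the payload for the Databricks SQL Warehouses REST API (`POST /api/2.0/sql/warehouses`); the field names follow the current API, but the warehouse name and sizes are illustrative:

```json
{
  "name": "bi-directquery",
  "cluster_size": "Medium",
  "enable_serverless_compute": true,
  "enable_photon": true,
  "min_num_clusters": 1,
  "max_num_clusters": 4,
  "auto_stop_mins": 10
}
```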
2. Model the fact right
The single largest DirectQuery performance determinant is the shape of the underlying tables.
Star schema, not snowflake
Power BI's query engine optimizes specifically for stars. Snowflakes produce worse query plans even with a perfect warehouse.
- Facts in the middle, dimensions radiating out.
- Wide dimensions, narrow facts.
- Conformed dimensions (`dim_date`, `dim_customer`) shared across facts.
See model authoring standards for the full rule set.
Surrogate keys, integer-typed
Every relationship should be on an integer surrogate key. Strings compress worse and join slower; VertiPaq's dictionary encoding works best on small integer domains.
No calculated columns on the fact
Calculated columns materialize at refresh (Import) or per query (DirectQuery). Either way, the correct home for derived columns is upstream: a dbt model, a Databricks view, a metric view. They compress better and do not bloat query plans.
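A sketch of the upstream alternative, with illustrative table and column names:

```sql
-- Derived column computed once in the warehouse instead of as a
-- Power BI calculated column. (The prod.gold names are illustrative.)
CREATE OR REPLACE VIEW prod.gold.vw_fct_orders AS
SELECT
  *,
  amount - cost AS margin
FROM prod.gold.fct_orders;
```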
3. Make the warehouse help
The warehouse can only go as fast as the layout of the underlying data lets it. Four knobs.
Liquid clustering on fact tables
```sql
ALTER TABLE prod.gold.fct_orders CLUSTER BY (customer_id, order_date);
```
Liquid clustering adapts as query patterns shift. Reserve partition_by for hard isolation requirements (GDPR region partitions); liquid clustering is the 2026 default.
Keep statistics current
```sql
ANALYZE TABLE prod.gold.fct_orders COMPUTE STATISTICS FOR ALL COLUMNS;
```
Run this after a schema change or a large data load. Stale statistics make the optimizer pick wrong plans; no warehouse size fixes that.
Narrow the scan
Every DirectQuery visual emits SQL with a WHERE clause derived from the slicers and filters on the page. If those filters do not map to columns the warehouse can prune on, you scan the whole fact.
Audit: for every user-facing visual, check the compiled SQL in the query profile (Databricks SQL Editor → Query History → Query Profile). If the scan reads all 10B rows to aggregate one region, add a filter the warehouse can prune on, or extend the table's clustering keys to cover the filter column.
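A hedged way to audit pruning outside the profile UI, reusing the `fct_orders` clustering columns from earlier (the `amount` column is illustrative):

```sql
-- EXPLAIN shows whether the WHERE clause prunes files on the
-- clustering columns or forces a full scan of the fact table.
EXPLAIN
SELECT customer_id, SUM(amount) AS total
FROM prod.gold.fct_orders
WHERE order_date >= '2026-01-01'
GROUP BY customer_id;
```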
Respect the row budget
A DirectQuery visual's result set is capped by Power BI (default 1M rows). If you return more, the visual silently truncates. Keep aggregations in the SQL, not in DAX.
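As a sketch (table and column names assumed), a plain aggregate measure folds into the generated SQL's GROUP BY, so the warehouse returns a handful of grouped rows instead of a million detail rows:

```dax
-- Folds to SELECT ..., SUM(amount) ... GROUP BY ... in the generated
-- SQL; Power BI only renders the grouped result.
Total Sales := SUM ( fct_orders[amount] )
```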
4. Automatic aggregations
Turn on automatic aggregations for any DirectQuery semantic model with more than a few hundred million fact rows. Power BI watches query patterns, materializes the top-N aggregation tables in memory, and serves matching queries from cache.
Model view → Manage aggregations → Enable automatic aggregations
This is free performance if your model is a clean star. If it is not, aggregations will be subtly wrong; fix the model shape first.
Tip
After enabling automatic aggregations, let them bake for a week against real usage. The service chooses which aggregation tables to build based on actual query patterns, not your predictions. You get better recommendations from a week of prod traffic than from any amount of desktop modeling.
5. Dual on the dims
One table's storage mode is not the whole model's storage mode. Put your facts in DirectQuery and your dimensions in Dual. Slicer changes resolve instantly against in-memory dim copies; only fact scans round-trip.
See storage modes for the canonical deployment.
6. Shape the DAX
The five-rule DAX tour from the DAX reference matters more in DirectQuery than in Import. Specifically:
- `CALCULATE(SUM(x), filter)` is faster than `SUMX(FILTER(table, predicate), x)` because the first pushes the filter to storage.
- `DIVIDE()` instead of `/` handles zeros and is storage-engine-friendly.
- Measures, not calculated columns; `VAR` for anything reused.
- No bidirectional relationships; use `CROSSFILTER()` inside a specific measure when the behavior is truly local.
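A minimal sketch of the first rule, with illustrative table and column names:

```dax
-- Filter argument pushed to the storage engine; folds into the SQL.
West Sales :=
CALCULATE ( SUM ( fct_orders[amount] ), dim_region[region] = "West" )

-- Same result, but the FILTER iteration runs in the formula engine
-- and pulls far more rows back from the warehouse.
West Sales Slow :=
SUMX (
    FILTER ( fct_orders, RELATED ( dim_region[region] ) = "West" ),
    fct_orders[amount]
)
```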
7. Avoid the Databricks.Query trap
Danger
`Databricks.Query` (custom SQL) is not supported in DirectQuery. A model that uses it builds locally and publishes, then fails at query time in the service. Always model against `Databricks.Catalogs(...)` and let Power BI generate the SQL. Put custom SQL in a Databricks view; the connector reads it like any other table.
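The fix is mechanical; a hedged sketch with illustrative names:

```sql
-- Move the custom SQL into a view; the connector reads it like any
-- other table and DirectQuery query generation works normally.
CREATE OR REPLACE VIEW prod.gold.vw_orders_enriched AS
SELECT o.order_id, o.order_date, o.amount, c.segment
FROM prod.gold.fct_orders AS o
JOIN prod.gold.dim_customer AS c USING (customer_id);
```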
8. Measure it
"I tested it in the report" is not a performance check. Use real tools.
DAX Studio
- Server timings view: milliseconds per storage engine call, per formula engine step.
- Query plan view: the actual plan Power BI builds for a given query.
- Run common visuals' underlying DAX and read the timings before shipping.
Tabular Editor 3
- Best Practice Analyzer (BPA) rules flag anti-patterns before they hit prod. The default rule set is strong; extend with org-specific rules.
- Gate PRs on BPA: no new measures merge with open BPA violations.
Query history
In Databricks SQL → Query History, filter by the service principal your gateway uses. Sort by duration and look at the p95. A p95 above 3-5 seconds means at least one visual is doing work that a tune-up would eliminate.
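The same check can be scripted against Unity Catalog system tables, if they are enabled in your workspace. A hedged sketch; column names can vary by release, and the principal is a placeholder:

```sql
-- p95 query duration (seconds) over the last 7 days for the
-- gateway's service principal.
SELECT approx_percentile(total_duration_ms, 0.95) / 1000 AS p95_seconds
FROM system.query.history
WHERE executed_by = '<gateway-service-principal>'
  AND start_time >= date_sub(current_date(), 7);
```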
9. Monitoring
Production DirectQuery models need SLIs. A reasonable starter:
| Metric | Target |
|---|---|
| p95 visual render time | < 3s |
| p99 visual render time | < 8s |
| Warehouse queued-queries peak | 0 sustained |
| Refresh success rate (for hybrid models with Imported dims) | > 99.5% |
Wire these into your observability stack (Grafana, Datadog, or the Causeway-internal dashboard) as first-class SLOs, not post-hoc investigations.
Common mistakes
| Symptom | Root cause |
|---|---|
| Visuals take 10+ seconds | Cold warehouse, or DirectQuery scanning the full fact |
| One slicer change freezes the whole report | Dims in DirectQuery instead of Dual |
| Different visuals show different numbers | Databricks.Query source is bypassing Power BI's query generation |
| Everything slow after a Power BI upgrade | Connection flipped back to ODBC; check the driver |
| "Too many rows" errors | Aggregation happening in DAX instead of SQL; push it to the warehouse |
See also
- Storage modes — should you be on DirectQuery at all?
- Connectivity — ADBC is step zero.
- SQL warehouse reference — sizing and config.
- DAX reference — the rules for measures.