Power BI offers four storage modes. They are not interchangeable, and most production incidents trace back to a mode chosen for the wrong reason.
| Mode | Data sits in | Query path | When it fits |
|---|---|---|---|
| Import (VertiPaq) | Power BI in-memory | In-memory columnar scan | Most cases; default |
| DirectQuery | Databricks | SQL on every interaction | Data too large to import, or minute-level freshness |
| Dual | Both | Import when possible, DirectQuery when needed | Dimensions in mixed-mode models |
| Direct Lake | Fabric OneLake | Reads Delta files directly | You run Fabric + UC mirroring |
## Import (VertiPaq)
Data is pulled into Power BI's in-memory columnar engine, VertiPaq. Every visual interaction hits RAM.
Pros: fastest query experience by a wide margin.
Cons: pay with refresh time and capacity memory.
Use when: the model fits in capacity memory and users do not need freshness below the refresh cadence. This is still the right default for most Causeway reports; reach for anything else only when you have a specific reason.
## DirectQuery
Power BI generates SQL against Databricks on every visual interaction. No copy in VertiPaq.
Pros: always live; size-of-data is no longer a constraint.
Cons: every visual is a warehouse query. Performance is a shared responsibility between the model and the warehouse.
Use when:
- Data is too large to import (tens of TB of fact rows).
- Users need freshness below your refresh cadence (minute-level).
- Policy requires data to stay in Databricks.
If you choose DirectQuery, budget for the tuning work. See the DirectQuery tuning guide.
## Dual (the underrated mode)
A Dual table behaves as Import when it can and as DirectQuery when a visual needs live data.
Canonical deployment:
- Large fact tables → DirectQuery (too big to import; must stay live).
- Dimensions → Dual (small; slicers and filters resolve against in-memory copies).
- Small static fact tables → Import (no value in going remote for a 10k-row lookup).
Done right, this avoids the DirectQuery tax on the 90% of interactions that do not actually need it. Slicer changes resolve instantly against in-memory dim copies; only fact scans round-trip to Databricks.
> **Tip:** Dual is underused in mixed-mode models. If you are building a DirectQuery model and the user experience feels sluggish, the first question is always "are the dims in Dual?" If they are in DirectQuery or Import-only, switching them to Dual often recovers most of the lost interactivity for free.
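The dims-in-Dual pattern is easy to lint for. Below is a hedged sketch: the model metadata is a hypothetical dict rather than the real TOM/TMSL API, and the row threshold is illustrative; in practice you would read storage modes from the semantic model definition.

```python
# Hypothetical model metadata; not the real TOM/TMSL object model.
def lint_storage_modes(tables: dict) -> list:
    """Warn when a table's storage mode fights the canonical pattern:
    large facts -> DirectQuery, dimensions -> Dual, small facts -> Import."""
    warnings = []
    for name, t in tables.items():
        role, mode = t["role"], t["mode"]
        if role == "dimension" and mode != "dual":
            warnings.append(
                f"{name}: dimension in {mode}; switch to Dual so slicers "
                "resolve in-memory instead of round-tripping to the warehouse"
            )
        # 100k rows is an illustrative cutoff, not an official limit.
        if role == "fact" and t.get("rows", 0) < 100_000 and mode == "directquery":
            warnings.append(f"{name}: small fact in DirectQuery; Import is cheaper")
    return warnings

model = {
    "FactSales":   {"role": "fact", "mode": "directquery", "rows": 2_000_000_000},
    "DimCustomer": {"role": "dimension", "mode": "directquery"},
    "FactFxRates": {"role": "fact", "mode": "directquery", "rows": 10_000},
}
for w in lint_storage_modes(model):
    print(w)
```

Running a check like this before publishing catches the "sluggish DirectQuery model" case above before users do.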
## Direct Lake (Fabric)
Direct Lake is Microsoft Fabric's storage mode that reads Delta files directly from OneLake: no import refresh, no DirectQuery SQL. Power BI behaves as if the data were imported while it physically remains in Databricks-managed storage, exposed through Unity Catalog mirroring to OneLake (GA in 2026).
Use when:
- You run (or are adopting) Fabric capacity.
- Fact tables are large enough that Import refresh is painful.
- You want VertiPaq-like latency without paying VertiPaq memory costs.
Avoid when:
- Your Databricks deployment is AWS-only and you are not adopting Fabric.
- Your model is heavy on DAX calculated columns that Direct Lake does not support.
> **Warning:** Direct Lake silently falls back to DirectQuery when it hits an unsupported construct (certain DAX calculated columns, some expression shapes, row-level security on mirrored sources). DirectQuery against Delta-in-OneLake is slower than DirectQuery against DBSQL, so a model that falls back often is in the worst of both worlds. Tabular Editor 3 flags fallback risk; run it before publishing.
## The decision tree
Walk the questions in order. Stop at the first "yes".
1. Fits in capacity memory, acceptable refresh cadence? → Import.
2. Too big to import, or need minute-level freshness? → DirectQuery, with dims in Dual.
3. On Fabric capacity, and Import refresh is the pain point? → Direct Lake for the facts, Import or Dual for the dims.
4. Otherwise → Import; revisit if your answer to question 1 changes.
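The walk above can be sketched as a small function. The boolean inputs mirror the questions in order; the strings returned are shorthand for the recommendations, not official mode names.

```python
def choose_storage_mode(
    fits_in_memory: bool,
    acceptable_refresh_cadence: bool,
    needs_minute_freshness: bool,
    on_fabric: bool,
    import_refresh_painful: bool,
) -> str:
    # Walk the questions in order; stop at the first "yes".
    if fits_in_memory and acceptable_refresh_cadence:
        return "Import"
    if not fits_in_memory or needs_minute_freshness:
        return "DirectQuery facts + Dual dims"
    if on_fabric and import_refresh_painful:
        return "Direct Lake facts + Import/Dual dims"
    return "Import"  # revisit if the answer to question 1 changes
```

Note that the Direct Lake branch is only reached when the model fits in memory but the refresh cadence is the problem, which matches "Import refresh is the pain point" above.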
## Capacity memory vs. table size
"Fits in capacity memory" is the question at step 1. Rough sizing:
- Assume ~10x compression over raw Parquet size for a well-modeled star schema.
- A 50 GB Parquet fact compresses to ~5 GB in VertiPaq, comfortably within a P1/F64 capacity.
- A 500 GB Parquet fact is ~50 GB in VertiPaq, which pressures most capacities; that is your Dual/DirectQuery/Direct Lake fork.
The simplest way to know is to import a representative sample, check the model size in Desktop, and extrapolate.
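The arithmetic above is simple enough to encode. This sketch assumes the ~10x rule of thumb from the list; the capacity limit parameter is illustrative, so check your SKU's actual memory limit before relying on it.

```python
def estimate_vertipaq_gb(parquet_gb: float, compression: float = 10.0) -> float:
    """Rough VertiPaq footprint, assuming ~10x compression over Parquet
    for a well-modeled star schema (a rule of thumb, not a guarantee)."""
    return parquet_gb / compression

def sizing_fork(parquet_gb: float, capacity_limit_gb: float = 25.0) -> str:
    """capacity_limit_gb is a placeholder; substitute your SKU's limit."""
    est = estimate_vertipaq_gb(parquet_gb)
    if est <= capacity_limit_gb:
        return f"~{est:.0f} GB in VertiPaq: Import fits"
    return f"~{est:.0f} GB in VertiPaq: Dual/DirectQuery/Direct Lake fork"

print(sizing_fork(50))   # the 50 GB example from the list
print(sizing_fork(500))  # the 500 GB example from the list
```

As the text says, a sample import in Desktop beats any estimate; use this only for a first-pass triage.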
## What does not change the answer
Three reflexes teams have that should not push you off Import:
- "Users need fresh data." Define fresh. Minute-level is DirectQuery territory. Hour-level can run Import with an hourly refresh trigger from Databricks completion. See the Enhanced Refresh guide.
- "The table is big." Big needs a number. 10M rows is small; VertiPaq eats it. 10B rows is DirectQuery territory. In between, measure.
- "We want DirectQuery so we do not duplicate data." You are duplicating data either way: either VertiPaq has a compressed copy, or Databricks results are cached in DBSQL and ADBC buffers on every query. Duplication is not the trade; speed is.
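The "hourly refresh trigger from Databricks completion" pattern in the first bullet amounts to one REST call at the end of the Databricks job. A hedged sketch follows: the workspace and dataset GUIDs are placeholders, and while the URL shape follows the Power BI REST API "Refresh Dataset In Group" call and the body follows the enhanced-refresh options, verify both against the current API docs.

```python
import json

def build_refresh_request(workspace_id: str, dataset_id: str):
    """Construct the Power BI refresh endpoint URL and an
    enhanced-refresh body (full refresh, applied transactionally)."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    )
    body = {"type": "full", "commitMode": "transactional"}
    return url, body

# Placeholder IDs; substitute your workspace and dataset GUIDs.
url, body = build_refresh_request("WORKSPACE_GUID", "DATASET_GUID")
print(url)
print(json.dumps(body))
# In the Databricks job's final task, POST this with an AAD bearer token:
#   requests.post(url, headers={"Authorization": f"Bearer {token}"}, json=body)
```

Wiring this into the job that lands the data keeps Import freshness tied to pipeline completion instead of a wall-clock schedule.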
## See also
- Connectivity — the driver layer that determines DirectQuery floor speed.
- DirectQuery tuning — the long list when you land on DirectQuery.
- Semantic models — the artifact that applies this choice.