This walkthrough gets you from a clean Power BI Desktop install to a published report reading from a Databricks gold table. It assumes you have used SQL before and are comfortable installing desktop software. It does not assume prior Power BI experience.

What you need

Install Power BI Desktop from the Microsoft Store or the download page. Launch it; pick Blank report.

1. Connect to Databricks

Home → Get data → Databricks (under Database). Enter the Server hostname and HTTP path, copied from your SQL warehouse's Connection details tab.

Note

In the 2026 connector, the Advanced options panel exposes the driver. For new connections prefer ADBC over ODBC (Simba). ADBC is Arrow-native end-to-end: lower connection latency, faster metadata discovery, dramatically faster row transfer on large result sets.

Authenticate. The canonical identity for BI users is OAuth (Microsoft Entra ID). The authentication window opens a browser; sign in with your work account.
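If the connector errors out, it helps to confirm reachability outside Power BI first. A minimal sketch using the `databricks-sql-connector` package; the hostname, HTTP path, and token are placeholders you would swap in, and the small helper just normalizes a hostname pasted as a URL:

```python
def normalize_host(server_hostname: str) -> str:
    """Strip scheme and trailing slash if the hostname was pasted as a full URL."""
    return server_hostname.removeprefix("https://").rstrip("/")

if __name__ == "__main__":
    from databricks import sql  # pip install databricks-sql-connector

    # Placeholder values -- copy yours from the warehouse's Connection details tab.
    with sql.connect(
        server_hostname=normalize_host("https://adb-1234567890123456.7.azuredatabricks.net/"),
        http_path="/sql/1.0/warehouses/abc123def456",
        access_token="dapi-...",
    ) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            print(cur.fetchone())  # a row back means auth and networking are fine
```

If this round-trips but the Power BI connector does not, the problem is in the connector configuration, not your network or credentials.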

2. Pick a table

After auth, you see the Unity Catalog navigator: catalogs → schemas → tables. Expand prod → gold and select a curated fact and its dimensions, say fct_orders and dim_customers.

Click Transform Data (not Load). The Power Query editor opens.

3. Clean up in Power Query

The only transformations that belong here are shape fixes that cannot live upstream: renaming columns for readability, correcting data types, dropping columns the report will never use.

Warning

Do not do business logic in Power Query. Every GROUP BY, every merged join, every derived column here bakes the logic into the report rather than the platform. It compresses worse than a Databricks-computed version, it hides from lineage, and it makes the model hard to hand off. If you are reaching for Power Query for calculations, stop; the answer is "push it into dbt or a Databricks view."

Click Close & Apply. The data loads into Power BI's in-memory VertiPaq engine.

4. Define relationships

Model view (left sidebar, third icon down). If relationships did not auto-detect, drag customer_id from fct_orders onto dim_customers.customer_id. Cardinality: many-to-one. Cross-filter direction: single.
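The many-to-one assumption is worth verifying in the data, not just asserting in the model. A hedged sketch with pandas, using toy frames standing in for fct_orders and dim_customers: `merge(..., validate="m:1")` raises a `MergeError` if customer_id is not unique on the dimension side:

```python
import pandas as pd

fct_orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": ["C1", "C1", "C2"],
    "amount_usd": [120.0, 80.0, 50.0],
})
dim_customers = pd.DataFrame({
    "customer_id": ["C1", "C2"],
    "region": ["EMEA", "AMER"],
})

# validate="m:1" raises MergeError if the right side has duplicate keys,
# i.e. if the relationship would not actually be many-to-one.
joined = fct_orders.merge(dim_customers, on="customer_id", validate="m:1")
print(joined["region"].tolist())  # ['EMEA', 'EMEA', 'AMER']
```

Running the same check in a Databricks notebook against the real tables catches duplicate dimension keys before Power BI silently models them as many-to-many.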

Do this for every dimension your fact references. Three rules:

5. Write one measure

In the report view, right-click fct_orders → New measure:

Total Revenue =
CALCULATE (
    SUM ( 'fct_orders'[amount_usd] ),
    'fct_orders'[status] = "completed"
)

Format: $#,##0. Category: Currency.

Two reflexes to build from the start: stage intermediate results in VARs, and divide with DIVIDE():

Avg Order Value =
VAR Revenue = [Total Revenue]
VAR Orders  = DISTINCTCOUNT ( 'fct_orders'[order_id] )
RETURN DIVIDE ( Revenue, Orders )

DIVIDE() instead of / returns BLANK rather than an error when the denominator is zero, so visuals stay clean.
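For intuition, both measures map onto plain filtered aggregation plus a guarded division. A Python sketch over a toy table (column names from the tutorial, data invented):

```python
orders = [
    {"order_id": 1, "status": "completed", "amount_usd": 120.0},
    {"order_id": 2, "status": "completed", "amount_usd": 80.0},
    {"order_id": 3, "status": "cancelled", "amount_usd": 50.0},
]

# Total Revenue: a SUM over rows passing a predicate, like CALCULATE with a filter.
total_revenue = sum(o["amount_usd"] for o in orders if o["status"] == "completed")

def divide(numerator, denominator, alternate=None):
    """Mimic DAX DIVIDE(): return an alternate result (BLANK) on zero, not an error."""
    return alternate if denominator == 0 else numerator / denominator

# Avg Order Value: DISTINCTCOUNT is a set size; the denominator is all orders
# in the current filter context, not just completed ones.
distinct_orders = len({o["order_id"] for o in orders})
avg_order_value = divide(total_revenue, distinct_orders)
```

Note that `divide(total_revenue, 0)` quietly yields `None` where `/` would raise, which is exactly the behavior you want propagating into a visual.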

6. Build one visual

Insert → Clustered bar chart. Drag dim_customers[region] to Y-axis, [Total Revenue] to X-axis. You now have a revenue-by-region chart fueled by an in-memory columnar store.

7. Save as PBIP (not .pbix)

File → Save As → change Save as type to Power BI project (.pbip). Pick a folder inside your Git-backed repository.

Important

Never commit .pbix files. They are binary and produce unreadable diffs. PBIP is the folder format: .Report holds visuals, .SemanticModel holds the model as TMDL (Tabular Model Definition Language), which is human-readable, diff-friendly, and review-friendly. Always save as PBIP for anything going into source control.
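The PBIP folders also contain per-user Desktop state that should stay out of Git. A plausible .gitignore fragment (the paths reflect the current PBIP layout; verify against what Desktop actually writes in your folder):

```gitignore
# Binary exports never belong in the repo
*.pbix

# Per-user Desktop state inside PBIP folders
**/.pbi/localSettings.json
**/.pbi/cache.abf
```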

8. Publish

Home → Publish. Pick a workspace (ideally a per-user dev workspace named workspace-dev-<you>). Power BI uploads the model plus the report.

The published model is called a semantic model (formerly "dataset"). Reports connect to semantic models via a live connection.

9. Schedule a refresh

In the browser, open the workspace → your semantic model → Settings → Scheduled refresh. Add credentials for Databricks (OAuth / service principal), set a daily refresh.

Tip

The UI scheduler is fine for learning. For anything real, drive refresh from your orchestrator via the Enhanced Refresh REST API so the dashboard cannot show stale data after a failed upstream pipeline. See the Enhanced Refresh guide.
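A minimal sketch of driving refresh from an orchestrator. The endpoint and body shape follow the Power BI Enhanced Refresh API; the workspace ID, dataset ID, and token acquisition are placeholders you would wire to your own auth:

```python
import json
import urllib.request

def refresh_request(group_id: str, dataset_id: str, token: str) -> urllib.request.Request:
    """Build the Enhanced Refresh POST; returning a Request lets callers send or inspect it."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{group_id}/datasets/{dataset_id}/refreshes"
    )
    # Enhanced refresh is triggered by passing options in the request body.
    body = {
        "type": "full",
        "commitMode": "transactional",
        "retryCount": 2,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = refresh_request("<workspace-id>", "<dataset-id>", "<aad-token>")
    urllib.request.urlopen(req)  # 202 Accepted starts the refresh; poll the refreshes endpoint for status
```

Gating this call on the success of the upstream Databricks job is what keeps the dashboard from ever refreshing against half-written tables.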

What just happened

You:

- connected Power BI Desktop to a Unity Catalog gold schema over OAuth,
- loaded a fact and a dimension into VertiPaq via Power Query,
- related them, wrote two measures, and built a visual,
- saved the project as diff-friendly PBIP, published it, and scheduled a refresh.

That is the whole development loop. The same commands scale from "revenue by region" to a governed, shared semantic model with thirty measures and five fact tables.

Next steps