Lakebase is Databricks' managed, serverless Postgres. It closes the one gap the Lakehouse had: low-latency transactional reads and writes without shipping data off-platform.

Under the hood: Postgres compute is decoupled from storage, data lives as open-format files in the lake, and sync tables mirror Delta into Lakebase and back without bespoke ETL. The 2026 Autoscaling generation added scale-to-zero, branching (yes, Postgres branching), and instant restore.

This guide is how to adopt Lakebase without burning yourself.

When to reach for it

Use Lakebase when you need Postgres wire-protocol access to lakehouse data:

  - Serving features, precomputed aggregates, or entity lookups to an application at millisecond latency.
  - Application state (sessions, queues, counters) that needs transactional reads and writes next to lakehouse data.
  - A hot, indexed slice of a Delta table behind a standard Postgres driver, without a separate database and ETL pipeline.

Do not use Lakebase for:

  - Warehouse-scale analytical scans and aggregations; that workload belongs on the SQL warehouse and Delta side.
  - Bulk ETL that could be a Delta write; row-by-row inserts through Postgres are the slow path.

The adoption ladder

Do not migrate an existing OLTP system on day one. The Databricks-recommended ladder is:

  1. Read-mostly synced copy. Mirror Delta into Lakebase with sync tables; let apps read from Postgres, keep writes on the existing system. Lowest risk; most teams stop here for quite a while.
  2. Hot-path mirror. Mirror a slice of existing OLTP into Lakebase; compare; hold the old system as fallback.
  3. Primary writes. Cut over once latency, governance, and dev velocity wins justify it.

Warning

Step 3 is a point-of-no-return migration. Run steps 1 and 2 long enough to gather real load, real latency numbers, and real cutover dry runs. Cutting over because "it works in dev" has wrecked more than one team's quarter.

1. Provision an instance

databricks lakebase create-instance --json '{
  "name": "prod-serving",
  "catalog": "prod",
  "schema": "serving",
  "size": "MEDIUM",
  "storage_size_gb": 100
}'

Sizes roughly match what RDS users expect:

| Size    | vCPUs | Memory | Max connections |
|---------|-------|--------|-----------------|
| Small   | 2     | 8 GB   | 100             |
| Medium  | 4     | 16 GB  | 200             |
| Large   | 8     | 32 GB  | 500             |
| X-Large | 16    | 64 GB  | 1000            |

Start at Medium unless you are confident the workload fits comfortably in Small. Scaling down is easier than scaling up.
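The connection-limit column deserves arithmetic before you pick a size. A minimal sketch of that budget math; the per-size limits come from the table above, and the reserve held back for admin sessions and sync workers is an assumed number:

```python
# Max connections per instance size, from the table above.
MAX_CONNECTIONS = {"Small": 100, "Medium": 200, "Large": 500, "X-Large": 1000}

def pool_budget(size, app_replicas, reserve=10):
    """Largest safe per-replica pool size (maxconn), keeping `reserve`
    connections free for admin sessions and sync workers."""
    budget = MAX_CONNECTIONS[size] - reserve
    if budget < app_replicas:
        raise ValueError(f"{size} cannot support {app_replicas} replicas")
    return budget // app_replicas
```

Eight app replicas on a Medium instance, for example, should cap their pools at 23 connections each; a default of 20 per replica across 10 replicas quietly totals 200 and hits the ceiling.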

2. Connect from your application

Connection string is standard Postgres:

postgresql://<username>:<password>@<lakebase-host>:<port>/<database>?sslmode=require

Use a connection pool. Lakebase has a finite connection limit (see the table above), and a Python app without a pool will open one connection per request.

import psycopg2.pool

pool = psycopg2.pool.ThreadedConnectionPool(
    minconn=5,
    maxconn=20,
    host="<lakebase-host>",
    port=5432,
    database="prod",
    user="<service-principal>",
    password="<token>",
    sslmode="require",
)
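The pool only helps if every borrowed connection comes back. psycopg2's pool API is getconn/putconn, which is easy to leak on the exception path; a small context manager (a sketch, not part of psycopg2 itself) keeps the pair symmetric:

```python
from contextlib import contextmanager

@contextmanager
def pooled_connection(pool):
    """Borrow a connection from the pool; always return it, even on error."""
    conn = pool.getconn()
    try:
        yield conn
    finally:
        pool.putconn(conn)

# Usage with the pool defined above:
# with pooled_connection(pool) as conn:
#     with conn.cursor() as cur:
#         cur.execute("SELECT id, email FROM customers_hot WHERE is_active")
```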

Rules of thumb:

  - Size maxconn so the sum across all app replicas stays well under the instance's max connections; leave headroom for admin sessions and sync workers.
  - Keep minconn small; idle connections still count against the limit.
  - Set a statement timeout so one slow query cannot pin a pooled connection indefinitely.

Note

For production, prefer service principal + OAuth for authentication over a long-lived password. The OAuth flow integrates with Unity Catalog identity, so you get the same grant-based access control as the rest of the platform.

3. Sync a Delta table into Lakebase

Sync tables mirror a Delta table into a Postgres table, incrementally:

-- Inside the Lakebase instance
CREATE SYNC TABLE customers_hot
  FROM prod.silver.customers
  WITH (
    sync_mode = 'INCREMENTAL',
    sync_schedule = 'EVERY 5 MINUTES',
    filter = 'is_active = true AND last_seen_at > CURRENT_DATE - INTERVAL 30 DAYS'
  );

The sync target is a regular Postgres table: you can index it, query it with the full Postgres dialect, and join it against application-owned tables.

From the application's perspective, the synced table is a read-only mirror. If you need to write back to Delta, that is a separate reverse sync: Postgres to Delta.

4. Indexes and schema migrations

Managed does not absolve you from Postgres fundamentals.

-- Standard Postgres indexing
CREATE INDEX idx_customers_email ON customers_hot (email);
CREATE INDEX idx_customers_active ON customers_hot (is_active) WHERE is_active = true;
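EXPLAIN is the ground truth for whether those indexes are actually used. A hypothetical CI helper; the parsing is naive string matching over EXPLAIN's text output, which is stable enough for a smoke test:

```python
def uses_seq_scan(plan_lines):
    """True if any node in an EXPLAIN text plan is a sequential scan."""
    return any("Seq Scan" in line for line in plan_lines)

# e.g. cur.execute("EXPLAIN SELECT * FROM customers_hot WHERE email = %s", (e,))
# plan = [row[0] for row in cur.fetchall()]
# assert not uses_seq_scan(plan), "email lookup is not using idx_customers_email"
```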

Danger

Do not skip connection pooling. A Lambda or Cloud Run function opening a fresh connection per invocation will exhaust the instance's connection limit in minutes under load. Managed Postgres still cares about connection pressure.

5. Branching and restore

Lakebase Autoscaling (March 2026 rollout) added two features that change how you think about dev / test OLTP:

Branching

# Create an instant branch from prod
databricks lakebase create-branch \
  --source prod-serving \
  --name pr-42-branch

# Point an app at the branch for integration tests
export DATABASE_URL=postgresql://.../pr-42-branch
# ... run tests ...

# Tear down
databricks lakebase delete-branch pr-42-branch

Branches share storage with the parent until diverged; they are cheap and fast to create. Per-PR integration tests against a real-shaped database become practical.

Instant restore

Point-in-time restore to any moment in the last 7 days (configurable). Recovering from a 2am incident is not a point-and-click affair: it is a literal databricks lakebase restore --at 2026-04-20T03:15:00Z, and you have a branch representing that moment.
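Computing "N minutes before the alert fired" by hand at 3am invites timezone mistakes. A small helper for producing the timestamp, assuming the --at flag accepts ISO-8601 UTC with a Z suffix as in the example above:

```python
from datetime import datetime, timedelta, timezone

def restore_point(minutes_before, now=None):
    """ISO-8601 UTC timestamp `minutes_before` minutes ago, for --at."""
    now = now or datetime.now(timezone.utc)
    t = now - timedelta(minutes=minutes_before)
    return t.strftime("%Y-%m-%dT%H:%M:%SZ")
```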

6. Governance

Lakebase tables live under Unity Catalog, same as Delta tables. Every GRANT you understand from the UC model applies:

GRANT USE CATALOG ON CATALOG prod TO `app-service`;
GRANT USE SCHEMA ON SCHEMA prod.serving TO `app-service`;
GRANT SELECT ON TABLE prod.serving.customers_hot TO `app-service`;

Audit trails land in system.access.audit. Lineage tracks from the Delta source through the sync table to every application that queries it.

7. Cost

Lakebase bills for compute time and storage. Two things matter:

  - Scale-to-zero: enable it for dev, test, and bursty instances so idle compute costs nothing.
  - Sync cadence: continuous sync keeps compute busy around the clock; if consumers can tolerate minutes of staleness, a scheduled sync is much cheaper.

Common mistakes

| Symptom | Root cause |
|---------|------------|
| App errors: "too many connections" | No connection pool, or pool size bigger than instance max. |
| Sync table rows stale | Sync schedule is slower than the consumer expects. Tighten the cadence or switch to continuous sync. |
| Slow queries despite small tables | No index on the column the query filters by. Create one; verify with EXPLAIN. |
| Instance costs more than expected | Continuous sync on a table that could be hourly, or scale-to-zero is disabled. |
| Schema drift between Delta and Postgres | Delta schema evolved; sync table did not. ALTER SYNC TABLE ... REFRESH SCHEMA. |
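The stale-rows symptom is the one worth automating. A sketch of an alert predicate; how you obtain last_synced_at (e.g. from a sync-status metadata query) is an assumption left to your setup:

```python
from datetime import datetime, timezone

def sync_lag_seconds(last_synced_at, now=None):
    """Seconds since the sync table last caught up with its Delta source."""
    now = now or datetime.now(timezone.utc)
    return (now - last_synced_at).total_seconds()

def is_stale(last_synced_at, schedule_seconds, slack=2.0, now=None):
    """Alert when lag exceeds the configured sync schedule by a slack factor.
    For the five-minute schedule above, that means alerting past ten minutes."""
    return sync_lag_seconds(last_synced_at, now) > schedule_seconds * slack
```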

See also