A raw AI agent reads files and writes files. An agent with MCP (Model Context Protocol) servers runs queries, inspects lineage, opens PRs, comments on issues, and reads dashboards — inside the same conversation that started in your editor. MCP is the glue that turns a chat window into a data-engineering copilot.
This guide walks through wiring the three MCP servers that most data creators want: Databricks managed, dbt Power User embedded, and GitHub. It covers the security model up front, because MCP servers are powerful and lazy configuration is dangerous.
What MCP is
MCP, introduced by Anthropic in 2024 and standardized through 2025, is a protocol for exposing tools to AI agents. A server publishes tools (functions with typed inputs and outputs). A client (the agent, usually via a VS Code extension or CLI) discovers and calls them.
Tools are not magic. They are regular programs — a Python script, a Node server, a Go binary — that speak the MCP protocol over stdio or HTTP. The agent negotiates, calls tools, and pipes the results back into its context.
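To make the shape concrete, here is a toy dispatcher for the two tool-related JSON-RPC 2.0 methods. It is a sketch, not a conformant server: the real protocol also includes an `initialize` handshake and capability negotiation, and the `lookup_table` tool and its canned reply are invented for illustration.

```python
# Hypothetical tool list a server would publish in response to tools/list.
TOOLS = [{
    "name": "lookup_table",
    "description": "Return the row count of a table (stubbed).",
    "inputSchema": {
        "type": "object",
        "properties": {"table": {"type": "string"}},
        "required": ["table"],
    },
}]

def handle(message: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and wrap the result in an envelope."""
    if message["method"] == "tools/list":
        result = {"tools": TOOLS}
    elif message["method"] == "tools/call":
        args = message["params"]["arguments"]
        # A real server would run the query here; this stub returns a canned answer.
        result = {"content": [{"type": "text", "text": f"{args['table']}: 42 rows"}]}
    else:
        return {"jsonrpc": "2.0", "id": message.get("id"),
                "error": {"code": -32601, "message": "method not found"}}
    return {"jsonrpc": "2.0", "id": message.get("id"), "result": result}

# Over stdio the transport is one JSON message per line: read a line,
# json.loads it, pass it to handle(), json.dumps the response, flush.
```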
Note
Treat an MCP server the same way you treat a Python package: understand what it does, know where it came from, pin its version, and audit its permissions. The convenience of "the agent can now run Unity Catalog functions" comes with the cost of "the agent can now run Unity Catalog functions."
The security model in one paragraph
When you wire an MCP server into an agent, the agent gets the server's permissions. If the server authenticates with a PAT that has SELECT * on every table in Unity Catalog, the agent can read every table. If the server has write access to GitHub, the agent can open PRs. Scope the server's credentials as tightly as possible, and treat the credentials as secrets — which they are.
Where configuration lives
Each agent reads MCP configuration from a different place. The shape is the same:
| Agent | Config file | Scope |
|---|---|---|
| Claude Code | ~/.claude/mcp.json (user), .mcp.json (project) | Project overrides user |
| Cursor | .cursor/mcp.json | Project only |
| Continue.dev | .continue/config.json → mcpServers | Project |
| Cline | Cline settings → MCP servers | User |
Project-scoped configuration beats user-scoped configuration for anything org-specific. User-scoped configuration is fine for personal utility servers (like a weather MCP, if you must).
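The "project overrides user" rule can be pictured as a dictionary merge keyed on server name. This is an illustration of the precedence, not Claude Code's actual loader, and the merge-by-name semantics are an assumption; check your agent's documentation.

```python
def merge_mcp_configs(user_cfg: dict, project_cfg: dict) -> dict:
    """Project-scoped servers shadow user-scoped servers with the same name."""
    servers = dict(user_cfg.get("mcpServers", {}))
    servers.update(project_cfg.get("mcpServers", {}))  # project wins on a name clash
    return {"mcpServers": servers}
```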
Wiring Databricks managed MCP
Databricks ships managed MCP servers in 2026. These expose Unity Catalog functions, Vector Search indexes, Genie conversations, and direct DBSQL query execution. Permissions flow through UC, so the agent queries what the authenticated user can query and nothing more.
The config
```json
{
  "mcpServers": {
    "databricks": {
      "url": "https://<workspace-host>/api/2.0/mcp/servers/catalog-name",
      "transport": "sse",
      "headers": {
        "Authorization": "Bearer ${DATABRICKS_TOKEN}"
      }
    }
  }
}
```
The server lives in Databricks' control plane; your agent connects over SSE with a bearer token. The token is a short-lived OAuth access token, not a long-lived PAT.
Important
Do not paste a long-lived PAT into this config. Use the Databricks OAuth flow and have the extension refresh the token automatically. A PAT in an MCP config leaks to wherever the config syncs, which is usually farther than you intended.
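For scripted setups, a short-lived token can come from the workspace's client-credentials (machine-to-machine) OAuth endpoint. The sketch below builds that request with the standard library; the `/oidc/v1/token` path and `all-apis` scope follow Databricks' documented M2M flow, but verify them against your workspace before relying on this.

```python
import base64
import urllib.parse
import urllib.request

def build_token_request(host: str, client_id: str, client_secret: str):
    """Build the client-credentials request for a short-lived access token."""
    url = f"https://{host}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}).encode()
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Basic {basic}",
                 "Content-Type": "application/x-www-form-urlencoded"})

# Usage (network call elided):
# req = build_token_request("dbc-abc123.cloud.databricks.com", cid, secret)
# token = json.load(urllib.request.urlopen(req))["access_token"]  # short-lived
```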
What the agent can do
- Query any UC table the user can read. Via DBSQL, not the warehouse UI.
- Call any UC function the user has `EXECUTE` on.
- Run Vector Search queries against UC-governed indexes.
- Open a Genie conversation for natural-language analytics.
What the agent cannot do
- Bypass UC permissions. If the user cannot `SELECT` from `main.pii.customer_email`, neither can the agent.
- Write to tables. Managed MCP is read-only by default in 2026.
- Execute arbitrary notebooks.
Scoping
Create a dedicated service principal for MCP use. Grant it USE CATALOG and USE SCHEMA on the catalogs and schemas the agent needs, and SELECT on the tables it needs. Nothing else.
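That grant set can be generated mechanically. The helper below is a hypothetical convenience (the principal, catalog, schema, and table names are placeholders); the emitted statements follow Unity Catalog's GRANT syntax, and you would run them through your SQL client of choice.

```python
def mcp_grants(principal: str, catalog: str, schema: str,
               tables: list[str]) -> list[str]:
    """Emit the least-privilege grant set for an MCP service principal."""
    stmts = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
    ]
    # SELECT only on the specific tables the agent needs -- nothing broader.
    stmts += [f"GRANT SELECT ON TABLE {catalog}.{schema}.{t} TO `{principal}`"
              for t in tables]
    return stmts
```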
Warning
The temptation to grant SELECT on main.* to get started is strong. Resist it. An agent with broad read grants is a compliance incident when the first prompt that leaks data arrives.
Wiring dbt Power User's embedded MCP
dbt Power User (innoverio.vscode-dbt-power-user) ships an embedded MCP server that exposes:
- The parsed project graph (models, sources, exposures, metrics).
- Compiled SQL for any model.
- The test runner (`dbt test --select ...`).
- Column-level lineage via Altimate's lineage engine.
- Documentation generation helpers.
The server starts when the extension activates. No separate process to manage.
The config (Cursor example)
```json
{
  "mcpServers": {
    "dbt": {
      "command": "uvx",
      "args": ["dbt-power-user-mcp"],
      "env": {
        "DBT_PROJECT_DIR": "${workspaceFolder}",
        "DBT_PROFILES_DIR": "${env:HOME}/.dbt"
      }
    }
  }
}
```
Alternative: the extension exposes the MCP endpoint on a local port; point the agent at that port.
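In the port-based variant, the entry might look like the following; the port and path here are placeholders, and the extension's actual endpoint should be taken from its settings or output panel.

```json
{
  "mcpServers": {
    "dbt": {
      "url": "http://localhost:7890/sse",
      "transport": "sse"
    }
  }
}
```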
What the agent gains
- "Which models consume
stg_subscription_events?" — the agent queries the lineage graph. - "Is this column used downstream?" — the agent checks column-level lineage.
- "Generate tests for
int_customer_months." — the agent reads the schema, proposes tests. - "Compile this model and show me the SQL." — the agent calls the compile tool.
Scoping
The server inherits whatever dbt target the workspace is configured for. Use a read-only dev target — a user with SELECT on the dev schemas and no write grants anywhere. An agent that can accidentally `dbt run` against prod is an agent you will regret.
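A read-only dev target in profiles.yml might look like this sketch, using the dbt-databricks adapter's profile fields; the profile name, host, warehouse path, schema, and env var are all placeholders.

```yaml
analytics:
  target: dev
  outputs:
    dev:
      type: databricks
      host: dbc-abc123.cloud.databricks.com
      http_path: /sql/1.0/warehouses/abc123
      schema: dev_scratch
      token: "{{ env_var('DBT_READONLY_TOKEN') }}"  # credential with SELECT-only grants
      threads: 4
```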
Wiring GitHub MCP
The GitHub MCP server (Anthropic's official one, or the community github-mcp) exposes:
- Issue and PR listing, searching, commenting.
- File reads from any repo the user has access to.
- Branch creation, commits, PR opening.
- Actions workflow listing and re-runs.
The config
```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${env:GITHUB_TOKEN}"
      }
    }
  }
}
```
Scoping
The token's scope limits the agent's capabilities. Start narrow:
- Read-only (`public_repo`, `read:org`): the agent can read but not write. Safe default.
- Commit/PR (`repo`): the agent can push branches and open PRs. Grant only for repos where agent-authored PRs are wanted.
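A setup script can enforce this before wiring the token in. GitHub's REST API reports a classic token's scopes in the `X-OAuth-Scopes` response header; the sketch below parses that header value (the forbidden set mirrors the warning below, and fetching the header itself is left to your HTTP client).

```python
FORBIDDEN = {"admin:org", "delete_repo"}

def parse_scopes(header_value: str) -> set[str]:
    """Split an X-OAuth-Scopes header like 'repo, read:org' into a set."""
    return {s.strip() for s in header_value.split(",") if s.strip()}

def token_ok(header_value: str) -> bool:
    """Reject tokens carrying scopes no agent should ever hold."""
    return not (parse_scopes(header_value) & FORBIDDEN)
```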
Warning
Never give an agent a token with `admin:org` or `delete_repo`. No prompt ever needs those permissions, and the damage from a rogue tool call is catastrophic.
Other MCP servers worth knowing
- Slack — read messages, post to channels. Useful for "summarize today's #data-incidents."
- Jira — query issues, update ticket status. Useful for data-ops workflows.
- Snowflake — same pattern as Databricks managed MCP but for Snowflake.
- Postgres / MySQL — generic SQL MCP servers that take a connection string and a read-only user.
- Filesystem — a local filesystem MCP (built into most agents already) for repos the agent should read outside the current workspace.
Testing an MCP server
Before trusting an MCP server, inspect what tools it publishes:
```shell
npx @modelcontextprotocol/inspector <server-command>
```
The inspector opens a web UI showing every tool the server exposes, its inputs, and its outputs. Confirm the list matches what you expect. A server that publishes execute_sql when you thought it only did read_table is a server you do not yet understand.
A full example: Claude Code with three servers
~/.claude/mcp.json:
```json
{
  "mcpServers": {
    "databricks": {
      "url": "https://dbc-abc123.cloud.databricks.com/api/2.0/mcp/servers/main",
      "transport": "sse",
      "headers": {
        "Authorization": "Bearer ${DATABRICKS_TOKEN}"
      }
    },
    "dbt": {
      "command": "uvx",
      "args": ["dbt-power-user-mcp"],
      "env": {
        "DBT_PROJECT_DIR": "${workspaceFolder}"
      }
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${env:GITHUB_TOKEN}"
      }
    }
  }
}
```
Now a prompt like:
> Find the dbt model that owns the `customer_ltv` column. Check whether the latest PR touched it. Compile it and confirm the SQL uses our standard subscription join.
exercises all three servers in one turn.
Version pinning
MCP servers ship as npm packages, Python packages, or container images. All drift. Pin versions:
"args": ["-y", "@modelcontextprotocol/server-github@1.4.2"]
Tip
Every MCP server update is a potential behavior change. An agent whose tools behave differently is an agent whose outputs become unreproducible. Pin versions; upgrade deliberately.
Governance
A paved-path MCP configuration in an org looks like:
- A published template (`mcp.json.template`) with the approved servers and the env-var placeholders.
- OAuth flows everywhere secrets are involved. No PATs in configs.
- Per-project service principals with least-privilege grants in UC / GitHub / Slack.
- A quarterly audit of which servers are deployed and what tools they publish. Servers that no one uses or whose scope has crept beyond need get removed.
- A "block list" of known-bad MCP servers the org should not install.
Important
Every new MCP server in your config is a new row in your attack surface. Treat additions the way you treat adding a new dependency to production: proposed, reviewed, approved.
Anti-patterns
- "Give the agent a root token so it can do everything." It can also break everything.
- Checking `mcp.json` with a real bearer token into Git. The token is now visible to everyone with repo access, forever, via the Git history.
- Running an MCP server from a random GitHub repo without auditing it. The server is code; audit it before wiring it into every agent session.
- Pointing the agent at a prod dbt target. Inevitable bad day.
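The first two anti-patterns are mechanically detectable. A pre-commit style check can refuse any mcp.json containing what looks like a literal credential rather than an `${ENV_VAR}` placeholder; the token shapes below are illustrative (`ghp_` is the GitHub classic-PAT prefix, `dapi` the Databricks PAT prefix), and you would extend the pattern for your own providers.

```python
import re

# Literal-credential shapes; env-var placeholders like ${GITHUB_TOKEN}
# contain characters outside these classes and therefore pass.
LITERAL_SECRET = re.compile(
    r"ghp_[A-Za-z0-9]{20,}|dapi[0-9a-f]{16,}|Bearer [A-Za-z0-9._-]{20,}")

def config_is_clean(text: str) -> bool:
    """True if no literal credential appears in the config text."""
    return LITERAL_SECRET.search(text) is None
```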
See also
- AI agents — the agents that consume MCP.
- Workspace standards — how MCP config fits the paved path.
- Debugging — how to troubleshoot MCP tool failures.