AI agents are the most talked-about and least well-understood category of developer tooling. By 2026 the landscape has settled into distinct roles. Most experienced data creators run two agents: one for autocomplete, one for multi-file reasoning. Choosing the pair well matters more than getting any single pick right.
The two jobs
Every agent performs one or both of:
- Moment-to-moment autocomplete. Finish the line, suggest the next line, scaffold a function from a signature. Sub-200 ms latency or nothing. You see dozens of suggestions per hour; ~5% of them are accepted.
- Multi-file reasoning. "Add a new silver table that joins subscriptions to customers. Update the dbt YAML, the tests, the docs, and the `schema.yml`." Minutes per response, not milliseconds. The agent reads ten files, writes three, runs one test, and reports.
Different agents excel at different jobs. Treat the choice as a pair, not a single decision.
The options
GitHub Copilot
Pros: native VS Code integration, strong autocomplete, ubiquitous in every IDE, cheap ($10/mo per user), org-wide license management through GitHub. The coding agent feature (2025) converts GitHub issues directly into PRs with AI-authored changes. Handles the autocomplete half well.
Cons: multi-file reasoning is weaker than Claude Code's on complex refactors. The agent ends its turns earlier and commits less reliably.
When to pick it: every team gets Copilot as the baseline. It is the lowest-friction agent to adopt and the one your org probably already licenses.
Claude Code
Pros: highest SWE-bench Verified scores. 1M-token context window on Opus handles entire medium repos in a single prompt. Rich hook and MCP ecosystem; configurable skills, subagents, and orchestration. Leads on complex multi-file tasks: "rewrite this dbt project," "refactor this operator," "generate a data contract and all its fixtures."
Cons: no BYOM — you use Anthropic's models or nothing. Terminal-first; the VS Code integration is strong but not native. Priced per-seat at a premium over Copilot.
When to pick it: when the team does substantial refactoring, scaffolding, or architectural work and the quality of multi-file edits matters more than saving $15/mo per seat.
Cursor
Pros: dedicated VS Code fork (not an extension). Supermaven-powered autocomplete is the fastest on the market. Composer mode handles multi-file edits with an inline review UX that no other agent matches. BYOM — plug in Claude, GPT-5, Gemini, or a private endpoint. Most commercially successful AI IDE as of 2026.
Cons: it is a fork — every VS Code update you want requires Cursor to ship its own update. Extension compatibility is high but not perfect. Lock-in risk if the whole team standardizes on it.
When to pick it: when the team is willing to move to a VS Code fork to get the best autocomplete and Composer UX, or when BYOM against a private model endpoint is a hard requirement.
Continue.dev
Pros: fully open-source VS Code extension. BYOM; runs against any OpenAI-compatible endpoint, including a private vLLM deployment or Bedrock. The right choice for regulated environments where code cannot leave the VPC.
Cons: less polish than commercial agents. Autocomplete quality depends on the model you point it at; most on-prem models are weaker than frontier ones.
When to pick it: when your org cannot send code to external vendors and you have a competent platform team to host the model and maintain the integration.
Cline
Pros: open-source autonomous VS Code agent with 5M+ installs. Strong at agentic multi-step tasks — it reads, writes, runs the terminal, and iterates. Pairs with BYOM similarly to Continue.
Cons: autonomous agents demand careful prompt design and guardrails; unattended runs can make decisions the user disagrees with. Not a good default for the data-creator who wants a copilot, not a colleague.
When to pick it: when the workflow rewards long-running autonomy (scaffolding an entire repo, migrating a project, running through a backlog of mechanical changes).
Amazon Q
Pros: AWS-native. Knows your Redshift cluster, S3 buckets, Glue jobs. Queries the AWS console in natural language.
Cons: narrower scope outside AWS-specific work. Not the general-purpose agent most data creators want.
When to pick it: when your work is predominantly AWS infra-adjacent (DevOps-leaning data engineering).
The decision matrix
| Your constraint | Pick |
|---|---|
| Cheap, ubiquitous, handles autocomplete well | Copilot |
| Best multi-file refactors; code can go to Anthropic | Claude Code |
| Best autocomplete and Composer UX; willing to move to a fork | Cursor |
| Code cannot leave the VPC; have platform team for hosting | Continue.dev against a private model |
| Long-running autonomous tasks | Cline |
| Mostly AWS infra work | Amazon Q |
The pragmatic pattern
Most experienced data creators in 2026 run a two-agent stack:
- Copilot or Cursor for autocomplete. Always-on, every keystroke, barely noticed.
- Claude Code or a BYOM equivalent for non-trivial tasks. Summoned explicitly for refactors, scaffolding, contract authoring, failure triage.
The two agents do not compete; they divide labor. The hybrid is measurably more productive than either alone, and it is still cheaper than paying for every agent on the market.
Skills: the underappreciated half
A generic agent is worth far less than a well-briefed one. What turns a generic agent into a paved-path tool is skills: small, reusable, organization-specific instructions that encode how your team works.
Skill examples:
- "Create a new dbt model following our Causeway conventions." Tells the agent to scaffold against `stg_`/`int_`/`fct_`/`dim_` naming, add a `schema.yml` entry, and wire in the default `not_null` + `unique` tests.
- "Author a new data contract." Encodes the contract schema, the mandatory fields, and where to put the YAML.
- "Scaffold a new Databricks job from the paved-path template." Produces a `databricks.yml` and the matching notebook or Python entry point.
- "Triage this failed dbt build." Tells the agent to read `target/run_results.json`, group failures by type, and propose a fix per group.
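The triage skill's first step can be mechanized. A minimal sketch of grouping failures out of dbt's `run_results.json` artifact, assuming the standard `results[].status` / `results[].unique_id` / `results[].message` fields:

```python
import json
from collections import defaultdict

def group_failures(path="target/run_results.json"):
    """Group failed dbt results by node type (model, test, snapshot, ...)."""
    with open(path) as f:
        results = json.load(f)["results"]
    groups = defaultdict(list)
    for r in results:
        if r["status"] in ("error", "fail"):
            # unique_id looks like "test.my_project.not_null_orders_id"
            kind = r["unique_id"].split(".", 1)[0]
            groups[kind].append((r["unique_id"], r.get("message", "")))
    return dict(groups)
```

An agent following the skill would then propose one fix per group rather than one per failure.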
Skills live in `.claude/skills/` (Claude Code), `cursor/skills/` (Cursor), or `.continue/rules/` (Continue). They ship with the repo, get versioned, and evolve under PR review. The platform team publishes the paved-path skills the same way they publish templates.
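A skill is just a short instruction file. A sketch of the dbt-model skill in Claude Code's layout, with illustrative names and conventions standing in for your team's real ones:

```markdown
<!-- .claude/skills/new-dbt-model/SKILL.md (illustrative) -->
---
name: new-dbt-model
description: Scaffold a dbt model following our Causeway conventions
---

When asked to create a new dbt model:

1. Name it with the stg_/int_/fct_/dim_ prefix matching its layer.
2. Add an entry to the layer's schema.yml with a description.
3. Add not_null and unique tests on the primary key.
4. Run `dbt build --select <model>` and report the result.
```

Because the file lives in the repo, tightening a convention is a one-line PR that upgrades every future agent session.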
Important
Do not ask an agent to design your conventions. Ask it to follow them. Skills are how conventions become machine-executable. If every new dbt model your team ships starts from a shared skill, the "junior engineer writes a novel model layout" failure mode disappears.
MCP servers: the tooling half
Agents need tools. An agent that can only read local files and write local files is half the agent it could be. MCP (Model Context Protocol) is the 2025 standard for exposing tools to agents. By 2026, the data-engineering ecosystem supports MCP extensively:
- Databricks managed MCP (Unity Catalog functions, Vector Search, Genie, DBSQL).
- dbt Power User embedded MCP (project graph, compiled SQL, test runner).
- GitHub, Jira, Slack MCP servers from the community.
Configure MCP servers alongside skills. See MCP servers for the wiring.
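As a sketch, a project-scoped `.mcp.json` for Claude Code might register a dbt MCP server like this; the server name, launch command, and env var are illustrative, so check the server's own docs:

```json
{
  "mcpServers": {
    "dbt": {
      "command": "uvx",
      "args": ["dbt-mcp"],
      "env": { "DBT_PROJECT_DIR": "." }
    }
  }
}
```

Checking the config into the repo gives every teammate (and every agent session) the same tool surface.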
Governance and security
AI agents change the security model:
- Code egress. Every prompt you send to a hosted model leaves your laptop. If the prompt includes a sensitive file or a secret, it may end up in the vendor's logs or, depending on your data-retention terms, its eval or training sets.
- Agent actions. Autonomous agents run shell commands, write files, and commit to branches. Treat their output the same way you treat a junior engineer's output: reviewable, reversible, never pushed to main without review.
- MCP server permissions. The agent inherits every permission the MCP server is configured with. An MCP server with `SELECT *` on Unity Catalog lets the agent read every governed table. Scope aggressively.
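In Unity Catalog terms, aggressive scoping means granting the MCP server's identity read access to one schema rather than the whole catalog. A sketch, with an illustrative principal and schema:

```sql
-- Illustrative: narrow read access for the MCP server's service principal.
GRANT USE CATALOG ON CATALOG analytics TO `mcp-agent`;
GRANT USE SCHEMA ON SCHEMA analytics.silver TO `mcp-agent`;
GRANT SELECT ON SCHEMA analytics.silver TO `mcp-agent`;
```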
Warning
Never paste a secret into an agent prompt, even "just to check." The secret lands in the model provider's logs. Rotate any secret that leaked into an AI context, and treat the leak as a security event.
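A cheap guardrail is to scan text for secret-shaped strings before it goes to a hosted model. A minimal sketch with illustrative patterns; it is no substitute for rotation when a leak happens:

```python
import re

# Illustrative patterns only -- real scanners ship hundreds of rules.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                       # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                    # GitHub personal access token
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"), # PEM private key
]

def contains_secret(text: str) -> bool:
    """Return True if any known secret pattern appears in the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```

Wiring a check like this into a pre-send hook turns "never paste a secret" from a policy into a mechanism.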
The skill of using agents
Agents are force multipliers, not replacements. The engineers who extract the most value from them share three habits:
- Small, scoped asks. "Add an incremental model for subscription events" beats "Rewrite the entire Silver layer."
- Review every diff. The agent will confidently make the wrong call 10% of the time. Do not merge what you did not read.
- Teach the agent your context. Skills, `CLAUDE.md` or equivalent project instructions, and MCP servers. The more the agent knows, the better it performs, and that work compounds across every session.
Treat the agent the way you treat any other power tool: it makes a skilled operator faster, and an unskilled operator dangerous.
See also
- MCP servers — wiring agents to real platform tools.
- The extension ecosystem — where Copilot, Continue, and Cline sit in the extension space.
- Workspace standards — how a paved path encodes agent configuration.