Airflow offers three mechanisms for expressing "run this after that". They look interchangeable in documentation; they are not interchangeable in production.
| Mechanism | Scope | Trigger |
|---|---|---|
| Direct (`A >> B`) | Intra-DAG | DAG schedule |
| Asset | Cross-DAG, within Airflow | Producer DAG writes the Asset |
| AssetWatcher | Cross-system, outside Airflow | External event (SQS, Kafka, S3, UC table) |
Pick the lightest mechanism that expresses what you actually need.
## Direct dependencies
The classic Python DAG definition with the `>>` operator:

```python
with DAG(...) as dag:
    extract = ...
    transform = ...
    load = ...

    extract >> transform >> load
```
Use when:
- A and B are tightly coupled steps of one logical pipeline.
- You want failure to roll state together; one break fails the whole DAG.
- They share a scheduling cadence.
Do not use for:
- Cross-team coordination; a DAG cross-linked this way couples release cycles.
- Many-to-many fan-in; the DAG becomes unreadable.
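For intuition, `>>` is nothing more than operator overloading that records an edge in the task graph. A minimal, hypothetical sketch of the pattern (not Airflow's actual classes):

```python
class Task:
    """Toy stand-in for an Airflow task: `>>` records a dependency edge."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b means "b runs after a"
        self.downstream.append(other)
        return other  # returning `other` is what makes a >> b >> c chain

extract, transform, load = Task("extract"), Task("transform"), Task("load")
extract >> transform >> load

print([t.task_id for t in extract.downstream])  # → ['transform']
```

Returning the right-hand operand is the whole trick: each `>>` in a chain adds one edge and hands the next operand along.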
## Asset dependencies
Event-driven, cross-DAG. DAG B schedules on an Asset. DAG A's final task updates that Asset. When A writes, B triggers.
```python
# Producer (DAG A)
from airflow.sdk import Asset  # Airflow 3; in 2.x this was airflow.datasets.Dataset

orders_gold = Asset("s3://data-lake-prod/gold/fct_orders/_delta_log/")

with DAG(dag_id="refresh_gold_orders", schedule="@hourly") as dag_a:
    transform = ...
    # The final task declares the Asset as an outlet,
    # e.g. SomeOperator(..., outlets=[orders_gold]);
    # its successful completion is what updates the Asset.
    write = ...
```

```python
# Consumer (DAG B)
with DAG(dag_id="refresh_dashboard", schedule=[orders_gold]) as dag_b:
    refresh_powerbi = ...
```
Use when:
- A and B belong to different teams or domains.
- B has many upstream As (fan-in).
- You want B to run as soon as A finishes, not on a clock.
This is the canonical pattern for:
- dbt build → Power BI refresh.
- Silver → gold handoffs across teams.
- Cross-project triggering where each project owns its own DAG.
> **Note**
>
> Assets replaced the old `ExternalTaskSensor` pattern, which held a worker slot while it polled an upstream DAG. `ExternalTaskSensor` cost executor capacity and coupled teams to the same scheduling interval. Assets cost nothing to wait on; the scheduler triggers DAG B only when the Asset actually updates.
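The fan-in semantics (trigger the consumer once every Asset in its schedule has updated since its last run) fit in a few lines of plain Python. This is a toy model of the scheduler's bookkeeping, not Airflow code:

```python
def should_trigger(updated_since_last_run, required_assets):
    """Default AND semantics: the consumer fires only when every Asset
    in its schedule has been updated since the consumer's last run."""
    return required_assets <= updated_since_last_run  # subset check

required = {"orders_gold", "customers_gold", "products_gold"}
seen = set()

seen.add("orders_gold")
seen.add("customers_gold")
assert not should_trigger(seen, required)  # still waiting on products_gold

seen.add("products_gold")
assert should_trigger(seen, required)      # all three updated: trigger DAG B
```

Note how the three producers never coordinate with each other; the consumer's declared schedule is the only place the fan-in lives.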
## AssetWatcher dependencies
Event-driven, from outside Airflow. DAG B schedules on an Asset whose AssetWatcher listens to an external event stream (SQS, Kafka, S3 events, Unity Catalog table updates).
```python
# One plausible shape (Airflow 3): the watcher wraps an event-driven
# trigger and is attached to an Asset. Here, S3 events for the
# "orders/" landing prefix are assumed to be routed to an SQS queue;
# the queue URL is illustrative.
from airflow.providers.common.messaging.triggers.msg_queue import MessageQueueTrigger
from airflow.sdk import Asset, AssetWatcher

trigger = MessageQueueTrigger(
    queue="https://sqs.eu-west-1.amazonaws.com/123456789012/causeway-landing-orders",
)

new_files = Asset(
    "new_orders_in_landing",
    watchers=[AssetWatcher(name="new_orders_watcher", trigger=trigger)],
)

with DAG(dag_id="process_new_orders", schedule=[new_files]) as dag:
    process = ...
```
Use when:
- The upstream system is not Airflow (vendor SaaS, an ops team, a data producer you do not orchestrate).
- Latency matters: react within seconds, not on a polling interval.
- The trigger is not periodic.
This replaces the old `poke_interval=30` sensor pattern. Watchers subscribe to real events; they do not burn worker slots polling.
## Decision framework
Walk these questions in order:

1. Are A and B in the same logical pipeline, same cadence, same ownership? → direct dependency.
2. Is the producer another Airflow DAG? → Asset.
3. Is the producer outside Airflow entirely? → AssetWatcher.
Reach for TriggerDagRunOperator only when none of the above fits. It re-introduces the tight coupling through the back door.
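The walk above can be written down as a tiny helper. Purely illustrative; the function name and flags are made up:

```python
def choose_mechanism(same_pipeline: bool, producer_is_airflow: bool) -> str:
    """Walk the decision questions in order; return the lightest mechanism."""
    if same_pipeline:
        return "direct (>>)"      # one DAG, one cadence, one owner
    if producer_is_airflow:
        return "Asset"            # cross-DAG coordination inside Airflow
    return "AssetWatcher"         # external event source

print(choose_mechanism(same_pipeline=False, producer_is_airflow=True))  # → Asset
```

The ordering matters: each question only applies once the previous one has been answered "no", which is exactly why the framework says to walk them in order.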
## The anti-pattern: sensors
The old poke-mode sensor pattern, which loops "poke until true", is the most common bug in Airflow code in the wild. It held a worker slot the entire time it waited. At scale, sensors starved real work.
In 2026:
- Use deferrable (async) operators and sensors. `*Async` variants, or `deferrable=True`, release the worker slot via the Triggerer process.
- Replace polling sensors with AssetWatchers when the trigger is truly external. A watcher on an SQS queue is more reliable and cheaper than a sensor poking every 30 seconds.
> **Danger**
>
> A synchronous sensor with no timeout is a production incident in waiting. It holds a worker forever, starves other tasks, and produces no useful log until it eventually times out or someone kills the DAG. Deferrable sensors are not optional; they are the baseline. Older tutorials show the synchronous pattern; ignore them.
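A toy model of why deferral scales, with no Airflow involved: a single event loop, standing in for the Triggerer, multiplexes many waits in one thread, where poke-mode sensors would each pin a worker slot for the full wait:

```python
import asyncio
import time

async def wait_for_event(name, delay):
    # Stand-in for a deferred trigger: awaiting yields control back
    # to the loop instead of blocking a worker thread.
    await asyncio.sleep(delay)
    return name

async def triggerer():
    # 100 concurrent waits share one thread; a poke-mode sensor farm
    # would need 100 worker slots for the same job.
    waits = [wait_for_event(f"sensor-{i}", 0.1) for i in range(100)]
    return await asyncio.gather(*waits)

start = time.monotonic()
done = asyncio.run(triggerer())
elapsed = time.monotonic() - start

print(len(done), f"waits in ~{elapsed:.2f}s")  # ~0.1s total, not 100 x 0.1s
```

The real Triggerer works the same way: deferred tasks park an awaitable on its event loop and reclaim a worker slot only when the trigger fires.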
## The TaskGroup distinction
TaskGroups look like a dependency mechanism but are not. A TaskGroup is a visual grouping of tasks inside one DAG; dependencies between groups are still direct >> dependencies under the hood.
```python
with TaskGroup("ingest") as ingest:
    extract = ...
    validate = ...

with TaskGroup("transform") as transform:
    enrich = ...
    dedupe = ...

ingest >> transform
```
Use TaskGroups for readability on DAGs with more than ~10 tasks. They are not a cross-DAG mechanism; for that you want Assets.
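To see what "direct under the hood" means, here is a toy expansion of group-to-group `>>`. For flat groups like the ones above, every task is both a root and a leaf, so the group edge expands to the full cross product of plain task edges. Hypothetical classes, not Airflow's:

```python
class Group:
    """Toy TaskGroup: `>>` between groups expands into ordinary task edges."""

    def __init__(self, tasks):
        self.tasks = tasks

    def __rshift__(self, other):
        # In real Airflow, the leaves of this group are wired to the
        # roots of the downstream group; for flat groups that is the
        # full cross product.
        return [(a, b) for a in self.tasks for b in other.tasks]

ingest = Group(["extract", "validate"])
transform = Group(["enrich", "dedupe"])

edges = ingest >> transform
print(edges)
# → [('extract', 'enrich'), ('extract', 'dedupe'),
#    ('validate', 'enrich'), ('validate', 'dedupe')]
```

Nothing cross-DAG happens here; the group is sugar over the same intra-DAG edge list.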
## Assets vs. cross-DAG TriggerDagRunOperator
A common pattern in code written before Assets existed:
```python
# The old way; still legal but usually wrong
trigger_downstream = TriggerDagRunOperator(
    task_id="trigger_dashboard",
    trigger_dag_id="refresh_dashboard",
)
```
Problems:
- Tight coupling: the producer knows the consumer's DAG ID.
- Breaks fan-in: if three producers trigger one consumer, each fires separately; no aggregation.
- Hard to reason about from the consumer side: "what triggers my DAG?" requires searching all producer DAGs.
Assets invert the dependency:
- The consumer declares what it depends on.
- Producers declare what they publish.
- The scheduler figures out the triggering.
Use Assets over TriggerDagRunOperator in every new design. Keep TriggerDagRunOperator for legacy compatibility only.
## Summary
- Direct `>>` for cohesive intra-DAG flow.
- Assets for DAG-to-DAG coordination inside Airflow.
- AssetWatchers for outside-world triggers.
- Deferrable operators instead of synchronous sensors, always.
- TaskGroups for visual grouping within a DAG, not for cross-DAG dependencies.
## See also
- Event-driven DAGs guide — AssetWatcher recipes for common external event sources.
- DAG authoring guide — applying these choices in a real DAG.
- Production readiness — where "no sync sensors" lives in the checklist.