This guide walks you from a clean install to a DAG running in a local Airflow environment managed by the Astro CLI. It assumes you can read Python and have used a shell, but not that you have ever authored an Airflow DAG.

What you need

Docker (or a compatible container runtime) running locally, and the Astro CLI installed. Verify the CLI:

astro version

1. Create a project

mkdir my_airflow && cd my_airflow
astro dev init

Astro scaffolds:

my_airflow/
  dags/                      ← your DAG files go here
    exampledag.py
  include/                   ← shared code (helpers, SQL files)
  plugins/                   ← custom operators/hooks (rarely needed)
  tests/
    dags/test_dag_example.py
  requirements.txt            ← Python deps
  packages.txt                ← system packages (apt-get)
  Dockerfile                  ← Airflow runtime image
  airflow_settings.yaml       ← connections / variables for local dev

Note

The Astro project structure is the same locally and on Astro Cloud. What you test against locally is what runs in production: same image, same providers, same Python deps.
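Dependencies follow the same rule: a provider added to requirements.txt locally is exactly what ships to production. (The package below is just an illustration; add whatever providers your DAGs need.)

```
# requirements.txt
apache-airflow-providers-http   # pin a version in real projects
```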

2. Write your first DAG

Replace dags/exampledag.py with:

from __future__ import annotations
import pendulum
from airflow import DAG
from airflow.decorators import task

with DAG(
    dag_id="hello_causeway",
    start_date=pendulum.datetime(2026, 1, 1, tz="UTC"),
    schedule="@hourly",
    catchup=False,
    tags=["example", "quickstart"],
) as dag:

    @task
    def greet(name: str) -> str:
        return f"hello, {name}"

    @task
    def shout(message: str) -> None:
        print(message.upper())

    shout(greet("causeway"))

Three things are load-bearing in that snippet:

  1. start_date — a fixed, timezone-aware pendulum datetime. Write it once; never derive it from the current time.

  2. schedule="@hourly" — how often the scheduler creates runs. Cron expressions work here too.

  3. catchup=False — stops Airflow from replaying every interval between start_date and now when the DAG is unpaused.

Warning

catchup=True is Airflow's historical default and shows up in many tutorials, but it is almost always wrong in production. The first time you pause a DAG for maintenance, you will find out why: every missed interval replays at once when you unpause. Always set catchup=False, and handle intentional backfills explicitly with airflow dags backfill.
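To see the blast radius, here is a back-of-the-envelope sketch in plain Python (names hypothetical; this is arithmetic, not Airflow's scheduler logic):

```python
from datetime import datetime, timedelta, timezone

def missed_intervals(paused_at: datetime, resumed_at: datetime,
                     every: timedelta) -> int:
    """Count schedule intervals that elapsed while a DAG was paused."""
    return max(0, int((resumed_at - paused_at) / every))

# Pause an @hourly DAG for two days: with catchup=True, all of these
# runs are queued the moment you unpause.
paused = datetime(2026, 1, 1, tzinfo=timezone.utc)
resumed = datetime(2026, 1, 3, tzinfo=timezone.utc)
print(missed_intervals(paused, resumed, timedelta(hours=1)))  # → 48
```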

3. Start Airflow locally

astro dev start

Astro launches the webserver, scheduler, triggerer, and a Postgres metadata database in Docker. First run takes 60 to 90 seconds.

Open http://localhost:8080. Username admin, password admin. You should see your DAG listed.

4. Trigger a run

In the UI, toggle hello_causeway on, then click the trigger button (▶). Click into the DAG → Grid view to watch both tasks turn green.

From the CLI:

astro dev run dags trigger hello_causeway
astro dev run tasks states-for-dag-run hello_causeway <run_id>

5. Read the logs

Click into shout → Logs. You see HELLO, CAUSEWAY printed by the task. The log also shows the XCom pull from greet and the task's start/end timestamps.
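Between greet and shout, Airflow stores greet's return value as an XCom and hands it to shout at run time. A plain-Python sketch of that store-and-fetch (not the real XCom backend, which persists to the metadata database):

```python
import json

def greet(name: str) -> str:
    return f"hello, {name}"

def shout(message: str) -> None:
    print(message.upper())

# The default XCom backend JSON-serializes return values into the
# metadata DB, then deserializes them for the downstream task.
stored = json.dumps(greet("causeway"))   # "push"
shout(json.loads(stored))                # "pull": prints HELLO, CAUSEWAY
```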

6. Iterate

Edit the DAG. The scheduler re-parses DAG files every 30 seconds by default (min_file_process_interval), and Astro's local environment picks up the change on the next parse. No restart needed.
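If the default parse interval is too slow for your edit loop, you can lower it through the project's .env file, which astro dev start loads into the containers (the variable name is the standard Airflow config env override):

```
# .env
AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL=5
```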

Run the test suite:

astro dev pytest

The scaffolded test_dag_example.py checks that every DAG parses without errors. Extend it with your own tests before shipping.

7. Stop Airflow

astro dev stop          # stop containers, keep state
astro dev kill          # stop and wipe state

What just happened

You:

  1. scaffolded an Astro project with astro dev init,

  2. wrote a TaskFlow DAG with two dependent tasks,

  3. started a full local Airflow in Docker and triggered a run from the UI and CLI,

  4. read the task logs, and

  5. iterated on the DAG without restarting anything.

The same project layout deploys to Astro Cloud via astro deploy. See the DAG authoring guide for the rules that apply from your first real DAG onward.

What to resist

Danger

Three anti-patterns to kill on sight:

  1. Top-level code in DAG files. Anything that runs at parse time (API calls, DB queries, file reads) runs every time the scheduler re-parses the DAG, which is constantly. Keep DAG definition files pure Python; do work inside tasks.

  2. datetime.now() in DAG definitions. datetime.now() evaluates at parse time and shifts every parse. Use pendulum fixed dates.

  3. Business logic inside the task callable. Airflow is a supervisor, not a compute engine. If your task processes 5 GB inside the worker, you have built an ETL engine out of a scheduler. See the supervisor model.
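A minimal illustration of pattern 1, in plain Python with hypothetical names (the commented-out fetch stands in for any parse-time work):

```python
# BAD: module-level work executes on every scheduler parse of this file.
# config = requests.get("https://config.example.invalid/settings").json()

def load_config() -> dict:
    """Expensive work lives in a function, executed only at task run time."""
    # a real API call or DB query would go here
    return {"retries": 3}

def my_task() -> int:
    # the fetch happens here, inside the task, not at import time
    return load_config()["retries"]

print(my_task())  # → 3
```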

Next steps