Bytes-level lineage tells you that column A was copied into column B. Semantic lineage tells you whether column A and column B mean the same thing.
Why bytes lie
A dbt macro can rename, cast, coalesce, and coerce a field in a way that technically preserves its bytes but destroys its meaning. If your lineage graph only follows the bytes, it will happily tell you that a test-environment placeholder has flowed into a production KPI.
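To make the failure mode concrete, here is a minimal sketch of a column-level lineage walk. The column names and the edge list are hypothetical, invented for illustration; they do not come from any real project or lineage tool. The walk reports every upstream column faithfully, yet nothing in its output says whether the upstream field still means what the KPI assumes.

```python
from collections import defaultdict

# Hypothetical column-level edges: (source column, target column).
# A real lineage tool would derive these from compiled SQL.
COLUMN_EDGES = [
    ("staging.orders.amount_placeholder", "intermediate.orders.amount"),
    ("intermediate.orders.amount", "marts.finance.net_revenue"),
]


def upstream_columns(target, edges):
    """Return every column that feeds `target`, nearest ancestor first."""
    parents = defaultdict(list)
    for src, dst in edges:
        parents[dst].append(src)

    seen, stack = [], [target]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


# The trail is complete, but it carries no information about meaning:
# the placeholder column looks like any other valid ancestor.
trail = upstream_columns("marts.finance.net_revenue", COLUMN_EDGES)
print(trail)
```

The graph is correct about where the values came from. What it cannot express is that the staging column was never meant to represent revenue.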
What we track instead
- Concept identity. Each governed field carries a concept ID, a durable pointer to the business concept it represents.
- Transform intent. Each transform declares what it is doing to the concept: filtering, re-expressing, aggregating, deriving.
- Consumption surface. Metrics and dashboards register which concepts they depend on, not which columns.
Put those three together and you get a lineage graph that survives a rename, a refactor, or a warehouse migration.
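The three pieces above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not a description of any particular tool: the concept IDs, column names, dashboard name, and helper functions are all hypothetical. The point it tries to show is that impact analysis resolves through concept IDs, so renaming a physical column does not change the answer.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """Transform intent: what a transform does to the concept."""
    FILTER = "filter"
    RE_EXPRESS = "re-express"
    AGGREGATE = "aggregate"
    DERIVE = "derive"


@dataclass(frozen=True)
class Transform:
    source_concept: str
    target_concept: str
    intent: Intent


# Concept identity: physical column -> durable concept ID (hypothetical names).
FIELD_CONCEPTS = {"warehouse.orders.amt": "concept:order_amount"}

# Transform intent: edges between concepts, not between columns.
TRANSFORMS = [
    Transform("concept:order_amount", "concept:net_revenue", Intent.AGGREGATE),
]

# Consumption surface: metrics and dashboards register concepts, not columns.
METRIC_DEPENDENCIES = {"dashboard.monthly_revenue": {"concept:net_revenue"}}


def metrics_affected_by(column):
    """List the metrics that depend, directly or transitively, on a column."""
    reachable = {FIELD_CONCEPTS[column]}
    changed = True
    while changed:  # expand until no new downstream concept is found
        changed = False
        for t in TRANSFORMS:
            if t.source_concept in reachable and t.target_concept not in reachable:
                reachable.add(t.target_concept)
                changed = True
    return sorted(m for m, deps in METRIC_DEPENDENCIES.items() if deps & reachable)


def rename_column(old, new):
    """A rename only moves the pointer; the concept ID is untouched."""
    FIELD_CONCEPTS[new] = FIELD_CONCEPTS.pop(old)


before = metrics_affected_by("warehouse.orders.amt")
rename_column("warehouse.orders.amt", "warehouse.orders.order_amount_usd")
after = metrics_affected_by("warehouse.orders.order_amount_usd")
print(before == after)
```

Only the column-to-concept mapping has to change when a field is renamed or moved to another warehouse. The transform edges and the metric registrations are written against concept IDs, so they stay valid.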
Tags: Lineage, dbt, Semantic
Rafael Torres
Staff Engineer
Data professional with expertise in analytics, governance, and data platform architecture.
