Module 13 of 16

Lineage with dbt Artifacts

Trace impact from source columns to models, metrics, dashboards, and AI answers.

120 minutes · 1 exercise · Free


Start here

Learning objectives

Explain table, column, metric, and operational lineage
Describe what the dbt manifest, run_results, and catalog artifacts contain
Use lineage to reason about blast radius

The Mental Model

Lineage is the map of how data flows. It helps you debug wrong numbers, assess change impact, and explain how a metric was produced.

Lineage is like a family tree for data. If one parent changes, you can see which children may be affected.
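The family-tree idea can be sketched as a graph walk. The minimal Python example below assumes a hand-written `child_map` dictionary shaped like the `child_map` section of dbt's manifest.json (a parent ID mapped to a list of child IDs). Every node ID, including the `revenue_dashboard` consumer, is hypothetical.

```python
from collections import deque

# Hypothetical parent -> children edges, shaped like manifest.json's "child_map".
child_map = {
    "source.shop.raw.raw_orders": ["model.shop.stg_orders"],
    "model.shop.stg_orders": ["model.shop.fct_orders"],
    "model.shop.fct_orders": ["exposure.shop.revenue_dashboard"],
    "exposure.shop.revenue_dashboard": [],
}


def descendants(child_map, start):
    """Return every node reachable downstream of `start`, in breadth-first order."""
    seen = []
    queue = deque(child_map.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another parent
        seen.append(node)
        queue.extend(child_map.get(node, []))
    return seen


affected = descendants(child_map, "source.shop.raw.raw_orders")
print(affected)
```

The returned list is the set of "children" that may be affected when the starting "parent" changes.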

Tiny Example

We will use a small ecommerce dataset throughout the course. Think of these as the only tables in your first warehouse:

Table	Grain	Example columns
`raw_orders`	one row per order event	`order_id`, `customer_id`, `amount`, `status`, `created_at`
`raw_order_items`	one row per item inside an order	`order_id`, `product_id`, `quantity`, `item_price`
`raw_customers`	one row per customer	`customer_id`, `email`, `country`, `created_at`

Interactive Check

Question: raw_orders.amount changes from dollars to cents. Which downstream objects might be impacted?

Answer

Any staging model that selects amount, any fact table that derives revenue from it, any revenue metric built on that fact table, and every dashboard or AI tool that consumes that metric.

Inline Practice Lab

This lab is intentionally small. You can solve it by reading the table, writing the SQL/YAML mentally, or pasting the snippet into any SQL scratchpad later.

-- Example starter table
select
  order_id,
  customer_id,
  amount,
  status,
  created_at
from raw_orders;

The goal is not tooling setup. The goal is learning the production habit: state the grain, clean one thing, test one assumption, and explain the downstream impact.
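"Test one assumption" might look like the following minimal sketch. It assumes Python's standard-library `sqlite3` module and an in-memory copy of `raw_orders` with invented sample rows. Because `raw_orders` holds one row per order event, `order_id` can repeat; the helper `duplicate_keys` (a hypothetical name) lists the keys that a one-row-per-order model would need to deduplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table raw_orders (order_id integer, customer_id integer, "
    "amount real, status text, created_at text)"
)
# Invented sample rows: order 1 has two events (placed, then shipped).
conn.executemany(
    "insert into raw_orders values (?, ?, ?, ?, ?)",
    [
        (1, 10, 25.0, "placed", "2024-01-01"),
        (1, 10, 25.0, "shipped", "2024-01-02"),
        (2, 11, 40.0, "placed", "2024-01-02"),
    ],
)


def duplicate_keys(conn, table, key):
    """Return key values that appear on more than one row of `table`."""
    # Table and column names are interpolated, so pass trusted identifiers only.
    sql = (
        f"select {key} from {table} "
        f"group by {key} having count(*) > 1 order by {key}"
    )
    return [row[0] for row in conn.execute(sql)]


dupes = duplicate_keys(conn, "raw_orders", "order_id")
print(dupes)
```

An empty list would mean the table is already at one row per key. Here order 1 repeats, which confirms that the grain is the order event, not the order.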

Self-Check Quiz

What is the grain of the table you are building?
Which downstream metric or dashboard would be wrong if this model broke?
What test would catch the most likely beginner mistake here?

Real world

Where this shows up

Reliable executive dashboards that do not disagree across teams
AI analytics agents that query governed metrics instead of guessing SQL
Auditable metric changes where owners can see downstream impact before merge

Production notes

Keep these close

Use lineage during code review. Ask "what downstream object changes if this column changes meaning?" before merge.

Common mistakes

What usually breaks

Treating lineage as just a pretty graph instead of using it to assess impact
Ignoring dashboards and metrics as lineage endpoints
Not capturing run status and freshness alongside structural lineage
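One way to avoid the last mistake is to join run status onto the structural graph. The sketch below assumes a trimmed dictionary shaped like the `results` list in dbt's run_results.json, where each entry carries a `unique_id` and a `status`. The node IDs and the `"not_run"` label are hypothetical, and freshness is left out of this sketch.

```python
# Hypothetical excerpt shaped like the "results" list in run_results.json.
run_results = {
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success"},
        {"unique_id": "model.shop.fct_orders", "status": "error"},
    ]
}

# Hypothetical downstream nodes, e.g. produced by walking the manifest graph.
downstream = [
    "model.shop.stg_orders",
    "model.shop.fct_orders",
    "exposure.shop.revenue_dashboard",
]


def annotate_with_status(nodes, run_results):
    """Pair each node with its status from the run, or "not_run" if absent."""
    status_by_id = {r["unique_id"]: r["status"] for r in run_results["results"]}
    return {node: status_by_id.get(node, "not_run") for node in nodes}


report = annotate_with_status(downstream, run_results)
for node, status in report.items():
    print(f"{status:<8} {node}")
```

A dashboard that sits downstream of a node with status `error` is structurally connected but operationally stale, which a graph alone would not show.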

Think like an engineer

Questions to answer before shipping

Can you explain the grain of this model in one sentence?
What breaks downstream if this field becomes null tomorrow?
Where should this logic live so it is reused instead of copied?

Key terms

Vocabulary used in this module

Lineage

Metadata describing how data flows from upstream inputs to downstream outputs.

Manifest

A dbt artifact (manifest.json) containing the project's dependency graph and resource metadata.
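To make the manifest's graph concrete, here is a minimal sketch that reads a heavily trimmed, hypothetical stand-in for manifest.json and inverts each node's `depends_on.nodes` list into parent-to-children edges. The project name `shop` and the helper `build_child_map` are assumptions for illustration; a real manifest is far larger and carries many more fields per node.

```python
import json

# Hypothetical, heavily trimmed stand-in for a dbt manifest.json file.
manifest_text = """
{
  "nodes": {
    "model.shop.stg_orders": {
      "resource_type": "model",
      "depends_on": {"nodes": ["source.shop.raw.raw_orders"]}
    },
    "model.shop.fct_orders": {
      "resource_type": "model",
      "depends_on": {"nodes": ["model.shop.stg_orders"]}
    }
  }
}
"""


def build_child_map(manifest):
    """Invert each node's depends_on list into parent -> sorted children."""
    children = {}
    for node_id, node in manifest["nodes"].items():
        for parent in node.get("depends_on", {}).get("nodes", []):
            children.setdefault(parent, []).append(node_id)
    return {parent: sorted(kids) for parent, kids in children.items()}


manifest = json.loads(manifest_text)
child_map = build_child_map(manifest)
print(child_map)
```

With a real project you would load the file that dbt writes to its target directory instead of the inline string.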

Exercises

Practice inside the lesson

30-45 minutes · Intermediate

Trace the Blast Radius

Follow one changed source column through models, metrics, and consumers.

Start at raw_orders.amount
Map it to stg_orders.order_amount
Map it to fct_orders.gross_revenue
Map it to net_revenue
List impacted dashboards and owners
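The steps above can be written down as a small column-level graph. This is a sketch under stated assumptions: the column edges are typed in by hand from the exercise steps, because this example does not assume the manifest records column-to-column dependencies. The `weekly_revenue` dashboard and its `finance-analytics` owner are hypothetical placeholders for your own answers.

```python
# Hand-written column edges taken from the exercise steps (an assumption,
# not something read from a dbt artifact).
column_edges = {
    "raw_orders.amount": ["stg_orders.order_amount"],
    "stg_orders.order_amount": ["fct_orders.gross_revenue"],
    "fct_orders.gross_revenue": ["metric:net_revenue"],
    # Hypothetical consumer; replace with the dashboards you identify.
    "metric:net_revenue": ["dashboard:weekly_revenue"],
}
# Hypothetical owner lookup.
owners = {"dashboard:weekly_revenue": "finance-analytics"}


def trace_paths(edges, start):
    """Return every path from `start` to a node with no further consumers."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        children = edges.get(path[-1], [])
        if not children:
            paths.append(path)  # reached an endpoint
        for child in children:
            if child not in path:  # skip edges that would form a cycle
                stack.append(path + [child])
    return paths


paths = trace_paths(column_edges, "raw_orders.amount")
for path in paths:
    print(" -> ".join(path), "| owner:", owners.get(path[-1], "unknown"))
```

Each printed path is one line of the blast-radius report: the full chain of hops, ending with the consumer and the person or team to notify.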

Recap

Key takeaways

Lineage makes data changes safer
dbt artifacts already contain useful dependency metadata
Column and metric lineage are more useful than table lineage alone
