The Mental Model
Lineage is the map of how data flows. It helps you debug wrong numbers, assess change impact, and explain how a metric was produced.
Lineage is like a family tree for data. If one parent changes, you can see which children may be affected.
Tiny Example
We will use a small ecommerce dataset throughout the course. Think of these as the only tables in your first warehouse:
| Table | Grain | Example columns |
|---|---|---|
raw_orders | one row per order event | order_id, customer_id, amount, status, created_at |
raw_order_items | one row per item inside an order | order_id, product_id, quantity, item_price |
raw_customers | one row per customer | customer_id, email, country, created_at |
Interactive Check
Question: raw_orders.amount changes from dollars to cents. Which downstream objects might be impacted?
Reveal the answer
Any staging model using amount, any fact table deriving revenue, any revenue metric, and all dashboards or AI tools consuming that metric.
Inline Practice Lab
This lab is intentionally small. You can solve it by reading the table, writing the SQL/YAML mentally, or pasting the snippet into any SQL scratchpad later.
-- Example starter table
select
order_id,
customer_id,
amount,
status,
created_at
from raw_orders;
The goal is not tooling setup. The goal is learning the production habit: state the grain, clean one thing, test one assumption, and explain the downstream impact.
Self-Check Quiz
- What is the grain of the table you are building?
- Which downstream metric or dashboard would be wrong if this model broke?
- What test would catch the most likely beginner mistake here?