Module 3: The dbt Mental Model

Understand sources, refs, models, DAGs, and materializations without setup friction.

90 minutes. 1 inline exercise. Free course module.

Learning Objectives

  • Explain how dbt compiles SQL models
  • Read a dbt DAG as a dependency graph
  • Know when a model should be a view, table, or incremental model

Why This Matters

dbt lets analytics engineers build data transformations as version-controlled SQL files. dbt derives the dependency graph from the source declarations and ref calls in those files, so build order is never maintained by hand.

Figure: Architecture diagram for Module 3, The dbt Mental Model. Follow the arrows; each box is one idea you will practice in this module: Source (step 1) → ref() (step 2) → Model (step 3) → DAG (step 4) → Build (step 5). Production analytics engineering turns raw records into governed, trusted business meaning.

Lesson Content

The Mental Model

Think of each dbt model as a recipe. ref() means "use the output of another recipe." dbt reads the recipes and decides the safe build order.
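The recipe idea can be sketched as a single model file. This is a minimal illustration, not a prescribed layout: the file name is made up, and stg_customers is a hypothetical staging model that this module does not define.

```sql
-- models/fct_orders.sql  (illustrative file name)
-- Each ref() tells dbt "use the output of that model", which also means
-- "build that model before this one".
select
  o.order_id,
  o.customer_id,
  c.country,
  o.amount,
  o.status,
  o.created_at
from {{ ref('stg_orders') }} as o
left join {{ ref('stg_customers') }} as c  -- stg_customers is assumed to exist
  on o.customer_id = c.customer_id
```

With two ref() calls, this model has two upstream dependencies, so dbt would schedule both staging models ahead of it.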

Tiny Example

We will use a small ecommerce dataset throughout the course. Think of these as the only tables in your first warehouse:

Table           | Grain                            | Example columns
raw_orders      | one row per order event          | order_id, customer_id, amount, status, created_at
raw_order_items | one row per item inside an order | order_id, product_id, quantity, item_price
raw_customers   | one row per customer             | customer_id, email, country, created_at

Interactive Check

Question: If fct_orders uses ref("stg_orders"), which model must build first?

Answer: stg_orders must build first. The ref call creates a dependency edge from fct_orders back to stg_orders.
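It can help to see what compiling does to that ref call. The sketch below shows a model as written and roughly what dbt turns it into. The database and schema names (analytics, dbt_dev) are invented for illustration; the real values come from your project's connection settings, and the exact quoting depends on the warehouse.

```sql
-- As written in the model file:
select order_id, customer_id, amount
from {{ ref('stg_orders') }}

-- Roughly what dbt compiles it to (analytics.dbt_dev is an assumed location):
select order_id, customer_id, amount
from analytics.dbt_dev.stg_orders
```

The ref call does two jobs at once: it is replaced with a concrete relation name at compile time, and it records the edge that forces stg_orders to build first.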

Inline Practice Lab

This lab is intentionally small. You can solve it by reading the table, writing the SQL/YAML mentally, or pasting the snippet into any SQL scratchpad later.

-- Example starter table
select
  order_id,
  customer_id,
  amount,
  status,
  created_at
from raw_orders;

The goal is not tooling setup. The goal is learning the production habit: state the grain, clean one thing, test one assumption, and explain the downstream impact.
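As one possible pass at "state the grain, clean one thing", the query below keeps the grain of raw_orders and normalizes a single column. It assumes status arrives with inconsistent casing or stray whitespace, which this module does not actually state.

```sql
-- Grain: one row per order event (unchanged from raw_orders).
select
  order_id,
  customer_id,
  amount,
  lower(trim(status)) as status,  -- the one cleanup: normalize casing and whitespace
  created_at
from raw_orders;
```

Because it is plain SQL with no dbt syntax, this version can be pasted into a SQL scratchpad as-is.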

Self-Check Quiz

  1. What is the grain of the table you are building?
  2. Which downstream metric or dashboard would be wrong if this model broke?
  3. What test would catch the most likely beginner mistake here?
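For question 3, one candidate is a grain check. A likely beginner mistake is to treat raw_orders as one row per order, when the table above describes it as one row per order event. The query below is a sketch of that check in plain SQL.

```sql
-- Returns every order_id that appears on more than one row.
-- Zero rows back means order_id alone identifies a row; any rows back mean
-- the table is finer-grained than "one row per order".
select
  order_id,
  count(*) as row_count
from raw_orders
group by order_id
having count(*) > 1;
```

dbt's singular tests follow the same convention: a test is a query that selects the failing rows, and it passes when the query returns nothing.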

Real-World Use Cases

  • Reliable executive dashboards that do not disagree across teams
  • AI analytics agents that query governed metrics instead of guessing SQL
  • Auditable metric changes where owners can see downstream impact before merge

Production Notes

  • Review DAG shape in pull requests. A messy graph usually predicts ownership and debugging pain.

Common Mistakes

  • Using raw tables directly in marts
  • Hardcoding schema names instead of using ref/source
  • Creating circular model dependencies
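The second mistake is easiest to see side by side. In this sketch, prod_db.raw is an invented hardcoded location and 'ecommerce' is an assumed source name that would have to be declared in the project's YAML files.

```sql
-- Fragile: a hardcoded location. dbt cannot see this dependency, and the
-- query breaks if the schema is renamed or differs between environments.
--   select order_id, customer_id, amount, status, created_at
--   from prod_db.raw.raw_orders

-- Preferred in a staging model: let dbt resolve the raw table.
select
  order_id,
  customer_id,
  amount,
  status,
  created_at
from {{ source('ecommerce', 'raw_orders') }}  -- 'ecommerce' is an assumed source name
```

Marts would then read the staging model through ref() rather than touching the raw table, which also avoids the first mistake in the list.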

Think Like an Engineer

  • Can you explain the grain of this model in one sentence?
  • What breaks downstream if this field becomes null tomorrow?
  • Where should this logic live so it is reused instead of copied?

Career Relevance

Analytics engineering is the bridge between SQL skill and production data ownership. Freshers who learn tests, lineage, metrics, and semantic modeling early stand out because they can reason about trust, not just queries.

Key Terms

DAG
Directed acyclic graph; a dependency graph with no circular dependencies.
Materialization
How dbt persists a model in the warehouse, for example as a view, a table, or an incremental model.
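As a rough guide, a view stores only the query and reruns it on every read, a table is rebuilt in full on each run, and an incremental model processes only new rows after its first build. The sketch below shows where that choice is made inside a model file. The model name fct_order_events is hypothetical, and filtering on created_at assumes new rows always carry a later timestamp than rows already loaded.

```sql
-- models/fct_order_events.sql  (hypothetical model name)
-- Swap 'incremental' for 'view' or 'table' to change how dbt persists the result.
{{ config(materialized='incremental') }}

select
  order_id,
  customer_id,
  amount,
  status,
  created_at
from {{ ref('stg_orders') }}

{% if is_incremental() %}
  -- Applied only when the target table already exists and the run is not a
  -- full refresh: keep just the rows newer than what is already loaded.
  where created_at > (select max(created_at) from {{ this }})
{% endif %}
```

On the first run the filter is skipped and the whole result is loaded; later runs append only the newer rows.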

Inline Exercises

  1. Order the dbt DAG

    Put shuffled dbt models into the correct build order.

    30-45 minutes - Beginner

    • Start with sources
    • Place staging models next
    • Place intermediate joins after staging
    • Place marts last
    • Explain why dashboards should read marts, not raw sources

    Inline lab: complete the exercise directly in the course page.
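To make the layers concrete, here is a sketch of what an intermediate model could look like. The names int_order_totals and stg_order_items are hypothetical; only the columns come from the raw_order_items table described earlier.

```sql
-- models/intermediate/int_order_totals.sql  (hypothetical name and path)
-- Reads a staging model, so it sits after staging and before marts.
select
  order_id,
  sum(quantity * item_price) as items_total
from {{ ref('stg_order_items') }}  -- assumed staging model over raw_order_items
group by order_id
```

A mart that refs this model would land one step later in the build order, which is the pattern the exercise asks you to reconstruct.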

Key Takeaways

  • dbt is SQL plus dependency management, tests, docs, and deployment discipline
  • ref() creates maintainable dependencies
  • The DAG is your first lineage map