Production Analytics Engineering with dbt: Metrics, Semantic Layers & Lineage

Learn analytics engineering from scratch: dbt models, table grain, staging, marts, tests, freshness, metrics, semantic layers, MetricFlow, lineage, CI/CD...

What You Will Learn

A beginner-friendly production analytics engineering course for freshers. You will learn how modern data teams transform raw warehouse tables into tested dbt models, governed metrics, semantic layer definitions, and lineage-aware data products. The course uses inline SQL/YAML exercises, diagrams, quizzes, and revealable answers so learners can practice without a GitHub repository or any local setup.

16 modules, 16 inline exercises, 28+ hours, Beginner to Intermediate, 100% free.

  • Freshers who know basic SQL and want to enter data engineering or analytics engineering
  • Backend engineers moving toward data platform work
  • Data analysts who want software-engineering discipline with dbt
  • Students who get confused by warehouse, dbt, metrics, and semantic layer terminology
  • Junior data engineers who want to build trustworthy models, not just pipelines
  • AI builders who need governed data and metrics before using LLMs over warehouse data

Full Curriculum

  1. Module 1: What Analytics Engineering Actually Is

    Understand the job: turn raw tables into trusted business meaning. 75 minutes. 1 inline exercise.

    • Explain analytics engineering in beginner-friendly language
    • Separate data engineering, analytics engineering, and BI work
    • Understand why trust matters more than query cleverness
  2. Module 2: Tables, Grain, and Why Dashboards Lie

    Learn the most important beginner concept: one row per what? 90 minutes. 1 inline exercise.

    • Define table grain accurately
    • Spot double-counting bugs before they reach dashboards
    • Understand facts, dimensions, and event tables
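The double-counting bug this module covers fits in a few lines of SQL. A minimal sketch, assuming two hypothetical tables: `orders` (one row per order) and `order_items` (one row per item):

```sql
-- Joining to items and summing an order-level column changes the grain
-- from "one row per order" to "one row per order item", so order_total
-- is counted once per item.
select
    o.order_id,
    sum(o.order_total) as inflated_total   -- double-counted!
from orders o
join order_items oi on oi.order_id = o.order_id
group by o.order_id;

-- Safe version: aggregate items to the order grain first, then join.
select
    o.order_id,
    o.order_total,
    agg.item_count
from orders o
join (
    select order_id, count(*) as item_count
    from order_items
    group by order_id
) agg on agg.order_id = o.order_id;
```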
  3. Module 3: The dbt Mental Model

    Understand sources, refs, models, DAGs, and materializations without setup friction. 90 minutes. 1 inline exercise.

    • Explain how dbt compiles SQL models
    • Read a dbt DAG as a dependency graph
    • Know when a model should be a view, table, or incremental model
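A minimal sketch of what a dbt model looks like, assuming a hypothetical staging model named `stg_orders`. dbt compiles the `ref()` call into the concrete relation name and records the dependency as an edge in the DAG:

```sql
-- models/marts/fct_orders.sql (illustrative file name)
{{ config(materialized='table') }}

select
    order_id,
    customer_id,
    order_total
from {{ ref('stg_orders') }}
where status != 'cancelled'
```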
  4. Module 4: Staging Models

    Clean source data gently: rename, cast, standardize, and expose a stable base layer. 100 minutes. 1 inline exercise.

    • Build staging models that stay close to the source
    • Apply safe renaming and type casting
    • Avoid burying business logic too early
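A sketch of the staging pattern, assuming a hypothetical raw source `shop` with an `orders` table: only rename, cast, and standardize, with no joins and no business logic yet.

```sql
-- models/staging/stg_orders.sql (illustrative)
select
    id                            as order_id,
    user_id                       as customer_id,
    cast(amount as numeric)       as order_total,
    lower(status)                 as status,
    cast(created_at as timestamp) as ordered_at
from {{ source('shop', 'orders') }}
```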
  5. Module 5: Intermediate Models

    Build reusable transformation steps without exposing half-finished business tables. 95 minutes. 1 inline exercise.

    • Know when to create an intermediate model
    • Separate reusable logic from final reporting shape
    • Reduce duplication across marts
  6. Module 6: Marts: Facts and Dimensions

    Create the business-facing layer: facts, dimensions, and star schemas. 110 minutes. 1 inline exercise.

    • Design simple fact and dimension tables
    • Understand star schema basics
    • Choose the right mart grain for reporting
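A dimension-table sketch at the customer grain, assuming hypothetical staging models `stg_customers` and `stg_orders`. Keeping one row per customer lets every fact table join to it without changing its own grain:

```sql
-- models/marts/dim_customers.sql (illustrative)
select
    c.customer_id,
    c.first_name,
    c.signup_date,
    min(o.ordered_at) as first_order_at
from {{ ref('stg_customers') }} c
left join {{ ref('stg_orders') }} o using (customer_id)
group by 1, 2, 3
```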
  7. Module 7: Testing and Data Quality

    Use tests to catch broken assumptions before users lose trust. 110 minutes. 1 inline exercise.

    • Use not_null, unique, relationships, and accepted_values tests
    • Write testable assumptions in model YAML
    • Connect data quality to user trust
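The four built-in tests above are declared in model YAML. An illustrative schema file, assuming hypothetical models `fct_orders` and `dim_customers`:

```yaml
# models/marts/schema.yml (illustrative)
version: 2

models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - not_null
          - unique                 # grain check: one row per order
      - name: customer_id
        tests:
          - relationships:         # every order maps to a known customer
              to: ref('dim_customers')
              field: customer_id
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```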
  8. Module 8: Freshness, Contracts, and Documentation

    Make data understandable, current, and safe to change. 95 minutes. 1 inline exercise.

    • Explain source freshness and data SLAs
    • Document models and columns clearly
    • Understand model contracts and ownership
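Source freshness is also declared in YAML. A sketch assuming a hypothetical `shop` source with a `_loaded_at` load-timestamp column; dbt checks the thresholds with `dbt source freshness`:

```yaml
# models/staging/sources.yml (illustrative)
version: 2

sources:
  - name: shop
    loaded_at_field: _loaded_at        # assumed load-timestamp column
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
        description: "One row per order placed in the shop."
```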
  9. Module 9: Incremental Models and Backfills

    Scale transformations without losing correctness when old data changes. 120 minutes. 1 inline exercise.

    • Understand full refresh vs incremental builds
    • Handle late-arriving data
    • Reason about backfills and idempotency
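An incremental-model sketch, assuming a hypothetical `stg_events` staging model. The three-day lookback window is illustrative (it absorbs late-arriving rows), and interval syntax varies by warehouse:

```sql
-- models/marts/fct_events.sql (illustrative)
{{ config(materialized='incremental', unique_key='event_id') }}

select event_id, user_id, event_type, occurred_at
from {{ ref('stg_events') }}
{% if is_incremental() %}
  -- Only process new or late-arriving rows; unique_key lets dbt replace
  -- earlier versions of the same event, keeping re-runs idempotent.
  where occurred_at > (select max(occurred_at) - interval '3 days' from {{ this }})
{% endif %}
```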
  10. Module 10: Metrics as Product APIs

    Treat revenue, active users, retention, and conversion as governed interfaces. 105 minutes. 1 inline exercise.

    • Define a production metric specification
    • Separate measures from metrics
    • Understand why metrics need owners and change policies
  11. Module 11: Semantic Layer Fundamentals

    Learn entities, measures, dimensions, and the semantic graph. 110 minutes. 1 inline exercise.

    • Explain the purpose of a semantic layer
    • Map business questions to semantic objects
    • Understand how semantic layers protect consistency
  12. Module 12: MetricFlow and the dbt Semantic Layer

    See how dbt semantic models produce governed SQL at query time. 115 minutes. 1 inline exercise.

    • Understand semantic model YAML at a high level
    • Know what MetricFlow does
    • Explain how governed metrics can serve BI, apps, and AI
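At a high level, a dbt semantic model plus a metric looks like this. A sketch assuming a hypothetical `fct_orders` mart:

```yaml
# models/semantic/orders.yml (illustrative)
semantic_models:
  - name: orders
    model: ref('fct_orders')
    defaults:
      agg_time_dimension: ordered_at
    entities:
      - name: order
        type: primary
        expr: order_id
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          grain: day
    measures:
      - name: order_total
        agg: sum

metrics:
  - name: revenue
    label: Revenue
    description: "Sum of order totals."
    type: simple
    type_params:
      measure: order_total
```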
  13. Module 13: Lineage with dbt Artifacts

    Trace impact from source columns to models, metrics, dashboards, and AI answers. 120 minutes. 1 inline exercise.

    • Explain table, column, metric, and operational lineage
    • Know what dbt manifest, run_results, and catalog artifacts contain
    • Use lineage to reason about blast radius
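dbt's own CLI can answer blast-radius questions from the compiled manifest. An illustrative selector, assuming a hypothetical model `fct_orders`:

```sh
# List fct_orders plus everything upstream (leading +) and downstream
# (trailing +) of it, using the DAG stored in the manifest.
dbt ls --select +fct_orders+
```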
  14. Module 14: Data Incidents and Debugging

    Debug wrong revenue, stale data, broken joins, and schema drift like an engineer. 110 minutes. 1 inline exercise.

    • Classify common data incidents
    • Use tests and lineage during debugging
    • Write a useful data incident review
  15. Module 15: CI/CD for Analytics Engineering

    Prevent broken models and metric changes from reaching production silently. 105 minutes. 1 inline exercise.

    • Understand analytics CI checks
    • Use slim CI thinking for changed models
    • Design review rules for metric and semantic changes
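Slim CI is, in dbt terms, a state comparison. An illustrative command, assuming production artifacts have been downloaded to a hypothetical `prod-artifacts/` directory:

```sh
# Build only models changed relative to production, plus everything
# downstream of them, deferring unchanged parents to production.
dbt build --select state:modified+ --defer --state prod-artifacts/
```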
  16. Module 16: Capstone: Build a Trusted Analytics Layer

    Design the full flow from raw ecommerce tables to governed metrics and lineage. 120 minutes. 1 inline exercise.

    • Design an end-to-end analytics layer
    • Apply dbt, tests, metrics, semantic modeling, and lineage together
    • Produce a portfolio-ready architecture explanation

Course Topics

Analytics Engineering, dbt, Semantic Layer, MetricFlow, Metrics Layer, Data Lineage, Data Quality, Data Modeling, SQL, Data Engineering, Data Contracts, Data Observability, CI/CD, AI Analytics, Business Intelligence

Instructor

Vishal Anand

Senior Product Engineer & Tech Lead

Creator of DRF API Logger and author of production-focused CodersSecret courses. Vishal teaches engineering through concrete systems, diagrams, operational failures, and practical tradeoffs.

  • Creator of DRF API Logger, used across production Django systems
  • Author of free CodersSecret courses on security, distributed systems, and production AI
  • Writes practical engineering guides for backend, DevOps, security, and data systems
  • Focuses on beginner-friendly explanations without hiding production realities

Frequently Asked Questions

Is this course beginner-friendly?

Yes. It starts with tables, grain, and simple SQL mental models before introducing dbt, semantic layers, and lineage. Every module has a small interactive exercise and answer reveal.

Do I need a GitHub repository or local setup?

No. The first version uses inline labs inside the course pages. Optional downloadable datasets or a starter dbt project can be added later, but the course is useful without setup.

Is this only a dbt course?

No. dbt is the transformation tool used for examples, but the course is about production analytics engineering: modeling, quality, metrics, semantic layers, lineage, CI/CD, and data trust.

Will this help with data engineering roles?

Yes. It teaches the analytics engineering side of data engineering: warehouse modeling, transformation quality, metric governance, and lineage. It pairs well with a future lakehouse or streaming course.

Why include semantic layers and metrics?

Modern BI and AI analytics need governed definitions. Without a semantic layer or metrics layer, every dashboard or AI query can calculate business terms differently.

What should I know before starting?

Basic SQL helps, but the course explains the data modeling concepts slowly. You do not need prior dbt, Airflow, Spark, or warehouse experience.