
Module 3: Event-Driven & Asynchronous Systems Slides

Slide walkthrough for Module 3 of Distributed Systems Engineering: Building Scalable, Reliable & Secure Systems: How Kafka, RabbitMQ, NATS, and pub/sub...

This slide page is the visual review companion for the full course module. Use it to recap the architecture, examples, exercises, production warnings, and takeaways after reading the lesson.

Slide Outline

  1. Event-Driven & Asynchronous Systems - How Kafka, RabbitMQ, NATS, and pub/sub patterns let services decouple in time and scale — and the failure modes that come with them.
  2. Learning Objectives - 5 outcomes for this module
  3. Why This Module Matters - Async event pipelines are how every modern company scales beyond the synchronous-RPC limits of microservices.
  4. Before vs After - The operational shift this module teaches
  5. Queue vs Pub/Sub vs Event Streaming - Pick One Deliberately - Lesson section from the full module
  6. Kafka in Production - Lesson section from the full module
  7. Delivery Guarantees - What Exactly-Once Actually Means - Lesson section from the full module
  8. Backpressure - Lesson section from the full module
  9. Common Production Failures - Lesson section from the full module
  10. Backpressure Propagation - Lesson section from the full module
  11. Self-Check Quiz - Lesson section from the full module
  12. Real-World Use Cases - LinkedIn (Kafka's birthplace) processes trillions of events per day across thousands of topics; Slack uses Kafka to fan out every message event to many internal consumers (search indexing, push notifications, analytics).
  13. Common Mistakes to Avoid - 3 mistakes covered
  14. Production Notes - 4 practical notes
  15. Security Risks to Watch - 4 risks covered
  16. Hands-On Labs - 3 hands-on labs
  17. Key Takeaways - 5 points to remember

Learning Objectives

  • Choose between message queues, pub/sub, and event streaming for a given workload
  • Reason about partitioning, ordering, and consumer groups in Kafka
  • Implement backpressure correctly so producers do not melt consumers
  • Design exactly-once semantics where you actually need them — and at-least-once where you do not
  • Diagnose the canonical event-pipeline outages: lag spikes, rebalances, and stuck consumers

Why This Module Matters

Async event pipelines are how many companies scale beyond the synchronous-RPC limits of microservices. Teams that get event streams right ship features faster (3x, by this module's estimate) because producers and consumers evolve independently with no tight coupling, and they survive failures better because decoupling in time means a slow downstream service does not block a fast upstream one. Teams that get them wrong end up with stuck consumers, lost messages, exactly-once theatre, and data loss they discover only during a regulatory audit.

Production Notes

  • Always bound queues. Unbounded in-memory queues are delayed OOMs.
  • Default to at-least-once + consumer-side idempotency. Reach for exactly-once only when you genuinely cannot make consumers idempotent.
  • Watch consumer lag as a first-class metric. A lag spike is usually the earliest visible signal of an event-pipeline incident.
  • Tune Kafka session timeouts and heartbeat intervals carefully — too aggressive triggers rebalance storms.
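
The first note above, bounding queues, can be illustrated with a minimal sketch. It assumes an in-process pipeline built on Python's standard `queue.Queue`; the function name `run_pipeline` and its parameters are hypothetical and are not taken from the course labs. A full bounded queue makes `put()` block, which is the simplest form of backpressure: the producer is slowed to the consumer's pace instead of memory growing without limit.

```python
import queue
import threading
import time


def run_pipeline(num_items: int, max_queue_size: int, consumer_delay: float = 0.001):
    """Run a fast producer against a slow consumer through a bounded queue.

    Hypothetical helper for illustration only.
    Returns (items_consumed, peak_queue_depth_seen_by_producer).
    """
    q = queue.Queue(maxsize=max_queue_size)  # bounded: put() blocks when full
    sentinel = object()                      # tells the consumer to stop
    consumed = []

    def consumer():
        while True:
            item = q.get()
            if item is sentinel:
                break
            time.sleep(consumer_delay)       # simulate slow downstream work
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    peak_depth = 0
    for i in range(num_items):
        q.put(i)                             # blocks here while the queue is full
        peak_depth = max(peak_depth, q.qsize())
    q.put(sentinel)
    worker.join()
    return len(consumed), peak_depth
```

Calling `run_pipeline(num_items=100, max_queue_size=5)` should consume all 100 items while the observed queue depth never exceeds 5. Blocking is only one policy; dropping or rejecting work when the queue is full are alternatives, and which one fits depends on the workload.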

Common Mistakes

  • Choosing exactly-once because it sounds safer, while ignoring its operational cost and accepting a false sense of security.
  • Single-partition topics for “simplicity”; they cap consumer parallelism at 1 and become bottlenecks.
  • Ignoring partition keys; uniform random keys feel safe but scatter each entity's events across partitions, which breaks per-entity ordering.
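
The partition-key mistake can be made concrete with a small sketch. It is a simulation, not Kafka client code: `zlib.crc32` stands in for the partitioner's hash (Kafka's own default partitioner uses a different hash function), and the helper names and the event shape are assumptions made for illustration. The point is only that a stable key maps every event of one entity to one partition, while random keys spread the same entity across many.

```python
import random
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (assumption): crc32 is stable across processes,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partitions_touched(events, num_partitions, key_fn):
    """Map each user_id to the set of partitions its events land on."""
    touched = defaultdict(set)
    for event in events:
        partition = partition_for(key_fn(event), num_partitions)
        touched[event["user_id"]].add(partition)
    return dict(touched)


# Hypothetical workload: 3 users, 50 ordered events each.
events = [{"user_id": f"user-{u}", "seq": s} for u in range(3) for s in range(50)]

# Keyed by entity: every event for a user goes to the same partition.
by_entity = partitions_touched(events, 6, key_fn=lambda e: e["user_id"])

# Keyed randomly: one user's events are spread over several partitions,
# so no single consumer sees them in order.
rng = random.Random(42)
by_random = partitions_touched(events, 6, key_fn=lambda e: str(rng.random()))
```

With entity keys, each user maps to exactly one partition, so ordering within that partition is ordering for the user. With random keys, ordering across a user's events is no longer guaranteed because they are consumed from different partitions.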

Key Takeaways

  • Async decouples in time, not in contract — the contract still has to be designed carefully
  • Pick queue / pub-sub / event-streaming based on whether you need work-distribution, fan-out, or replayable log
  • Default to at-least-once + consumer-side idempotency; reach for exactly-once only with reason
  • Always bound your queues; unbounded is a delayed OOM
  • Watch consumer lag as a first-class metric; it is often the earliest warning of an event-pipeline incident
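
As a sketch of what consumer lag means numerically: for each partition, lag is the log end offset minus the consumer group's committed offset. The functions below are hypothetical and work on plain dictionaries; fetching the real offsets from a broker is out of scope here. Treating a partition with no committed offset as starting from 0 is an assumption made for this example.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag = log end offset - committed offset.

    Assumption: a partition with no committed offset is treated as
    committed at 0, so its whole log counts as lag.
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


def lagging_partitions(lag: dict, threshold: int) -> list:
    """Return the partitions whose lag exceeds an alerting threshold."""
    return sorted(p for p, value in lag.items() if value > threshold)
```

For example, with end offsets `{0: 100, 1: 250, 2: 40}` and committed offsets `{0: 100, 1: 200}`, the lag is 0, 50, and 40 for partitions 0, 1, and 2. Alerting on the trend of these numbers, not only on their absolute value, tends to catch a stuck consumer earlier.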

Hands-On Labs

  1. Lab 3.1 — Kafka Event Pipeline End-to-End

    Build a producer/consumer pipeline with proper key-based partitioning and consumer groups.

    90 minutes - Intermediate

    • Spin up Kafka via docker-compose
    • Write a producer that emits orders keyed by user_id
    • Write two consumer groups (billing, audit) reading the same topic
    • Verify per-key ordering on each partition
    • Kill a broker and verify durability

    View lab files on GitHub

  2. Lab 3.2 — Backpressure in a Reactive Pipeline

    Reproduce a runaway producer; introduce bounded queues; observe stable throughput.

    60 minutes - Intermediate

    • Implement an unbounded in-memory queue between producer and consumer
    • Run with producer at 10x consumer rate; watch memory grow
    • Replace with a bounded queue
    • Observe blocking on the producer; system stabilises

    View lab files on GitHub

  3. Lab 3.3 — Idempotent Consumer with Dedup

    Design a consumer that applies the effect of each at-least-once message exactly once via idempotency keys.

    60 minutes - Intermediate

    • Use a Postgres table with unique constraint on idempotency_key
    • Process orders; on duplicate, ignore
    • Inject duplicate messages; verify only one effect per key

    View lab files on GitHub
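
The dedup approach in Lab 3.3 can be sketched as follows. This is not the lab's solution: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres, and the table names, column names, and message shape are assumptions. The idea carries over: the insert into the dedup table and the side effect share one transaction, so a redelivered message hits the unique constraint and nothing is applied twice.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory SQLite stands in for Postgres (assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed (idempotency_key TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE ledger (order_id TEXT NOT NULL, amount_cents INTEGER NOT NULL)"
    )
    return conn


def handle_order(conn: sqlite3.Connection, message: dict) -> bool:
    """Apply the message's effect once; return False for a duplicate."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "INSERT INTO processed (idempotency_key) VALUES (?)",
                (message["idempotency_key"],),
            )
            conn.execute(
                "INSERT INTO ledger (order_id, amount_cents) VALUES (?, ?)",
                (message["order_id"], message["amount_cents"]),
            )
        return True
    except sqlite3.IntegrityError:
        # Unique constraint violated: this key was already processed, ignore it.
        return False
```

Feeding the same message twice should return `True` and then `False`, leaving a single ledger row. The sketch only covers effects that live in the same database as the dedup table; effects in external systems need their own idempotency handling.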
