Skip to main content

Module 1: Introduction to AI & RAG Systems Slides

Slide walkthrough for Module 1 of Production-Grade RAG Systems Engineering: LLM fundamentals, hallucinations, and why retrieval-augmented generation...

This slide page is the visual review companion for the full course module. Use it to recap the architecture, examples, exercises, production warnings, and takeaways after reading the lesson.

Slide Outline

  1. Introduction to AI & RAG Systems - LLM fundamentals, hallucinations, and why retrieval-augmented generation changes everything
  2. Learning Objectives - 4 outcomes for this module
  3. Why This Module Matters - Every AI application that needs to answer questions about specific data — company docs, product manuals, legal contracts
  4. How LLMs Work (The 5-Minute Version) - Lesson section from the full module
  5. Tokens and Context Windows - Lesson section from the full module
  6. Why LLMs Hallucinate - Lesson section from the full module
  7. What RAG Changes - Lesson section from the full module
  8. Types of RAG Systems - Lesson section from the full module
  9. Real-World Use Cases - Customer support bots answering from product documentation, Legal AI searching case law and contracts
  10. Common Mistakes to Avoid - 4 mistakes covered
  11. Hands-On Labs - 2 hands-on labs
  12. Key Takeaways - 5 points to remember

Learning Objectives

  • Understand how LLMs work at a high level
  • Learn about tokens, context windows, and their limitations
  • Understand why LLMs hallucinate and how RAG solves it
  • Compare vanilla LLM vs RAG responses

Why This Module Matters

Every AI application that needs to answer questions about specific data — company docs, product manuals, legal contracts, medical records — needs RAG. Without it, your chatbot confidently makes things up. With it, your chatbot cites real sources. This is the foundation of every production AI system.

Common Mistakes

  • Building a chatbot without RAG and hoping the LLM knows your domain
  • Stuffing the entire document into the prompt instead of retrieving relevant chunks
  • Ignoring context window limits — overfilling the prompt degrades quality
  • Not evaluating retrieval quality — bad retrieval means bad answers regardless of the model

Key Takeaways

  • LLMs predict tokens based on training data — they do not know facts
  • Hallucinations happen because the model generates plausible text without verification
  • RAG retrieves relevant documents and injects them into the prompt before generation
  • Context windows limit how much data you can include — retrieval selects the most relevant
  • Three RAG levels: naive (demo), advanced (production), agentic (autonomous)

Hands-On Labs

  1. Run Your First LLM Application

    Build a simple Python app that calls an LLM API.

    20 min - Beginner

    • Install the Anthropic Python SDK
    • Send a basic prompt to Claude
    • Observe the response and token usage
    • Ask a question about recent events and observe hallucination

    View lab files on GitHub

  2. Compare Vanilla LLM vs RAG

    See the difference RAG makes on answer quality.

    25 min - Beginner

    • Ask the LLM a domain-specific question (without context)
    • Provide the same question with relevant document context
    • Compare accuracy, citations, and confidence
    • Discuss when RAG is necessary vs when vanilla LLM suffices

    View lab files on GitHub

Read the full module | Back to course curriculum