Module 6 of 16

Building Basic RAG Systems

The complete retrieve-augment-generate pipeline with source attribution and citations

3.5 hours · 2 labs · Free

Learning objectives

  • Build a complete RAG pipeline from scratch
  • Implement context injection and prompt augmentation
  • Add source attribution and citations
  • Handle edge cases: no results, conflicting sources, long context

Figure: RAG pipeline (retrieve → augment → generate). User query (natural language) → embed query (same model as docs) → retrieve top-K (vector DB search) → augment prompt (inject context) → generate (LLM + citations). The result is an answer grounded in your documents, with source citations: not hallucinated, verifiable, domain-specific.

This is the module where everything comes together. You build the complete RAG pipeline: take a user question, embed it, retrieve relevant chunks, inject them into the prompt, and generate a grounded answer with citations.

The RAG Pipeline

import anthropic
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer('all-MiniLM-L6-v2')
qdrant = QdrantClient(url="http://localhost:6333")
claude = anthropic.Anthropic()

def rag_answer(question: str) -> dict:
    # 1. Embed the query
    query_vector = embedder.encode(question).tolist()

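    # 2. Retrieve relevant chunks
    results = qdrant.search(collection_name="docs", query_vector=query_vector, limit=5)

    # No results: return a fixed reply rather than letting the model fill the gap
    if not results:
        return {"answer": "I don't have information about this.", "sources": []}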

    # 3. Build context from retrieved chunks
    context_chunks = []
    for r in results:
        context_chunks.append(f"[Source: {r.payload['title']}]\n{r.payload['content']}")
    context = "\n\n---\n\n".join(context_chunks)

    # 4. Augment prompt with context
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="Answer using ONLY the provided context. Cite sources. If the context does not contain the answer, say so.",
        messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
    )

    return {
        "answer": response.content[0].text,
        "sources": [r.payload['title'] for r in results],
    }

Source Attribution

Production RAG must cite its sources. This builds user trust and enables verification. Include source titles, page numbers, and relevance scores in the response.
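
One way to assemble that source list is sketched below. It is a minimal illustration, not part of the pipeline above: `Hit` is a hypothetical stand-in for a search result that carries a `score` and a `payload`, and the `page` payload field is an assumption (the pipeline's payload only shows `title` and `content`). The function keeps one record per title and page, with the best score seen.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a vector DB search result."""
    score: float
    payload: dict


def build_sources(hits: list[Hit]) -> list[dict]:
    """Return one citation record per (title, page), highest score first."""
    best: dict[tuple, dict] = {}
    for h in hits:
        title = h.payload.get("title", "untitled")
        page = h.payload.get("page")  # assumed field; None when absent
        key = (title, page)
        # Several chunks can come from the same page: keep the best score.
        if key not in best or h.score > best[key]["score"]:
            best[key] = {"title": title, "page": page, "score": round(h.score, 3)}
    return sorted(best.values(), key=lambda s: s["score"], reverse=True)
```

Returning structured records instead of bare titles lets a client show the score next to each citation and link to the exact page.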

Edge Cases

  • No relevant results: When retrieval returns nothing above the similarity threshold, say "I don't have information about this" instead of hallucinating
  • Conflicting sources: When retrieved documents disagree, present both perspectives with citations
  • Context overflow: When retrieved chunks exceed the context window, prioritize by relevance score
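
The first and third cases can be handled before the LLM is ever called. The sketch below assumes hits arrive as `(score, text)` tuples and uses character count as a rough proxy for tokens. The function names, the threshold of 0.3, and the budget of 8000 characters are illustrative assumptions to tune for your own embedder and model, not recommended values.

```python
NO_ANSWER = "I don't have information about this."


def select_chunks(hits, min_score=0.3, max_chars=8000):
    """Keep chunks at or above the threshold, most relevant first, within budget.

    hits: list of (score, text) tuples (assumed shape).
    The budget counts chunk text only, not separators or the question.
    """
    relevant = [h for h in hits if h[0] >= min_score]
    relevant.sort(key=lambda h: h[0], reverse=True)

    selected, used = [], 0
    for score, text in relevant:
        if used + len(text) > max_chars:
            break  # stop at the first chunk that does not fit
        selected.append(text)
        used += len(text)
    return selected


def build_context(hits, min_score=0.3, max_chars=8000):
    """Return the joined context, or None when nothing relevant was found."""
    chunks = select_chunks(hits, min_score, max_chars)
    if not chunks:
        return None  # caller should reply with NO_ANSWER and skip the LLM call
    return "\n\n---\n\n".join(chunks)
```

When `build_context` returns `None`, the caller can return `NO_ANSWER` directly, which avoids both the cost of a model call and the risk of a made-up answer.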

Common mistakes

What usually breaks

  • Not setting a similarity threshold — returning irrelevant chunks degrades quality
  • Including too many chunks — more context is not always better, it dilutes focus
  • Not instructing the model to cite sources — users cannot verify answers
  • Forgetting to handle the "no results" case — the model will hallucinate to fill the gap

Key terms

Vocabulary used in this module

Context Injection

Adding retrieved document chunks to the LLM prompt

Source Attribution

Citing which documents the answer was based on

Similarity Threshold

Minimum relevance score for a chunk to be included

Labs

Hands-on labs

40 min · Intermediate

Build a Complete RAG Chatbot

Build an end-to-end RAG system with FastAPI.

  1. Ingest a document corpus into Qdrant
  2. Build the retrieve-augment-generate pipeline
  3. Expose as a FastAPI endpoint
  4. Test with domain-specific questions
View lab on GitHub

25 min · Intermediate

Add Citations and Source Attribution

Make your RAG system cite its sources.

  1. Include source metadata in the prompt
  2. Parse citations from the LLM response
  3. Return sources with relevance scores
  4. Handle "no relevant information" gracefully
View lab on GitHub
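
For step 2 of this lab, one possible approach is sketched below. It assumes the model cites with the same `[Source: Title]` label used when building the context, which depends on your prompt and is not guaranteed. The function name `parse_citations` and the returned keys are hypothetical.

```python
import re

# Matches labels such as "[Source: Refund Policy]" and captures the title.
CITATION_RE = re.compile(r"\[Source:\s*([^\]]+?)\s*\]")


def parse_citations(answer: str, retrieved_titles: list[str]) -> dict:
    """Split cited titles into ones that were retrieved and ones that were not."""
    cited = []
    for match in CITATION_RE.finditer(answer):
        title = match.group(1)
        if title not in cited:  # keep first-mention order, drop repeats
            cited.append(title)

    known = set(retrieved_titles)
    return {
        "cited": [t for t in cited if t in known],
        # A title the retriever never returned suggests an invented citation.
        "unverified": [t for t in cited if t not in known],
    }
```

Checking citations against the retrieved titles catches the case where the model names a source that was never in its context.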

Recap

Key takeaways

  • RAG pipeline: embed query → retrieve chunks → augment prompt → generate answer
  • Always include "answer ONLY from context" in the system prompt to reduce hallucination
  • Source attribution builds trust — cite document title, section, and relevance score
  • Handle edge cases: no results, conflicting sources, context overflow
  • This basic pipeline is the foundation — advanced techniques (Module 7+) improve quality
