Retrieval & RAG

Retrieval-augmented generation past the basics: advanced architectures, graph RAG, retrieval as an agent tool, and RAG security.

Advanced RAG Architectures

The naive→modular→agentic RAG spectrum and the levers that matter: CRAG, Self-RAG, query transformation, fusion, and reranking, all attacking the same failure mode of garbage in, confidently wrong answer out.

GraphRAG & Multi-Hop Retrieval

Why flat top-k RAG cannot answer thematic or relational multi-hop queries, how Microsoft GraphRAG and iterative retrieve-reason loops address them, and a cost/staleness heuristic for when not to use either.

Hybrid Search & Reranking

Why a single retriever is structurally not enough, reciprocal rank fusion across BM25 and dense retrieval, the two-stage retrieve-then-rerank pattern with a cross-encoder, and ColBERT-style late interaction for when cross-encoders are too slow.
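As a rough illustration of the fusion step mentioned here, the sketch below combines two ranked lists with reciprocal rank fusion. The document IDs and the two hit lists are made up for the example, and `k=60` is only the commonly cited default, not a value taken from this guide.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs; each list contributes 1 / (k + rank) per doc."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a BM25 retriever and a dense retriever.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d9", "d3"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because only ranks are used, the two retrievers' raw scores never need to be put on a common scale, which is the usual argument for this kind of fusion over weighted score sums.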

Document Parsing & Ingestion Quality

Ingestion is the half of RAG that gets no dashboards yet decides whether the answer was ever indexable: layout-aware parsing, tables, OCR, structural chunking, and vision-RAG as a parsing escape hatch.

Query Understanding & Transformation

Fixing the question before you search: rewriting, decomposition, multi-query, HyDE caveats, step-back prompting, and routing. This is the cheapest place to add intelligence to a RAG pipeline.

Agentic Retrieval: Search as a Tool

Retrieval as a tool the model calls iteratively in a ReAct-style loop, with budget, stopping criteria, and the new failure modes (looping, drift, premature stop) that come with handing the model the steering wheel.

Evaluating RAG

Score retrieval, grounding, and answer quality as three separate things (recall@k, faithfulness, and answer relevance), plus how to build a small living eval set and use LLM judges without lying to yourself.
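For the retrieval side, recall@k is simple enough to sketch directly. The version below assumes each eval item has a known set of relevant document IDs; the function name and the sample IDs are illustrative, not taken from this guide.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc IDs that appear in the top-k retrieved IDs."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("recall@k is undefined when there are no relevant documents")
    top_k = set(retrieved[:k])
    return len(top_k & relevant_set) / len(relevant_set)


# Hypothetical eval item: ranked retriever output and the labelled relevant docs.
retrieved = ["d4", "d1", "d8", "d2"]
relevant = {"d1", "d2"}

print(recall_at_k(retrieved, relevant, k=3))
print(recall_at_k(retrieved, relevant, k=4))
```

A low recall@k means the answer was never in the context, so no amount of prompt or generator tuning can recover it, which is one reason to score retrieval separately from faithfulness and answer relevance.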

Choosing a Vector Database

A constraint-first selection guide: what a vector DB actually is, the axes that genuinely differ between products (ANN algorithm, filter quality, freshness, hybrid search, multi-tenancy, ops, cost), just enough HNSW/IVF/DiskANN internals to read a vendor pitch, and why most teams end up on pgvector.