Retrieval & RAG
Retrieval-augmented generation past the basics — advanced architectures, graph RAG, retrieval as an agent tool, RAG security.
- Advanced RAG Architectures
  The naive→modular→agentic RAG spectrum and the levers that matter (CRAG, Self-RAG, query transformation, fusion, reranking), all attacking the same garbage-in/confident-wrong-out failure.
- GraphRAG & Multi-Hop Retrieval
  Why flat top-k RAG cannot answer thematic or relational multi-hop queries, how Microsoft GraphRAG and iterative retrieve-reason loops solve it, and the cost/staleness heuristic for when not to.
- Hybrid Search & Reranking
  Why one retriever is structurally not enough, reciprocal rank fusion across BM25 + dense, the two-stage retrieve-then-cross-encoder pattern, and ColBERT-style late interaction when cross-encoders are too slow.
- Document Parsing & Ingestion Quality
  Ingestion is the half of RAG that doesn't get dashboards and decides whether the answer was ever indexable: layout-aware parsing, tables, OCR, structural chunking, and vision-RAG as a parsing escape hatch.
- Query Understanding & Transformation
  Fixing the question before you search (rewriting, decomposition, multi-query, HyDE caveats, step-back prompting, and routing), the cheapest place to add intelligence to a RAG pipeline.
- Agentic Retrieval: Search as a Tool
  Retrieval as a tool the model calls iteratively in a ReAct-style loop, with budget, stopping criteria, and the new failure modes (looping, drift, premature stop) that come with handing the model the steering wheel.
- Evaluating RAG
  Score retrieval, grounding, and answer quality as three separate things (recall@k, faithfulness, answer relevance), plus how to build a small living eval set and use LLM judges without lying to yourself.
- Choosing a Vector Database
  A constraint-first selection guide: what a vector DB actually is, the axes that genuinely differ between products (ANN algorithm, filter quality, freshness, hybrid, multi-tenancy, ops, cost), just enough HNSW/IVF/DiskANN internals to read a vendor pitch, and why most teams end at pgvector.