Case study · 2024
Yukti: a workable, end-to-end RAG stack
A complete RAG application built to be understood: ingestion, embeddings, retrieval, an evaluation harness, and a UI. Small enough to read, real enough to deploy.
Author · Maintainer
Stack
Python · FastAPI · ARQ · Unstructured.io · Docling · pgvector · Qdrant · LiteLLM · Vite · React · Tailwind · Docker
Outcomes
- End-to-end reference implementation people can clone, read, and ship from.
- Companion frontend and backend repos kept intentionally small so the architecture is the documentation.
- Multi-tenant RAG framework with semantic chunking, multi-LLM eval harness, and a real operator console.
What I owned
Architecture, code, docs, and review. Yukti is a deliberately small RAG application; its job is to be readable end-to-end so someone can understand what a production RAG system actually looks like (and not just the part the framework demo shows).
What shipped
A FastAPI-based backend that handles document ingestion, chunking, embedding, retrieval, and answer synthesis; a React frontend (built with Vite and Tailwind) that consumes that API; and the connective tissue (Docker, env files, eval scripts) you’d need to actually run it. Splitting the work into two repos, yukti and yukti-frontend, keeps each one focused.
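To make the backend loop concrete, here is a minimal sketch of the ingest, chunk, embed, retrieve, and prompt-assembly steps in plain Python. It is not Yukti's code: every function name is hypothetical, the bag-of-words "embedding" stands in for a real embedding model, the fixed word window stands in for semantic chunking, and the in-memory list stands in for a vector store such as pgvector or Qdrant. Answer synthesis would pass the assembled prompt to an LLM, which is left out here.

```python
import math
import re
from collections import Counter

# Hypothetical names throughout; this illustrates the shape of the loop,
# not Yukti's actual modules or APIs.

def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split text into fixed-size word windows (stand-in for semantic chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real stack would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def ingest(docs: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Chunk and embed every document into an in-memory index."""
    index = []
    for doc_id, text in docs.items():
        for piece in chunk(text):
            index.append((doc_id, piece, embed(piece)))
    return index

def retrieve(index: list[tuple[str, str, Counter]], question: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k chunks most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(index, key=lambda row: cosine(query_vec, row[2]), reverse=True)
    return [(doc_id, piece) for doc_id, piece, _ in ranked[:k]]

def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Assemble the prompt that would be sent to an LLM for answer synthesis."""
    context = "\n".join(f"[{doc_id}] {piece}" for doc_id, piece in hits)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"

docs = {
    "a": "pgvector stores embeddings inside Postgres tables",
    "b": "Tailwind is a utility-first CSS framework",
}
index = ingest(docs)
hits = retrieve(index, "Where are embeddings stored?", k=1)
prompt = build_prompt("Where are embeddings stored?", hits)
```

Each function maps to one stage named above, so replacing a stage (a real embedder, a real vector store) means replacing one function without touching the rest.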
Lessons
Most “RAG frameworks” are opinions disguised as APIs. The fastest way to teach RAG is to show the whole loop in code small enough to hold in your head, then let people swap out one piece at a time as they outgrow the defaults.
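One way to keep pieces swappable is to have the rest of the loop depend on a small interface rather than a concrete component. The sketch below assumes a hypothetical `Retriever` protocol with two toy implementations; none of these names come from Yukti, and a real replacement would wrap a vector store rather than a Python list.

```python
from typing import Protocol

class Retriever(Protocol):
    """Hypothetical interface: anything with a matching search() can be plugged in."""
    def search(self, query: str, k: int) -> list[str]: ...

class FirstKRetriever:
    """Trivial default: ignore the query and return the first k chunks."""
    def __init__(self, chunks: list[str]) -> None:
        self.chunks = chunks

    def search(self, query: str, k: int) -> list[str]:
        return self.chunks[:k]

class KeywordRetriever:
    """Drop-in replacement: rank chunks by how many query words they share."""
    def __init__(self, chunks: list[str]) -> None:
        self.chunks = chunks

    def search(self, query: str, k: int) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(
            self.chunks,
            key=lambda c: len(terms & set(c.lower().split())),
            reverse=True,
        )
        return ranked[:k]

def gather_context(question: str, retriever: Retriever, k: int = 1) -> str:
    """The caller depends only on the protocol, so retrievers swap freely."""
    return "\n".join(retriever.search(question, k))

chunks = ["Qdrant is a vector database", "FastAPI serves the HTTP API"]
baseline = gather_context("HTTP API", FirstKRetriever(chunks))
improved = gather_context("HTTP API", KeywordRetriever(chunks))
```

Because `gather_context` never names a concrete class, moving from the default to a better retriever is a one-line change at the call site.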