I Built a Multi-Agent RAG System That Fact-Checks Its Own Answers — Here's How

Toheed Asghar

DocForge is an open-source multi-agent RAG system built with LangGraph that routes queries, retrieves documents, synthesizes answers, and validates them — all automatically. Learn how four AI agents work together to catch and correct hallucinations.

Every RAG system has the same Achilles' heel: hallucination. You ask a question, it retrieves some documents, and the LLM confidently generates an answer that sounds right but is subtly wrong. No warning, no citation, no second opinion.

I spent weeks building a system that fixes this. DocForge is an open-source multi-agent RAG pipeline where four specialized AI agents collaborate — and one of them exists solely to fact-check the others.

In this post, I'll walk you through the architecture, the problems it solves, and how you can run it yourself.

GitHub: ToheedAsghar / DocForge

A RAG pipeline that doesn't trust its own answers. 4 AI agents collaborate to route queries, retrieve docs, synthesize answers, and catch hallucinations automatically.

DocForge

A Multi-Agent Retrieval-Augmented Generation (RAG) system built with LangGraph, featuring intelligent query routing, adaptive retrieval, fact-checking with automatic retry logic, and a FastAPI backend


Key Features

Multi-Agent Architecture

  • Routing Agent — Classifies query complexity (simple lookup / complex reasoning / multi-hop) and generates an optimized search query for the vector database
  • Retrieval Agent — Adaptive document retrieval (3-10 docs based on complexity, with relaxed thresholds on retries)
  • Analysis Agent — Synthesizes coherent, cited answers from multiple sources using chain-of-thought reasoning
  • Validation Agent — Fact-checks every claim against source documents, identifies hallucinations, and corrects the answer if needed

Intelligent Workflow

  • Confidence-based validation skip — When retrieval scores are high, sources are sufficient, and no information gaps exist, validation is skipped entirely for faster responses
  • Automatic retry with adaptive strategy — On validation failure, the system retries retrieval with 50% more documents and a relaxed relevance threshold (up to 3 attempts)

Why Traditional RAG Falls Short

A standard RAG pipeline is straightforward: embed a query, retrieve similar chunks from a vector database, and pass them to an LLM to generate an answer. It works — until it doesn't.
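
For reference, a bare-bones pipeline of that kind looks roughly like the sketch below. This is a generic baseline, not DocForge's code; the index name and model names are borrowed from the configuration shown later in the post.

# A minimal "naive RAG" baseline: embed, retrieve, generate. No routing,
# no citations, no verification. This is the pattern DocForge improves on.
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_google_genai import ChatGoogleGenerativeAI

store = PineconeVectorStore(
    index_name="techdoc-intelligence",  # index name taken from the config below
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
)
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash")

def naive_rag(question: str, k: int = 4) -> str:
    chunks = store.similarity_search(question, k=k)
    context = "\n\n".join(c.page_content for c in chunks)
    prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
    # Whatever the model says is returned as-is: no fact-check, no retry
    return llm.invoke(prompt).content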

Here are the failure modes I kept hitting:

  1. No query understanding — A simple factual lookup and a complex multi-hop question both get the same retrieval strategy
  2. Fixed retrieval — Always fetching the same number of documents regardless of question complexity
  3. No verification — The LLM's answer is accepted as-is, even when it contradicts or fabricates information beyond the source documents
  4. No recovery — When retrieval fails to find relevant documents, the system has no mechanism to retry with a different strategy

DocForge addresses every one of these with a multi-agent architecture.


The Architecture: Four Agents, One Pipeline

DocForge is built on LangGraph, which orchestrates four specialized agents into a stateful workflow:

User Query
    │
    ▼
┌─────────────────┐
│   Redis Cache    │ ◄── Check cache first (SHA-256 key)
└────────┬────────┘
         │ (cache miss)
         ▼
┌─────────────────┐
│  Routing Agent   │ ◄── Classify complexity, optimize search query
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Retrieval Agent  │ ◄── Fetch 3–10 docs from Pinecone
└────────┬────────┘     (50% more on retry, relaxed threshold)
         │
         ▼
┌─────────────────┐
│ Analysis Agent   │ ◄── Synthesize cited answer (chain-of-thought)
└────────┬────────┘
         │
         ▼
    Confidence Check
    │
    ├── High confidence ──▶ Skip validation ──▶ Return & Cache
    │
    └── Otherwise:
         │
         ▼
    ┌─────────────────┐
    │ Validation Agent │ ◄── Fact-check every claim
    └────────┬────────┘
             │
             ▼
        ├── Valid               ──▶ Return & Cache
        ├── Invalid (< 3 tries) ──▶ Retry from Retrieval (adaptive)
        └── Invalid (≥ 3 tries) ──▶ Return corrected answer & Cache
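
If you haven't used LangGraph before, the sketch below shows how a workflow like this can be wired together. The node names, state fields, and stub bodies are illustrative; DocForge's real graph lives in the backend.agents.graph module imported later in the post.

from typing import List, TypedDict

from langgraph.graph import END, START, StateGraph


class PipelineState(TypedDict):
    # Shared state passed between agents; field names here are illustrative
    query: str
    retrieved_chunks: List[str]
    draft_answer: str
    confidence_high: bool
    validation_passed: bool
    retry_count: int


# Each node is a plain function that reads the state and returns the fields it updates.
def route(state: PipelineState) -> dict:
    return {}                                     # classify the query, rewrite it for search

def retrieve(state: PipelineState) -> dict:
    return {"retrieved_chunks": []}               # query Pinecone, widening on retries

def analyze(state: PipelineState) -> dict:
    return {"draft_answer": "", "confidence_high": False}   # cited synthesis

def validate(state: PipelineState) -> dict:
    return {"validation_passed": True,            # claim-by-claim fact check
            "retry_count": state.get("retry_count", 0) + 1}


graph = StateGraph(PipelineState)
graph.add_node("router", route)
graph.add_node("retriever", retrieve)
graph.add_node("analyzer", analyze)
graph.add_node("validator", validate)

graph.add_edge(START, "router")
graph.add_edge("router", "retriever")
graph.add_edge("retriever", "analyzer")

# Confidence gate: high-confidence answers skip validation entirely
graph.add_conditional_edges(
    "analyzer",
    lambda s: "skip" if s["confidence_high"] else "check",
    {"skip": END, "check": "validator"},
)

# Failed validation loops back to retrieval, up to 3 attempts
graph.add_conditional_edges(
    "validator",
    lambda s: "done" if s["validation_passed"] or s["retry_count"] >= 3 else "retry",
    {"done": END, "retry": "retriever"},
)

app = graph.compile()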

Let me break down each agent.

1. Routing Agent — The Dispatcher

Not all questions are equal. "What is LangGraph?" is a simple lookup. "Compare the tradeoffs of LangGraph vs. CrewAI for multi-agent orchestration" requires complex reasoning across multiple sources.

The Routing Agent classifies every incoming query into one of three types:

  • Simple lookup — Direct factual questions (retrieves 3 documents)
  • Complex reasoning — Questions requiring synthesis across sources (retrieves 7 documents)
  • Multi-hop — Questions that chain multiple pieces of information (retrieves 10 documents)

It also rewrites the user's natural-language query into an optimized search query for the vector database, improving retrieval relevance.
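
As a rough sketch, a router like this can be a single structured-output LLM call. The schema and prompt wording below are mine, not DocForge's, and I'm assuming a Gemini key is configured as described later in the post.

from typing import Literal

from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field


class RoutingDecision(BaseModel):
    # Illustrative output schema for the router
    query_type: Literal["simple_lookup", "complex_reasoning", "multi_hop"]
    search_query: str = Field(description="Query rewritten for vector search")


# Retrieval budget per query type, matching the counts listed above
DOCS_PER_TYPE = {"simple_lookup": 3, "complex_reasoning": 7, "multi_hop": 10}

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-lite")
router = llm.with_structured_output(RoutingDecision)

decision = router.invoke(
    "Classify this query and rewrite it for vector search: "
    "Compare the tradeoffs of LangGraph vs. CrewAI for multi-agent orchestration"
)
print(decision.query_type, "->", DOCS_PER_TYPE[decision.query_type], "documents")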

2. Retrieval Agent — Adaptive Search

Based on the routing classification, the Retrieval Agent queries Pinecone with the appropriate number of documents and relevance threshold.

The key innovation here is adaptive retry. If the Validation Agent later rejects the answer, retrieval reruns with:

  • 50% more documents than the previous attempt
  • A relaxed relevance threshold to cast a wider net

This means the system self-corrects when initial retrieval wasn't sufficient.
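
The retry arithmetic is simple enough to sketch in a few lines. The 50% growth and the per-type document counts come from the post above; the base threshold and how much it relaxes per retry are assumptions on my part.

def retrieval_params(base_k: int, base_threshold: float, attempt: int) -> tuple[int, float]:
    """Widen the search on each retry: 50% more documents and a relaxed
    relevance threshold (the 0.1 decrement is an illustrative value)."""
    k = round(base_k * 1.5 ** attempt)
    threshold = max(0.0, base_threshold - 0.1 * attempt)
    return k, threshold

print(retrieval_params(7, 0.75, 0))  # first attempt for a complex query: (7, 0.75)
print(retrieval_params(7, 0.75, 1))  # after a validation failure: (10, 0.65)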

3. Analysis Agent — The Synthesizer

The Analysis Agent takes the retrieved document chunks and synthesizes a coherent, cited answer using chain-of-thought reasoning. Every claim in the answer is tied back to a specific source document.
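
I don't know the exact prompt DocForge uses, but a synthesis prompt in this style typically looks something like the sketch below, which is what the citation and chain-of-thought behavior described above implies.

ANALYSIS_PROMPT = """You are answering a question using only the numbered sources below.

Think step by step:
1. Decide which sources are relevant to the question.
2. Pull out the specific facts you need from each one.
3. Write a coherent answer, citing sources inline as [1], [2], etc.

Do not state anything that is not supported by a source. If the sources
do not contain the answer, say so and list the information gaps.

Question: {question}

Sources:
{numbered_sources}
"""

def format_sources(chunks: list[str]) -> str:
    # Number each retrieved chunk so the model can cite it as [n]
    return "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))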

4. Validation Agent — The Fact-Checker

This is the agent that makes DocForge different. The Validation Agent independently fact-checks every claim in the synthesized answer against the source documents. It:

  • Identifies unsupported claims
  • Detects hallucinated information
  • Flags contradictions with sources
  • Provides a corrected answer when issues are found

If validation fails, the system retries from retrieval with an adaptive strategy — up to 3 attempts. If it still fails after maximum retries, it returns the best corrected answer it has.
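
As a sketch, the validator's output can be modeled as a small structured report, and the branching described above reduces to a few lines. The field names are hypothetical, not DocForge's actual schema.

from typing import List, Literal

from pydantic import BaseModel, Field


class ValidationReport(BaseModel):
    # Hypothetical shape of the fact-checker's structured output
    verdict: Literal["valid", "invalid"]
    unsupported_claims: List[str] = Field(default_factory=list)
    contradictions: List[str] = Field(default_factory=list)
    corrected_answer: str = ""


def next_step(report: ValidationReport, retry_count: int, max_retries: int = 3) -> str:
    """Accept the answer, retry retrieval with a wider net, or fall back to
    the corrected answer once the retry budget is spent."""
    if report.verdict == "valid":
        return "return_answer"
    if retry_count < max_retries:
        return "retry_retrieval"
    return "return_corrected_answer"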


Smart Optimizations That Matter in Production

Building a multi-agent system that's correct is one thing. Making it fast and cost-effective is another.

Confidence-Based Validation Skip

Not every answer needs fact-checking. When all three conditions are met, DocForge skips the Validation Agent entirely:

  • Retrieval scores are above 0.85
  • At least 3 source documents were used
  • No information gaps were detected

This saves 30–40% latency on high-confidence queries.
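
In code, the gate is just a conjunction of the three checks. Whether "above 0.85" applies to the minimum or the average retrieval score is an implementation detail I'm guessing at; this sketch uses the minimum.

def can_skip_validation(retrieval_scores: list[float],
                        num_sources: int,
                        information_gaps: list[str]) -> bool:
    # Skip the Validation Agent only when all three conditions hold
    return (
        bool(retrieval_scores)
        and min(retrieval_scores) > 0.85   # every retrieved chunk scored highly
        and num_sources >= 3               # enough independent sources were used
        and not information_gaps           # the Analysis Agent reported no gaps
    )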

Redis Caching

Every query result is cached in Redis with a SHA-256 key and 1-hour TTL. Repeated queries return instantly — roughly 10x faster than a fresh pipeline run, with zero token cost.
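
A minimal sketch of that caching layer using the redis-py client; the "docforge:" key prefix is my own invention.

import hashlib
import json

import redis

client = redis.Redis.from_url("redis://localhost:6379")
TTL_SECONDS = 3600  # 1-hour TTL, as described above

def cache_key(query: str) -> str:
    # SHA-256 of the normalized query text (the key prefix is illustrative)
    return "docforge:" + hashlib.sha256(query.strip().lower().encode()).hexdigest()

def get_cached(query: str) -> dict | None:
    raw = client.get(cache_key(query))
    return json.loads(raw) if raw else None

def set_cached(query: str, result: dict) -> None:
    client.setex(cache_key(query), TTL_SECONDS, json.dumps(result))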

Task-Specific Model Selection

Different agents need different capabilities. DocForge lets you assign different models per task:

# Fast, cheap model for simple routing decisions
GEMINI_ROUTING_MODEL=gemini-2.0-flash-lite

# More capable model for complex synthesis and validation
GEMINI_ANALYSIS_MODEL=gemini-2.5-flash
GEMINI_VALIDATION_MODEL=gemini-2.5-flash

This cuts token costs by 40–50% compared to using a single expensive model for everything.
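
Since the stack lists Pydantic Settings, the per-task model names presumably flow from those environment variables into each agent. A sketch of what that mapping could look like (field names mirror the variables above):

from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Field names match the environment variables shown above
    gemini_routing_model: str = "gemini-2.0-flash-lite"
    gemini_analysis_model: str = "gemini-2.5-flash"
    gemini_validation_model: str = "gemini-2.5-flash"


settings = Settings()

# Cheap model for routing, more capable models for synthesis and fact-checking
routing_llm = ChatGoogleGenerativeAI(model=settings.gemini_routing_model)
analysis_llm = ChatGoogleGenerativeAI(model=settings.gemini_analysis_model)
validation_llm = ChatGoogleGenerativeAI(model=settings.gemini_validation_model)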

Dual LLM Provider Support

DocForge supports both OpenAI GPT (via OpenRouter) and Google Gemini. Switch providers with a single environment variable:

LLM_PROVIDER=gemini  # or "gpt"
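
A provider switch like that usually boils down to a small factory. This is a sketch, not DocForge's code, and the OPENROUTER_API_KEY variable name is my assumption.

import os

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI


def make_llm(model: str):
    """Return a chat model for whichever provider LLM_PROVIDER selects."""
    if os.getenv("LLM_PROVIDER", "gemini") == "gemini":
        return ChatGoogleGenerativeAI(model=model)
    # GPT models go through OpenRouter's OpenAI-compatible endpoint
    return ChatOpenAI(
        model=model,
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],  # hypothetical variable name
    )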

Getting Started in 5 Minutes

Prerequisites

You'll need Python 3, a Pinecone account and API key, a Gemini API key (or an OpenRouter key if you'd rather use GPT), and a local Redis instance if you want caching enabled.

Installation

git clone https://github.com/ToheedAsghar/DocForge.git
cd DocForge
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configuration

Create a .env file:

LLM_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-key
PINECONE_API_KEY=your-pinecone-key
PINECONE_ENVIRONMENT=us-east-1
PINECONE_INDEX_NAME=techdoc-intelligence
REDIS_URL=redis://localhost:6379
CACHE_ENABLED=true

Ingest Your Documents

from backend.ingestion.pipeline import ingest_documents

stats = ingest_documents("./documents/", chunk_size=1000, chunk_overlap=200)
print(f"Ingested {stats['documents_loaded']} PDFs → {stats['chunks_created']} chunks")

Query the System

from backend.agents.graph import run_graph

result = run_graph("What is LangGraph?")

print(result["fact_checked_answer"])
print(f"Sources: {len(result['retrieved_chunks'])} documents")
print(f"Validation: {'passed' if result['validation_passed'] else 'corrected'}")
print(f"Latency: {result['latency_ms']:.0f}ms")

Or Use the REST API

uvicorn backend.main:app --host 0.0.0.0 --port 8000

curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is LangGraph?"}'

Docker is also supported:

docker-compose up

The Tech Stack

Component             Technology
Agent Orchestration   LangGraph
LLM Providers         OpenAI GPT-4o-mini (via OpenRouter), Google Gemini 2.5 Flash
Embeddings            OpenAI text-embedding-3-small (1536 dims)
Vector Database       Pinecone (serverless, cosine similarity)
Caching               Redis (SHA-256 keys, 1-hour TTL)
API Framework         FastAPI
LLM Framework         LangChain
Configuration         Pydantic Settings
Containerization      Docker + Docker Compose

What I Learned Building This

A few takeaways from building a multi-agent RAG system:

1. Validation is worth the latency cost. In my testing, the Validation Agent caught hallucinated claims in roughly 15–20% of responses. That's as many as 1 in 5 answers that would have been wrong without it.

2. Adaptive retry is better than aggressive retrieval. Instead of always retrieving 10+ documents (slow, expensive, noisy), start small and retry with more only when needed. Most queries are answered well with 3–5 documents.

3. Caching is a multiplier. In any production Q&A system, users ask similar questions repeatedly. Redis caching turned repeated queries from 3–5 second operations into sub-100ms responses.

4. Different tasks need different models. Routing a query is a simple classification task — it doesn't need GPT-4. Synthesizing a multi-source answer does. Task-specific model assignment is an easy win for cost optimization.


What's Next

DocForge is actively being developed. Here's what's on the roadmap:

  • Support for more document formats (DOCX, TXT, Markdown, HTML)
  • Conversation history and multi-turn chat
  • A frontend UI for non-technical users
  • Multi-tenancy support
  • Deployment guides for AWS, Railway, and Render

Try It Out

DocForge is fully open-source under the MIT license. If you're building a RAG system and tired of hallucinated answers, give it a spin:

GitHub: github.com/ToheedAsghar/DocForge

If you found this useful, a star on the repo would mean a lot. I'm also happy to answer questions in the comments — whether it's about the architecture, LangGraph, or multi-agent systems in general.


Built by Toheed Asghar with LangGraph, LangChain, Pinecone, and FastAPI.