
RAG Chatbot

Welcome to the RAG Chatbot documentation. This guide covers installation, configuration, and usage of the Retrieval-Augmented Generation (RAG) chatbot system.

Overview

RAG Chatbot is an intelligent document-based chat system that lets users ask questions and receive accurate answers grounded in your uploaded content. It combines the power of large language models (LLMs) with a searchable knowledge base built from your documents.

Key Features

  • Intelligent Q&A - Ask questions in natural language and get answers sourced from your documents
  • Multiple LLM Providers - Choose between OpenAI (GPT-5.1) and Anthropic (Claude) for chat responses
  • Document Support - Upload PDFs, Word documents, PowerPoints, spreadsheets, images, and audio files
  • Media Import - Import transcripts from YouTube, Vimeo, SoundCloud, and 1000+ other platforms
  • Hybrid Search - Combines semantic (AI) search with keyword matching for accurate results
  • Conversation Memory - Maintains context across long conversations with automatic summarization
  • Embeddable Widget - Drop-in chat widget for any website
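
The conversation-memory feature above can be sketched in a few lines: keep the most recent turns verbatim and fold older turns into a running summary so the prompt stays bounded. This is a minimal illustration of the idea, not RAG Chatbot's actual implementation; the `summarize` function is a placeholder standing in for a real LLM summarization call, and all names here are hypothetical.

```python
def summarize(turns):
    """Placeholder summarizer: a real system would call an LLM here."""
    return " / ".join(t["content"].rstrip(".") for t in turns)

class ConversationMemory:
    """Keep the last `max_recent` turns verbatim; summarize the rest."""

    def __init__(self, max_recent=4):
        self.max_recent = max_recent
        self.summary = ""
        self.recent = []

    def add(self, role, content):
        self.recent.append({"role": role, "content": content})
        if len(self.recent) > self.max_recent:
            # Fold the overflowing oldest turns into the running summary.
            overflow = self.recent[: -self.max_recent]
            self.recent = self.recent[-self.max_recent:]
            prior = [{"role": "summary", "content": self.summary}] if self.summary else []
            self.summary = summarize(prior + overflow)

    def prompt_context(self):
        parts = []
        if self.summary:
            parts.append(f"Summary of earlier conversation: {self.summary}")
        parts.extend(f"{t['role']}: {t['content']}" for t in self.recent)
        return "\n".join(parts)
```

The payoff is that prompt size stays roughly constant no matter how long the conversation runs, while earlier context survives in compressed form.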

How It Works

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Upload    │────▶│   Process    │────▶│   Store     │
│  Documents  │     │  & Chunk     │     │  Vectors    │
└─────────────┘     └──────────────┘     └─────────────┘
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Answer    │◀────│   Retrieve   │◀────│   Query     │
│   User      │     │   Context    │     │   Database  │
└─────────────┘     └──────────────┘     └─────────────┘
  1. Upload - Documents are uploaded through the web interface or API
  2. Process - The Docling service parses documents and extracts text
  3. Store - Text is split into chunks and stored with vector embeddings
  4. Query - User questions are converted to vectors and matched against stored content
  5. Retrieve - The most relevant chunks are retrieved using hybrid search
  6. Answer - An LLM generates a response based on the retrieved context
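
Steps 3 through 6 above can be sketched end to end in a few lines of Python. This is a toy illustration, not the system's code: the bag-of-words `embed` function stands in for a real embedding model, and the LLM call in `answer` is stubbed out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.chunks = []  # (text, vector) pairs

    def add(self, text):
        # Step 3: store each chunk alongside its embedding.
        self.chunks.append((text, embed(text)))

    def retrieve(self, query, k=2):
        # Steps 4-5: embed the query and rank chunks by similarity.
        qv = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(qv, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def answer(query, store):
    # Step 6: in the real system an LLM generates the response from this context.
    context = "\n".join(store.retrieve(query))
    return f"Answer based on:\n{context}"
```

In production the vectors live in PostgreSQL with pgvector rather than an in-memory list, but the shape of the loop is the same.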

System Requirements

Component   Minimum           Recommended
CPU         4 cores           8+ cores
RAM         8 GB              16+ GB
Storage     50 GB SSD         200+ GB SSD
GPU         None (CPU mode)   NVIDIA with 8+ GB VRAM

Supported Platforms

  • Operating System: AlmaLinux 9, RHEL 9, Rocky Linux 9, Ubuntu 22.04+
  • Cloud: AWS EC2, Google Cloud Compute Engine, Azure VMs
  • Container: Docker, Podman

Quick Start

For a quick installation, follow these steps:

  1. Install the system requirements on AlmaLinux 9
  2. Set up PostgreSQL 16 with pgvector
  3. Configure the application
  4. Deploy the Docling service
  5. Start using the chat interface

Architecture

The system consists of three main components:

PHP Application

The main web application handles:

  • Chat API endpoints
  • Document upload and management
  • User session management
  • Communication with LLM providers

PostgreSQL Database

Stores all persistent data:

  • Document metadata and content chunks
  • Vector embeddings for semantic search
  • Chat session history
  • Full-text search indexes
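
Hybrid search means the database returns two ranked lists per query: one from vector similarity and one from full-text search. A common way to merge such lists is Reciprocal Rank Fusion (RRF); whether RAG Chatbot fuses results exactly this way is an assumption, so treat this as a sketch of the general technique rather than the system's code.

```python
def rrf(ranked_lists, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in, so
    chunks ranked highly by multiple retrievers float to the top. k=60 is
    the conventional damping constant from the RRF literature.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, fusing a semantic ranking `["c2", "c1", "c3"]` with a keyword ranking `["c1", "c2", "c4"]` puts `c1` and `c2` ahead of the chunks that only one retriever found.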

Docling Service

A Python microservice that handles:

  • Document parsing (PDF, DOCX, images, etc.)
  • OCR for scanned documents
  • Table and layout extraction
  • Media downloading and transcription
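
The text Docling extracts feeds the chunking step described under How It Works. A minimal sketch of one common approach, fixed-size word windows with overlap so context is not lost at chunk boundaries, is below; the sizes are illustrative defaults, not the system's actual parameters.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into chunks of `chunk_size` words, each sharing
    `overlap` words with the previous chunk."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start : start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk then gets its own embedding and full-text index entry, which is why chunk size is a trade-off: smaller chunks retrieve more precisely, larger chunks carry more context into the prompt.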

Support

If you encounter issues:

  1. Check the Troubleshooting guide
  2. Review application logs in /var/www/chatbot/logs/
  3. Use the Debug RAG endpoint to diagnose search issues

Next Steps

Ready to get started? Head to the Installation Guide to set up the system on AlmaLinux 9.