Configuration¶
The RAG Chatbot is configured through environment variables in the .env file. This guide covers all available settings.
Environment File Setup¶
Copy the example configuration:
Edit the configuration:
Database Settings¶
Configure the PostgreSQL connection:
# Database host (localhost for same server)
DB_HOST=localhost
# PostgreSQL port (default: 5432)
DB_PORT=5432
# Database name
DB_NAME=ragdb
# Database user
DB_USER=raguser
# Database password (use a strong password!)
DB_PASSWORD=your_secure_password
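For orientation, the host, port, and name settings above map onto a PDO-style connection string. The sketch below assembles one in shell and rejects an unusable port; the function name build_pg_dsn is hypothetical and is not part of the application.

```shell
# Hypothetical helper: build a PDO-style PostgreSQL DSN from the DB_* settings.
# Prints the DSN, or returns 1 if the port is not an integer in 1..65535.
build_pg_dsn() {
    dsn_host=$1
    dsn_port=$2
    dsn_name=$3
    case $dsn_port in
        # Reject empty values, non-digits, and anything longer than 5 digits.
        ''|*[!0-9]*|??????*)
            echo "invalid port: $dsn_port" >&2
            return 1
            ;;
    esac
    if [ "$dsn_port" -lt 1 ] || [ "$dsn_port" -gt 65535 ]; then
        echo "port out of range: $dsn_port" >&2
        return 1
    fi
    printf 'pgsql:host=%s;port=%s;dbname=%s\n' "$dsn_host" "$dsn_port" "$dsn_name"
}
```

Calling build_pg_dsn "$DB_HOST" "$DB_PORT" "$DB_NAME" with the values shown above should print a string suitable for passing to PDO together with DB_USER and DB_PASSWORD.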
Security
Never commit .env files to version control. Add .env to your .gitignore file.
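One way to apply the advice above is a small idempotent helper that adds the entry only when it is missing. This is a minimal sketch; the function name ensure_env_ignored is an assumption made for illustration.

```shell
# Append ".env" to a .gitignore file unless an identical line already exists.
# Takes the path to the .gitignore file (defaults to ./.gitignore).
ensure_env_ignored() {
    gitignore_path=${1:-.gitignore}
    # -x matches whole lines only, -F treats the pattern as a literal string.
    if ! grep -qxF '.env' "$gitignore_path" 2>/dev/null; then
        printf '.env\n' >> "$gitignore_path"
    fi
}
```

The helper only prevents future commits. If .env was committed earlier, it also has to be untracked (for example with git rm --cached .env) and any exposed secrets rotated.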
LLM Provider Settings¶
Choosing a Provider¶
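The chatbot supports two LLM providers for generating responses: OpenAI and Claude (Anthropic). The LLM_PROVIDER variable selects which one handles chat, for example LLM_PROVIDER=openai.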
OpenAI Configuration¶
Required for embeddings (always) and chat (when using OpenAI):
# Your OpenAI API key
OPENAI_API_KEY=sk-your-openai-api-key-here
# Chat model (used when LLM_PROVIDER=openai)
OPENAI_MODEL=gpt-4
Available models:
| Model | Description |
|---|---|
| gpt-4 | Most capable, best for complex questions |
| gpt-4-turbo | Faster, more cost-effective |
| gpt-3.5-turbo | Fastest, lowest cost |
Embeddings
Embeddings always use OpenAI's text-embedding-3-small model, regardless of which chat provider you choose. This ensures consistent vector dimensions (1536) across your knowledge base.
Claude (Anthropic) Configuration¶
Required when using Claude for chat:
# Your Anthropic API key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-api-key-here
# Claude model version
CLAUDE_MODEL=claude-sonnet-4-20250514
Available models:
| Model | Description |
|---|---|
| claude-sonnet-4-20250514 | Balanced performance and cost |
| claude-opus-4-20250514 | Most capable |
| claude-3-haiku-20240307 | Fastest, most economical |
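Because embeddings always go through OpenAI, the set of required API keys depends on the chat provider. The sketch below spells out that rule. It assumes Claude is selected with LLM_PROVIDER=claude, which this guide does not confirm (only the value openai appears in it), and the function name is hypothetical.

```shell
# Print the API key variables that must be set for a given LLM_PROVIDER value.
# ASSUMPTION: "claude" is the provider value for Anthropic; only "openai"
# is shown in the guide itself.
required_api_keys() {
    case $1 in
        openai)
            # OpenAI handles both embeddings and chat.
            echo OPENAI_API_KEY
            ;;
        claude)
            # Embeddings still use OpenAI, chat uses Anthropic.
            printf '%s\n' OPENAI_API_KEY ANTHROPIC_API_KEY
            ;;
        *)
            echo "unknown provider: $1" >&2
            return 1
            ;;
    esac
}
```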
Docling Service Settings¶
Configure the document processing service:
# URL where Docling service is running
DOCLING_SERVICE_URL=http://localhost:8001
# Optional API key for authentication
DOCLING_API_KEY=
# Request timeout in seconds (increase for large files)
DOCLING_TIMEOUT=300
# Number of retry attempts on failure
DOCLING_MAX_RETRIES=3
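A quick sanity check of these values before starting the application might look like the sketch below. The helper names are hypothetical, and the probe assumes the service answers a plain GET on its base URL; the real service may expose a dedicated health endpoint or require DOCLING_API_KEY.

```shell
# Succeed if the argument is a non-negative integer (digits only).
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Validate the numeric Docling settings; prints one line per problem found.
# Usage: check_docling_numbers "$DOCLING_TIMEOUT" "$DOCLING_MAX_RETRIES"
check_docling_numbers() {
    check_status=0
    is_uint "$1" || { echo "DOCLING_TIMEOUT must be a whole number of seconds"; check_status=1; }
    is_uint "$2" || { echo "DOCLING_MAX_RETRIES must be a whole number"; check_status=1; }
    return $check_status
}

# Probe the service URL for reachability using the configured limits.
# ASSUMPTION: any HTTP response from the base URL counts as "reachable".
# Usage: probe_docling "$DOCLING_SERVICE_URL" "$DOCLING_TIMEOUT" "$DOCLING_MAX_RETRIES"
probe_docling() {
    curl --silent --output /dev/null --max-time "$2" --retry "$3" "$1"
}
```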
Document Processing Options¶
# Output format: markdown, json, or text
DOCLING_DEFAULT_FORMAT=markdown
# Enable OCR for scanned documents
DOCLING_ENABLE_OCR=true
# Extract tables from documents
DOCLING_ENABLE_TABLES=true
# Analyze document layout/structure
DOCLING_ENABLE_LAYOUT=true
# Detect mathematical formulas
DOCLING_ENABLE_MATH=true
# OCR language (en, de, fr, es, zh, etc.)
DOCLING_OCR_LANGUAGE=en
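Typos in these options are easy to miss, so a pre-flight check can help. The sketch below assumes the application accepts only the lowercase literals true and false and the three formats listed above; the function names are hypothetical.

```shell
# Succeed if the value is one of the documented output formats.
is_valid_format() {
    case $1 in
        markdown|json|text) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if the value is exactly "true" or "false".
# ASSUMPTION: other spellings such as "1", "yes", or "TRUE" are not accepted.
is_bool() {
    case $1 in
        true|false) return 0 ;;
        *) return 1 ;;
    esac
}
```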
Upload Settings¶
Configure file upload behavior:
# Maximum upload size in megabytes
MAX_UPLOAD_SIZE_MB=100
# Optional API key to protect upload endpoint
# Leave empty to allow public uploads
UPLOAD_API_KEY=
When UPLOAD_API_KEY is set, all upload requests must include the header:
Debug Settings¶
Production Warning
Debug features should be disabled in production as they expose sensitive information including API requests and responses.
# Enable API call logging (true/false)
API_DEBUG_LOGGING_ENABLED=false
# Directory for debug logs
API_DEBUG_LOG_PATH=/var/www/chatbot/logs/api_debug
# Credentials for debug interface
API_DEBUG_USERNAME=admin
API_DEBUG_PASSWORD=your_secure_password
# Optional: Restrict debug access to specific IPs
# Comma-separated list, supports CIDR notation
API_DEBUG_IP_WHITELIST=192.168.1.0/24,10.0.0.1
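To illustrate how a comma-separated list with CIDR entries can be matched, here is a minimal IPv4-only sketch. It is not the application's implementation; the function names are hypothetical, and treating an empty list as "no restriction" is an assumption based on the setting being optional.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell, so the IFS change does not leak to the caller.
# Octets are assumed to have no leading zeros.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 falls inside $2, where $2 is either a single
# address (treated as /32) or a network in CIDR notation.
ip_in_cidr() (
    net=${2%/*}
    case $2 in
        */*) bits=${2#*/} ;;
        *)   bits=32 ;;
    esac
    # Build the netmask, e.g. /24 -> 0xFFFFFF00.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

# Succeed if address $1 matches any entry of the comma-separated list $2.
ip_allowed() (
    # ASSUMPTION: an empty whitelist means access is not restricted by IP.
    [ -z "$2" ] && exit 0
    IFS=,
    for entry in $2; do
        ip_in_cidr "$1" "$entry" && exit 0
    done
    exit 1
)
```

With the example value above, ip_allowed 192.168.1.57 "$API_DEBUG_IP_WHITELIST" succeeds because the address lies in 192.168.1.0/24, while 10.0.0.2 is rejected because only the single host 10.0.0.1 is listed.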
Complete Example¶
Here's a complete .env file for production:
# ===========================================
# Database Configuration
# ===========================================
DB_HOST=localhost
DB_PORT=5432
DB_NAME=ragdb
DB_USER=raguser
DB_PASSWORD=super_secure_password_123
# ===========================================
# LLM Provider
# ===========================================
LLM_PROVIDER=openai
# OpenAI (required for embeddings, optional for chat)
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-4
# Claude/Anthropic (optional, for chat only)
# ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxx
# CLAUDE_MODEL=claude-sonnet-4-20250514
# ===========================================
# Docling Service
# ===========================================
DOCLING_SERVICE_URL=http://localhost:8001
DOCLING_API_KEY=
DOCLING_TIMEOUT=300
DOCLING_MAX_RETRIES=3
DOCLING_DEFAULT_FORMAT=markdown
DOCLING_ENABLE_OCR=true
DOCLING_ENABLE_TABLES=true
DOCLING_ENABLE_LAYOUT=true
DOCLING_ENABLE_MATH=true
DOCLING_OCR_LANGUAGE=en
# ===========================================
# Upload Settings
# ===========================================
MAX_UPLOAD_SIZE_MB=100
UPLOAD_API_KEY=
# ===========================================
# Debug (disable in production!)
# ===========================================
API_DEBUG_LOGGING_ENABLED=false
API_DEBUG_LOG_PATH=/var/www/chatbot/logs/api_debug
API_DEBUG_USERNAME=admin
API_DEBUG_PASSWORD=debug_password_123
API_DEBUG_IP_WHITELIST=
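Before deploying a file like the one above, it can be useful to confirm that the mandatory variables actually have values. The following sketch does a plain text check; the function name is hypothetical, and which variables count as mandatory is left to the caller.

```shell
# Print every variable name from the argument list that has no non-empty,
# uncommented assignment in the given .env file.
# Returns 1 if at least one name is missing, 0 otherwise.
missing_env_vars() {
    env_file=$1
    shift
    any_missing=0
    for var_name in "$@"; do
        # "NAME=" must start the line and be followed by at least one character.
        if ! grep -q "^${var_name}=." "$env_file"; then
            echo "$var_name"
            any_missing=1
        fi
    done
    return $any_missing
}

# Example call:
# missing_env_vars .env DB_HOST DB_NAME DB_USER DB_PASSWORD OPENAI_API_KEY
```

Values are not interpreted, so a quoted empty string such as KEY="" would still count as set.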
Validating Configuration¶
Test your configuration:
cd /var/www/chatbot
# Test database connection
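php -r "
require 'vendor/autoload.php';
// Load .env from the current directory into \$_ENV
\$dotenv = Dotenv\Dotenv::createImmutable(__DIR__);
\$dotenv->load();
// Include the port so a non-default DB_PORT is honored
\$dsn = 'pgsql:host=' . \$_ENV['DB_HOST'] . ';port=' . \$_ENV['DB_PORT'] . ';dbname=' . \$_ENV['DB_NAME'];
\$pdo = new PDO(\$dsn, \$_ENV['DB_USER'], \$_ENV['DB_PASSWORD']);
echo 'Database: OK', PHP_EOL;
"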
# Test OpenAI API key
curl https://api.openai.com/v1/models \
-H "Authorization: Bearer $(grep '^OPENAI_API_KEY=' .env | cut -d= -f2-)"
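When several checks need values from .env, a small reader function avoids repeating the grep pipeline. The sketch below is one possible approach with a hypothetical function name; it skips commented-out lines, keeps everything after the first "=", and does not strip surrounding quotes.

```shell
# Print the value of a key from a .env file (default: ./.env).
# Only lines that start with "KEY=" are considered; the first match wins.
env_get() {
    env_key=$1
    env_path=${2:-.env}
    grep "^${env_key}=" "$env_path" | head -n 1 | cut -d= -f2-
}

# Example: print only the HTTP status of the OpenAI check.
# curl -s -o /dev/null -w '%{http_code}\n' https://api.openai.com/v1/models \
#   -H "Authorization: Bearer $(env_get OPENAI_API_KEY)"
```

In the commented example, a status of 200 would indicate that the key was accepted, and 401 that it was rejected.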
Environment-Specific Configurations¶
Development¶
Staging¶
Production¶
Updating Configuration¶
After changing the .env file:
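- No restart required - PHP reads the file on each request
- Clear opcache (if enabled) - php -r "opcache_reset();" (run from the command line, this resets only the CLI's own cache; under PHP-FPM or mod_php, reload the service instead)
- Test the changes - Make a test API request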
Next Steps¶
- Deploy the Docling service for document processing
- Learn about the chat interface
- Review security settings