Configuration

The RAG Chatbot is configured through environment variables in the .env file. This guide covers all available settings.

Environment File Setup

Copy the example configuration:

cd /var/www/chatbot
cp .env.example .env
chmod 600 .env  # Protect sensitive data

Edit the configuration:

nano .env
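Editors and deployment tools sometimes rewrite a file with different permissions, so it can be worth re-checking the mode after editing. The following is a minimal sketch; the function name `check_env_perms` is hypothetical and not part of the chatbot.

```shell
# Report whether a .env file still has mode 600 (owner read/write only).
check_env_perms() {
  file=${1:-.env}
  # GNU stat (Linux) uses -c %a; BSD/macOS stat uses -f %Lp.
  mode=$(stat -c %a "$file" 2>/dev/null || stat -f %Lp "$file") || return 2
  if [ "$mode" = "600" ]; then
    echo "$file: permissions OK ($mode)"
  else
    echo "$file: permissions are $mode, expected 600" >&2
    return 1
  fi
}

# Usage (from /var/www/chatbot):
#   check_env_perms .env
```

With mode 600 only the file's owner can read it, so this assumes the .env file is owned by the user that PHP runs as.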

Database Settings

Configure the PostgreSQL connection:

# Database host (localhost for same server)
DB_HOST=localhost

# PostgreSQL port (default: 5432)
DB_PORT=5432

# Database name
DB_NAME=ragdb

# Database user
DB_USER=raguser

# Database password (use a strong password!)
DB_PASSWORD=your_secure_password
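Shell scripts that need one of these values, for example to run psql by hand, can read it straight from the .env file. This is a rough sketch, assuming simple KEY=value lines; `env_get` is a hypothetical helper, not something the chatbot ships.

```shell
# Print the value of KEY from a .env file (last assignment wins).
env_get() {
  key=$1
  file=${2:-.env}
  # Anchoring at ^ skips commented-out lines; -f2- keeps any "=" in the value.
  value=$(grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)
  # Strip one pair of surrounding double or single quotes, if present.
  case $value in
    \"*\") value=${value#\"}; value=${value%\"} ;;
    \'*\') value=${value#\'}; value=${value%\'} ;;
  esac
  printf '%s\n' "$value"
}

# Usage: open a connection with the configured credentials.
#   PGPASSWORD=$(env_get DB_PASSWORD) psql \
#     -h "$(env_get DB_HOST)" -p "$(env_get DB_PORT)" \
#     -U "$(env_get DB_USER)" -d "$(env_get DB_NAME)" -c 'SELECT 1;'
```

The helper does not handle inline comments or multi-line values, so keep such lines out of the keys you read this way.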

Security

Never commit .env files to version control. Add .env to your .gitignore file.
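One way to apply this without creating duplicate entries is sketched below. The function name `ensure_env_ignored` is hypothetical; the repository path is whatever directory holds your checkout.

```shell
# Add ".env" to a repository's .gitignore unless it is already listed.
ensure_env_ignored() {
  repo_dir=${1:-.}
  ignore_file="$repo_dir/.gitignore"
  touch "$ignore_file"
  # -x matches the whole line, -F treats ".env" literally, -q stays silent.
  if ! grep -qxF '.env' "$ignore_file"; then
    # Start a new line first if the file does not end with a newline.
    if [ -s "$ignore_file" ] && [ -n "$(tail -c 1 "$ignore_file")" ]; then
      printf '\n' >> "$ignore_file"
    fi
    printf '.env\n' >> "$ignore_file"
  fi
}

# Usage:
#   ensure_env_ignored /var/www/chatbot
```

If .env was committed earlier, ignoring it is not enough: `git rm --cached .env` stops tracking it, and any secrets it contained should be rotated.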

LLM Provider Settings

Choosing a Provider

The chatbot supports two LLM providers for generating responses:

# Options: "openai" or "claude"
LLM_PROVIDER=openai

OpenAI Configuration

Required for embeddings (always) and chat (when using OpenAI):

# Your OpenAI API key
OPENAI_API_KEY=sk-your-openai-api-key-here

# Chat model (used when LLM_PROVIDER=openai)
OPENAI_MODEL=gpt-4

Available models:

| Model | Description |
| --- | --- |
| gpt-4 | Most capable, best for complex questions |
| gpt-4-turbo | Faster, more cost-effective |
| gpt-3.5-turbo | Fastest, lowest cost |

Embeddings

Embeddings always use OpenAI's text-embedding-3-small model, regardless of which chat provider you choose. This ensures consistent vector dimensions (1536) across your knowledge base.

Claude (Anthropic) Configuration

Required when using Claude for chat:

# Your Anthropic API key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-api-key-here

# Claude model version
CLAUDE_MODEL=claude-sonnet-4-20250514

Available models:

| Model | Description |
| --- | --- |
| claude-sonnet-4-20250514 | Balanced performance and cost |
| claude-opus-4-20250514 | Most capable |
| claude-3-haiku-20240307 | Fastest, most economical |
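The provider rules above (OpenAI key always needed for embeddings, Anthropic key needed only for Claude chat) can be checked before deploying. This is a sketch under the assumption that values are unquoted; `env_get` and `check_llm_config` are hypothetical helpers.

```shell
# Print the value of KEY from a .env file (last assignment wins).
env_get() { grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-; }

# Verify that the keys required by the selected provider are set.
check_llm_config() {
  file=${1:-.env}
  provider=$(env_get LLM_PROVIDER "$file")
  # Embeddings always use OpenAI, so this key is needed for either provider.
  if [ -z "$(env_get OPENAI_API_KEY "$file")" ]; then
    echo "OPENAI_API_KEY is required (used for embeddings)" >&2
    return 1
  fi
  case $provider in
    openai) ;;
    claude)
      if [ -z "$(env_get ANTHROPIC_API_KEY "$file")" ]; then
        echo "ANTHROPIC_API_KEY is required when LLM_PROVIDER=claude" >&2
        return 1
      fi ;;
    *)
      echo "LLM_PROVIDER must be openai or claude, got \"$provider\"" >&2
      return 1 ;;
  esac
  echo "LLM configuration looks consistent (provider: $provider)"
}

# Usage:
#   check_llm_config /var/www/chatbot/.env
```

The check only confirms that the keys are present; it does not contact either API to confirm they are valid.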

Docling Service Settings

Configure the document processing service:

# URL where Docling service is running
DOCLING_SERVICE_URL=http://localhost:8001

# Optional API key for authentication
DOCLING_API_KEY=

# Request timeout in seconds (increase for large files)
DOCLING_TIMEOUT=300

# Number of retry attempts on failure
DOCLING_MAX_RETRIES=3

Document Processing Options

# Output format: markdown, json, or text
DOCLING_DEFAULT_FORMAT=markdown

# Enable OCR for scanned documents
DOCLING_ENABLE_OCR=true

# Extract tables from documents
DOCLING_ENABLE_TABLES=true

# Analyze document layout/structure
DOCLING_ENABLE_LAYOUT=true

# Detect mathematical formulas
DOCLING_ENABLE_MATH=true

# OCR language (en, de, fr, es, zh, etc.)
DOCLING_OCR_LANGUAGE=en
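Typos in these values (for example `yes` instead of `true`) are easy to miss. The sketch below checks the value shapes documented above. It assumes every key is set explicitly and that the flags accept only the literal strings true and false, which the application may not strictly require. `env_get` and `check_docling_config` are hypothetical names.

```shell
# Print the value of KEY from a .env file (last assignment wins).
env_get() { grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-; }

# Check that the Docling settings have the documented value shapes.
check_docling_config() {
  file=${1:-.env}
  status=0
  for key in DOCLING_ENABLE_OCR DOCLING_ENABLE_TABLES \
             DOCLING_ENABLE_LAYOUT DOCLING_ENABLE_MATH; do
    value=$(env_get "$key" "$file")
    case $value in
      true|false) ;;
      *) echo "$key should be true or false, got \"$value\"" >&2; status=1 ;;
    esac
  done
  for key in DOCLING_TIMEOUT DOCLING_MAX_RETRIES; do
    value=$(env_get "$key" "$file")
    case $value in
      # Empty, or containing anything other than digits.
      ''|*[!0-9]*) echo "$key should be a whole number, got \"$value\"" >&2; status=1 ;;
    esac
  done
  value=$(env_get DOCLING_DEFAULT_FORMAT "$file")
  case $value in
    markdown|json|text) ;;
    *) echo "DOCLING_DEFAULT_FORMAT should be markdown, json, or text, got \"$value\"" >&2; status=1 ;;
  esac
  return $status
}

# Usage:
#   check_docling_config /var/www/chatbot/.env
```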

Upload Settings

Configure file upload behavior:

# Maximum upload size in megabytes
MAX_UPLOAD_SIZE_MB=100

# Optional API key to protect upload endpoint
# Leave empty to allow public uploads
UPLOAD_API_KEY=

When UPLOAD_API_KEY is set, all upload requests must include the header:

X-API-Key: your-upload-api-key
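The upload key should be long and random. One possible way to generate it with standard tools is sketched below; `generate_api_key` is a hypothetical helper, and the URL and form field in the commented request are placeholders, since the upload endpoint is not specified here.

```shell
# Print a 64-character hexadecimal key built from 32 random bytes.
generate_api_key() {
  # od prints the bytes as hex pairs; tr removes the spaces and newlines.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Usage: put the generated key in .env as UPLOAD_API_KEY.
#   generate_api_key
#
# A client then sends it with each upload (placeholder URL and field name):
#   curl -H "X-API-Key: $UPLOAD_API_KEY" -F "file=@document.pdf" \
#     https://chatbot.example.com/upload
```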

Debug Settings

Production Warning

Debug features should be disabled in production as they expose sensitive information including API requests and responses.

# Enable API call logging (true/false)
API_DEBUG_LOGGING_ENABLED=false

# Directory for debug logs
API_DEBUG_LOG_PATH=/var/www/chatbot/logs/api_debug

# Credentials for debug interface
API_DEBUG_USERNAME=admin
API_DEBUG_PASSWORD=your_secure_password

# Optional: Restrict debug access to specific IPs
# Comma-separated list, supports CIDR notation
API_DEBUG_IP_WHITELIST=192.168.1.0/24,10.0.0.1
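To see which addresses a whitelist value would cover, the CIDR arithmetic can be reproduced in a few lines of shell. This illustrates IPv4 CIDR matching in general and is not the chatbot's own implementation; all function names are hypothetical. It returns "no match" for an empty list, whereas how the application treats an empty whitelist is not stated here.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if IP falls inside ENTRY (a single address or a CIDR range).
ip_in_cidr() {
  ip=$1; cidr=$2
  case $cidr in
    */*) net=${cidr%/*}; bits=${cidr#*/} ;;
    *)   net=$cidr; bits=32 ;;   # a bare address is treated as a /32
  esac
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    # 4294967295 is 0xFFFFFFFF; keep only the top $bits bits.
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  fi
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Succeed if IP matches any entry of a comma-separated whitelist.
ip_in_whitelist() {
  ip=$1; list=$2
  old_ifs=$IFS; IFS=,
  for entry in $list; do
    IFS=$old_ifs
    if ip_in_cidr "$ip" "$entry"; then return 0; fi
  done
  IFS=$old_ifs
  return 1
}

# Usage:
#   ip_in_whitelist 192.168.1.77 "192.168.1.0/24,10.0.0.1" && echo allowed
```

With the example value above, every address from 192.168.1.0 to 192.168.1.255 matches, plus the single host 10.0.0.1.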

Complete Example

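Here's a complete .env file for production. Replace the placeholder passwords and API keys with your own values, and set UPLOAD_API_KEY unless uploads should be public: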

# ===========================================
# Database Configuration
# ===========================================
DB_HOST=localhost
DB_PORT=5432
DB_NAME=ragdb
DB_USER=raguser
DB_PASSWORD=super_secure_password_123

# ===========================================
# LLM Provider
# ===========================================
LLM_PROVIDER=openai

# OpenAI (required for embeddings, optional for chat)
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-4

# Claude/Anthropic (optional, for chat only)
# ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxx
# CLAUDE_MODEL=claude-sonnet-4-20250514

# ===========================================
# Docling Service
# ===========================================
DOCLING_SERVICE_URL=http://localhost:8001
DOCLING_API_KEY=
DOCLING_TIMEOUT=300
DOCLING_MAX_RETRIES=3
DOCLING_DEFAULT_FORMAT=markdown
DOCLING_ENABLE_OCR=true
DOCLING_ENABLE_TABLES=true
DOCLING_ENABLE_LAYOUT=true
DOCLING_ENABLE_MATH=true
DOCLING_OCR_LANGUAGE=en

# ===========================================
# Upload Settings
# ===========================================
MAX_UPLOAD_SIZE_MB=100
UPLOAD_API_KEY=

# ===========================================
# Debug (disable in production!)
# ===========================================
API_DEBUG_LOGGING_ENABLED=false
API_DEBUG_LOG_PATH=/var/www/chatbot/logs/api_debug
API_DEBUG_USERNAME=admin
API_DEBUG_PASSWORD=debug_password_123
API_DEBUG_IP_WHITELIST=

Validating Configuration

Test your configuration:

cd /var/www/chatbot

# Test database connection
php -r "
require 'vendor/autoload.php';
\$dotenv = Dotenv\Dotenv::createImmutable(__DIR__);
\$dotenv->load();
\$dsn = 'pgsql:host=' . \$_ENV['DB_HOST'] . ';port=' . \$_ENV['DB_PORT'] . ';dbname=' . \$_ENV['DB_NAME'];
\$pdo = new PDO(\$dsn, \$_ENV['DB_USER'], \$_ENV['DB_PASSWORD']);
echo 'Database: OK' . PHP_EOL;
"

# Test OpenAI API key
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $(grep '^OPENAI_API_KEY=' .env | cut -d= -f2-)"

Environment-Specific Configurations

Development

API_DEBUG_LOGGING_ENABLED=true
OPENAI_MODEL=gpt-3.5-turbo  # Lower cost for testing

Staging

API_DEBUG_LOGGING_ENABLED=true
API_DEBUG_IP_WHITELIST=10.0.0.0/8
OPENAI_MODEL=gpt-4

Production

API_DEBUG_LOGGING_ENABLED=false
UPLOAD_API_KEY=randomly_generated_key
OPENAI_MODEL=gpt-4
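Before a production deploy, the two production-specific settings above can be checked automatically. This is a minimal sketch; `env_get` and `check_production_env` are hypothetical helpers, and the check assumes unquoted values.

```shell
# Print the value of KEY from a .env file (last assignment wins).
env_get() { grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-; }

# Fail if the file still carries development or staging settings.
check_production_env() {
  file=${1:-.env}
  status=0
  if [ "$(env_get API_DEBUG_LOGGING_ENABLED "$file")" != "false" ]; then
    echo "API_DEBUG_LOGGING_ENABLED should be false in production" >&2
    status=1
  fi
  if [ -z "$(env_get UPLOAD_API_KEY "$file")" ]; then
    echo "UPLOAD_API_KEY is empty, so uploads are public" >&2
    status=1
  fi
  return $status
}

# Usage, for example as a step in a deploy script:
#   check_production_env /var/www/chatbot/.env || exit 1
```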

Updating Configuration

After changing the .env file:

  1. No restart required - PHP reads the file on each request
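  2. Clear opcache (if enabled) - run from the command line, php -r "opcache_reset();" resets only the CLI cache; to clear the cache used by web requests, reload PHP-FPM or the web server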
  3. Test the changes - Make a test API request

Next Steps

  1. Deploy the Docling service for document processing
  2. Learn about the chat interface
  3. Review security settings