Skip to content

Security & Rate Limiting

This guide covers security features, rate limiting, and hardening recommendations for the RAG Chatbot.

Security Overview

The RAG Chatbot includes multiple security layers:

Layer Protection
Tenant Isolation Separate databases and configurations per tenant
Input Validation XSS, SQL injection, prompt injection
File Upload Path traversal, malicious files, code injection
URL Validation SSRF attacks, internal network access
Rate Limiting DoS attacks, brute force
Authentication API keys, HTTP Basic Auth

Multi-Tenant Security

Data Isolation

Each tenant has complete data isolation:

  • Separate Databases - Each tenant's data is stored in its own PostgreSQL database
  • Separate Configurations - Each tenant has its own .env file with unique API keys
  • No Cross-Tenant Access - Requests can only access data for the specified tenant

API Key Management

Each tenant has unique API keys:

Key Purpose
ADMIN_API_KEY Admin dashboard and management endpoints
CHAT_API_KEY Chat endpoints (if authentication required)
UPLOAD_API_KEY Upload and media-info endpoint protection (required)

Generate secure keys:

openssl rand -base64 24

Regenerate keys for a tenant:

php cli/tenant.php regenerate-keys my-tenant

IP Blocklist

The system maintains a shared IP blocklist for tenant resolution abuse:

  • After 10 failed attempts to resolve a tenant, the IP is blocked for 15 minutes
  • Blocklist data is stored in /var/www/chatbot/data/ip_blocklist/
  • This prevents brute-force tenant enumeration

Tenant Configuration Security

Protect tenant .env files:

# Set proper ownership (web server user)
sudo chown apache:apache /var/www/chatbot/tenants/*/
sudo chown apache:apache /var/www/chatbot/tenants/*/.env

# Restrict permissions
sudo chmod 750 /var/www/chatbot/tenants/*/
sudo chmod 640 /var/www/chatbot/tenants/*/.env

Rate Limiting

How It Works

Rate limiting uses a sliding window algorithm:

  1. Each request is timestamped
  2. Old timestamps (outside the window) are pruned
  3. If remaining requests > 0, the request proceeds
  4. Otherwise, HTTP 429 is returned with Retry-After header

Endpoint Limits

Endpoint Limit Window Identifier
/chat 30 requests 60 seconds Session ID + IP
/customer-chat 30 requests 60 seconds Session ID + IP
/upload 30 requests 60 seconds IP address
/media-info 30 requests 60 seconds IP address
/widget-config 60 requests 60 seconds IP address
/admin/* 120 requests 60 seconds IP address
/api-debug/logs 5 auth attempts 60 seconds IP address
CORS mismatch 10 requests 5 minutes IP address

Response Headers

All rate-limited endpoints return these headers:

X-RateLimit-Limit: 30
X-RateLimit-Remaining: 25
X-RateLimit-Reset: 1706889600

When the limit is exceeded:

HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json

{"error": "Rate limit exceeded", "retry_after": 45}

Customizing Limits

To adjust rate limits, modify the constants in the controller files:

Chat endpoint (src/Http/ChatController.php):

private const MAX_REQUESTS_PER_MINUTE = 30;
private const RATE_LIMIT_WINDOW = 60;
private const MAX_MESSAGE_LENGTH = 10000;

Upload endpoint (src/Http/UploadController.php):

private const MAX_REQUESTS_PER_MINUTE = 30;
private const RATE_LIMIT_WINDOW = 60;

Authentication

Admin API Key

The admin dashboard and management endpoints require the ADMIN_API_KEY:

  1. Set the key in your tenant's .env:
ADMIN_API_KEY=your-random-admin-key
  1. Include the key in requests:
curl -X GET "https://your-domain.com/admin/documents" \
  -H "X-Tenant-ID: my-tenant" \
  -H "X-API-Key: your-random-admin-key"

Upload API Key

The upload and media-info endpoints require the UPLOAD_API_KEY:

  1. Set the key in your tenant's .env:
UPLOAD_API_KEY=your-random-secret-key
  1. Include the key in requests:
curl -X POST "https://your-domain.com/upload?tenant=my-tenant" \
  -H "X-API-Key: your-random-secret-key" \
  -F "file=@document.pdf"

Generate a secure key:

openssl rand -base64 24

Required

If UPLOAD_API_KEY is not set, both /upload and /media-info return 401 Unauthorized.

Debug Endpoint Authentication

The API debug endpoint uses HTTP Basic Authentication:

  1. Configure in your tenant's .env:
API_DEBUG_USERNAME=admin
API_DEBUG_PASSWORD=very-secure-password
  1. Access with credentials:
curl -u admin:very-secure-password \
  https://your-domain.com/api-debug/logs

IP Whitelist

Restrict debug access to specific IPs:

# Single IP
API_DEBUG_IP_WHITELIST=192.168.1.100

# Multiple IPs
API_DEBUG_IP_WHITELIST=192.168.1.100,10.0.0.50

# CIDR notation
API_DEBUG_IP_WHITELIST=192.168.1.0/24,10.0.0.0/8

Lockout Protection

After 10 failed authentication attempts, the IP is locked out for 5 minutes:

{
  "error": "Too many failed attempts. Please try again later.",
  "retry_after": 300
}

CORS Origin Whitelist

Cross-origin requests are controlled via the ALLOWED_ORIGINS environment variable:

# Comma-separated list of allowed origins
ALLOWED_ORIGINS=https://your-site.com,https://app.your-site.com

Behavior:

  • Only origins listed in ALLOWED_ORIGINS receive CORS headers
  • If ALLOWED_ORIGINS is empty, no Access-Control-Allow-Origin header is sent (blocks cross-origin)
  • IPs that repeatedly send unrecognized origins are rate-limited (10 per 5 minutes)
  • After exceeding the limit, the IP receives 403 Forbidden
  • Access-Control-Max-Age is set to 3600 seconds (1 hour)

Tenant Prompt Security

Tenants can customize the LLM system prompt via the admin dashboard or API. Security measures prevent abuse:

Architecture

The system prompt is composed of two parts:

  1. Security prompt (hardcoded, non-overridable) - Enforces rules like treating context as data, not following injected instructions
  2. Tenant prompt (customizable) - Controls LLM behavior and personality, appended after security rules with a clear delimiter

Validation

When saving a tenant prompt:

  • Max length: 10,000 characters
  • Injection detection: Prompts containing patterns like "ignore previous instructions", "system prompt:", "you are now", etc. are rejected
  • Content sanitization: The ContentSanitizer class scans for known injection patterns

Input Validation

Message Sanitization

User messages are sanitized to prevent prompt injection:

  • Suspicious patterns are wrapped in [TEXT: ...] markers
  • LLM is instructed to treat context as data, not commands
  • Context is separated with clear delimiters

Detected patterns include:

  • "Ignore previous instructions"
  • "System prompt" references
  • Role-playing attempts ("You are now...")
  • Code injection attempts

File Upload Validation

Uploaded files are validated for:

Check Protection
Extension whitelist Block executable files
Magic byte verification Detect disguised files
PHP code detection Prevent code execution
Path traversal Block ../ in filenames
Null byte injection Block %00 attacks
Size limits Prevent DoS via large files

File Size Limit Implementation:

The maximum upload size is controlled by the MAX_UPLOAD_SIZE_MB environment variable (default: 100 MB).

  • Configuration: Set in tenant's .env file as MAX_UPLOAD_SIZE_MB=100
  • Read in: public/upload:615 - converts MB to bytes
  • Enforced by: src/Services/FileUploadValidator.php:88 - rejects oversized files
// upload.php - Line 615
$maxSize = (int) ($_ENV['MAX_UPLOAD_SIZE_MB'] ?? 100) * 1024 * 1024;

// FileUploadValidator.php - Line 88
if ($file['size'] > $maxSizeBytes) {
    return ['valid' => false, 'error' => 'File size exceeds limit'];
}

PHP Configuration

Also ensure your PHP settings allow large uploads:

upload_max_filesize = 100M
post_max_size = 105M

Blocked extensions:

php, phtml, php3, php4, php5, php7, phar
exe, bat, cmd, sh, bash, ps1
js, vbs, wsf, hta

URL Validation (SSRF Protection)

Media URLs are validated to prevent SSRF attacks:

Check Protection
Protocol whitelist Only HTTP/HTTPS allowed
Private IP blocking No 10.x, 172.16.x, 192.168.x
Localhost blocking No 127.x or localhost
Cloud metadata blocking No 169.254.169.254
DNS rebinding protection Verify resolved IP
Port restrictions Block sensitive ports

Security Headers

All responses include security headers:

X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin

The X-Powered-By header is removed to hide PHP version. Security headers are set early in request processing (before tenant resolution) so they appear on all responses, including error paths.

Adding More Headers

For additional protection, add headers in Apache:

# In httpd.conf or .htaccess
Header always set Content-Security-Policy "default-src 'self'"
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
Header always set Permissions-Policy "geolocation=(), microphone=(), camera=()"

Or in Nginx:

add_header Content-Security-Policy "default-src 'self'";
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";
add_header Permissions-Policy "geolocation=(), microphone=(), camera=()";

Conversation Limits

To prevent DoS through resource exhaustion:

Limit Value Purpose
Message length 10,000 chars Prevent oversized requests
Total messages 500 per session Limit database growth
Summary length 8,000 chars Cap memory usage

When limits are reached, users are prompted to start a new conversation.


Production Hardening

Disable Debug Features

In production, always disable:

API_DEBUG_LOGGING_ENABLED=false

Use HTTPS

Always use TLS in production:

# Install certbot
sudo dnf install certbot python3-certbot-apache

# Obtain certificate
sudo certbot --apache -d your-domain.com

# Auto-renewal
sudo systemctl enable certbot-renew.timer

Database Security

  1. Use strong passwords - Generate with openssl rand -base64 32
  2. Restrict access - Only allow connections from web server
  3. Encrypt connections - Enable SSL in PostgreSQL
# In postgresql.conf
ssl = on
ssl_cert_file = '/path/to/server.crt'
ssl_key_file = '/path/to/server.key'

File Permissions

# Secure tenant .env files
chmod 640 /var/www/chatbot/tenants/*/.env
chown apache:apache /var/www/chatbot/tenants/*/.env

# Secure tenant directories
chmod 750 /var/www/chatbot/tenants/*/
chown apache:apache /var/www/chatbot/tenants/*/

# Restrict log directory
chmod 750 /var/www/chatbot/logs
chown apache:apache /var/www/chatbot/logs

# Make application files read-only
find /var/www/chatbot -type f -exec chmod 644 {} \;
find /var/www/chatbot -type d -exec chmod 755 {} \;

SELinux (RHEL/AlmaLinux)

Keep SELinux enabled and configure properly:

# Allow network connections
setsebool -P httpd_can_network_connect 1
setsebool -P httpd_can_network_connect_db 1

# Set proper contexts
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/chatbot/logs(/.*)?"
restorecon -Rv /var/www/chatbot/logs

Firewall Rules

Only expose necessary ports:

# AlmaLinux/RHEL (firewalld)
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-service=https
sudo firewall-cmd --reload

# Ubuntu (ufw)
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

# CSF (ConfigServer Firewall)
# Edit /etc/csf/csf.conf and ensure 80,443 are in TCP_IN:
sudo nano /etc/csf/csf.conf
#   TCP_IN = "20,21,22,25,53,80,110,143,443,465,587,993,995"

# Restart CSF to apply changes
sudo csf -r

# Verify ports are open
sudo csf -p

Monitoring & Logging

Application Logs

Errors are logged to:

  • PHP error log: /var/log/php-fpm/www-error.log
  • Apache error log: /var/log/httpd/chatbot-error.log
  • Application logs: /var/www/chatbot/logs/

Security Monitoring

Monitor for:

  1. Rate limit triggers - Excessive 429 responses
  2. Failed auth attempts - Watch API debug auth logs
  3. Unusual file uploads - Large files or unusual types
  4. Error spikes - May indicate attack attempts

Log Rotation

Configure logrotate for application logs:

# /etc/logrotate.d/chatbot
/var/www/chatbot/logs/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    create 644 apache apache
}

Security Checklist

Before going to production:

  • [ ] HTTPS enabled with valid certificate
  • [ ] Strong database passwords set (ragdb + chatdb)
  • [ ] API_DEBUG_LOGGING_ENABLED=false
  • [ ] UPLOAD_API_KEY set (required for upload and media-info)
  • [ ] ALLOWED_ORIGINS configured for CORS whitelist
  • [ ] ADMIN_API_KEY and CHAT_API_KEY set with strong values
  • [ ] ADMIN_USERNAME and ADMIN_PASSWORD set for dashboard
  • [ ] SELinux enabled and configured
  • [ ] Firewall configured
  • [ ] File permissions restricted
  • [ ] Tenant .env files secured (chmod 640)
  • [ ] Security headers configured
  • [ ] Log rotation configured
  • [ ] Monitoring in place

Reporting Vulnerabilities

If you discover a security vulnerability:

  1. Do not open a public issue
  2. Email security@your-organization.com
  3. Include steps to reproduce
  4. Allow time for a fix before disclosure