KB-6ABC
recovery-runbook.md
Revision 1
VPS Recovery Runbook
Last updated: 2026-02-28 | Status: Active
Overview
This runbook covers recovery procedures for the INCOMEX VPS infrastructure (38.242.240.89). All services run as Docker containers managed with Docker Compose.
Service Architecture
| Service | Container | Port | Health Check |
|---|---|---|---|
| MySQL 8.0 | incomex-mysql | 3306 (internal) | mysqladmin ping |
| Qdrant | incomex-qdrant | 6333 (internal) | TCP check |
| Directus 11.5 | incomex-directus | 8055 (internal) | /server/info |
| Agent Data | incomex-agent-data | 8000 (internal) | /info |
| Nuxt SSR | incomex-nuxt | 3000 (internal) | TCP 3000 |
| Nginx | incomex-nginx | 80, 443 | N/A |
| Uptime Kuma | uptime-kuma | 3001 | Built-in |
Scenario 1: Single Container Crash
Symptoms: One service down, others healthy
Recovery:
ssh -i ~/.ssh/contabo_vps root@38.242.240.89
cd /opt/incomex/docker
docker compose up -d <service-name>
docker ps --format "table {{.Names}}\t{{.Status}}"
Verification: Run /opt/incomex/scripts/test-mcp-connectivity.sh
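When the failed service is not obvious from alerts, comparing the expected container list against what is actually running narrows it down. The following is a minimal sketch, not one of the existing scripts: the container names are taken from the Service Architecture table, while `EXPECTED` and `missing_containers` are illustrative names.

```shell
# Containers listed in the Service Architecture table.
EXPECTED="incomex-mysql incomex-qdrant incomex-directus incomex-agent-data incomex-nuxt incomex-nginx uptime-kuma"

# Print each expected container that `docker ps` does not report as running.
missing_containers() {
  running=$(docker ps --format '{{.Names}}') || return 1
  for name in $EXPECTED; do
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
}
```

Calling `missing_containers` prints one container name per line. Note that `docker compose up -d` takes Compose service names (for example `agent-data`), which may differ from the container names printed here.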
Scenario 2: Full Stack Down
Symptoms: All services unreachable, VPS accessible via SSH
Recovery:
ssh -i ~/.ssh/contabo_vps root@38.242.240.89
cd /opt/incomex/docker
docker compose down
docker compose up -d
sleep 90 # Wait for all health checks
docker ps --format "table {{.Names}}\t{{.Status}}"
/opt/incomex/scripts/test-mcp-connectivity.sh
Post-recovery checks:
- MCP connectivity: /opt/incomex/scripts/test-mcp-connectivity.sh
- Config integrity: /opt/incomex/scripts/check-config-integrity.sh
- Event system listeners >= 1
- Uptime Kuma dashboard: http://38.242.240.89:3001
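The fixed `sleep 90` assumes every health check passes within 90 seconds. A polling loop is one alternative. The sketch below is an example under assumptions rather than an existing script: `wait_healthy` is a hypothetical helper, and it treats a container without a health check (such as Nginx) as ready once its state is `running`.

```shell
# Wait until a container is healthy (or running, if it has no health check).
# Usage: wait_healthy <container> [timeout_seconds] [poll_interval_seconds]
wait_healthy() {
  name=$1
  timeout=${2:-90}
  interval=${3:-5}
  elapsed=0
  status=unknown
  # Prints the health status when a health check exists, else the plain state.
  fmt='{{if .State.Health}}{{.State.Health.Status}}{{else}}{{.State.Status}}{{end}}'
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(docker inspect --format "$fmt" "$name" 2>/dev/null) || status=missing
    case $status in
      healthy|running) return 0 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $name (last status: $status)" >&2
  return 1
}
```

For example, `wait_healthy incomex-directus 90` returns as soon as Directus reports healthy and fails after 90 seconds otherwise. Recent Docker Compose v2 releases also provide `docker compose up --wait`, which may make a helper like this unnecessary.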
Scenario 3: Agent Data Deployment
Standard deploy after PR merge:
ssh -i ~/.ssh/contabo_vps root@38.242.240.89
cd /opt/incomex/docker
docker compose pull agent-data
docker compose up -d agent-data
sleep 45 # Startup + warm-up
curl -sf https://vps.incomexsaigoncorp.vn/api/health
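The fixed `sleep 45` can be replaced by retrying the health request until it succeeds. This is a sketch only: `retry` is a hypothetical helper, not something the runbook's scripts are known to provide.

```shell
# Run a command until it succeeds, at most <attempts> times,
# sleeping <delay> seconds between tries.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    # Give up once the last allowed attempt has failed.
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 9 5 curl -sf https://vps.incomexsaigoncorp.vn/api/health` tries the health endpoint up to nine times, five seconds apart, and stops at the first success.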
Rollback (if new image fails):
docker images --format "{{.Repository}}:{{.Tag}} {{.CreatedAt}}" | grep agent-data
# Pin the previous image tag in a docker-compose override file, then run: docker compose up -d agent-data
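One way to apply the rollback is a Compose override file that pins the service to the previous image. The sketch below only renders the file contents. `render_override` is an illustrative name, the image reference has to come from the `docker images` output, and the sketch assumes the service is named `agent-data` in docker-compose.yml, as the deploy commands suggest.

```shell
# Print a Compose override that pins the agent-data service to one image.
# $1 = full image reference (repository:tag) chosen from `docker images`.
render_override() {
  [ -n "${1:-}" ] || { echo "usage: render_override <repository:tag>" >&2; return 1; }
  cat <<EOF
services:
  agent-data:
    image: $1
EOF
}
```

Redirect the output to /opt/incomex/docker/docker-compose.override.yml and run `docker compose up -d agent-data`. Docker Compose reads docker-compose.override.yml automatically when it sits next to docker-compose.yml and no `-f` flag is given. Remove the override once a fixed image is available.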
Scenario 4: Database Recovery
MySQL (Directus data)
Backups at /opt/incomex/backups/mysql/ (daily 2AM, 7-day retention)
docker exec -i incomex-mysql mysql -u root -p<password> <db> < backup-file.sql
docker restart incomex-directus
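Choosing the backup file can be scripted. This sketch assumes the backup directory holds plain `.sql` dumps, which the `backup-file.sql` placeholder suggests but the runbook does not confirm; `latest_backup` is a hypothetical helper.

```shell
# Print the most recently modified, non-empty .sql file in directory $1.
# Assumes plain .sql dumps with no newlines in their names.
latest_backup() {
  newest=$(ls -t "$1"/*.sql 2>/dev/null | head -n 1) || newest=""
  if [ -z "$newest" ] || [ ! -s "$newest" ]; then
    echo "no usable .sql backup in $1" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}
```

The result can feed the restore command, for example `backup=$(latest_backup /opt/incomex/backups/mysql)` followed by the `docker exec` line with `< "$backup"`. If the dumps turn out to be compressed, the glob and the restore pipeline would need adjusting.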
Qdrant (Vector DB)
Backups at /opt/incomex/backups/qdrant/ (daily 3AM, 7-day retention)
docker compose stop qdrant
cp -r /opt/incomex/backups/qdrant/latest/* /opt/incomex/docker/qdrant/data/
docker compose up -d qdrant
curl -X POST https://vps.incomexsaigoncorp.vn/api/kb/reindex -H "X-API-Key: $API_KEY"
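Copying a backup over the existing data directory with `cp -r` leaves any files that are not in the backup in place. A more conservative variant, sketched here under assumptions, moves the current directory aside first. `restore_dir` is a hypothetical helper, it has not been validated against Qdrant's on-disk layout, and it should only run while the container is stopped.

```shell
# Replace directory $2 with the contents of backup directory $1.
# The previous contents are kept next to it as <dest>.pre-restore-<timestamp>.
restore_dir() {
  src=$1
  dest=$2
  stamp=$(date +%Y%m%d-%H%M%S)
  [ -d "$src" ] || { echo "backup not found: $src" >&2; return 1; }
  if [ -d "$dest" ]; then
    mv "$dest" "$dest.pre-restore-$stamp" || return 1
  fi
  # "src/." copies the directory contents, including hidden files.
  mkdir -p "$dest" && cp -a "$src"/. "$dest"/
}
```

In the sequence above it would stand in for the `cp -r` line: `restore_dir /opt/incomex/backups/qdrant/latest /opt/incomex/docker/qdrant/data`. Keeping the old directory temporarily doubles the disk space used, so delete it once the restore is verified.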
Scenario 5: SSL Certificate Issue
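certbot certificates # List certificates with their expiry dates and paths
certbot renew --force-renewal # Renews even if not near expiry; subject to Let's Encrypt rate limits
docker restart incomex-nginx # Reload Nginx so it serves the renewed certificate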
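Forcing a renewal is only needed when the certificate is close to expiry or broken. The check can be sketched with `openssl x509 -checkend`. `cert_valid_for_days` is an illustrative helper, and the certificate path must be taken from the `certbot certificates` output, since this runbook does not state it.

```shell
# Succeed if the PEM certificate in file $1 stays valid for at least $2 more days.
# A missing or unreadable file also counts as "not valid".
cert_valid_for_days() {
  openssl x509 -noout -checkend "$(( $2 * 86400 ))" -in "$1" >/dev/null 2>&1
}
```

For example, `cert_valid_for_days <certificate-path> 30 || certbot renew` attempts a renewal only when fewer than 30 days remain.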
Scenario 6: Disk Full
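df -h / # Check usage of the root filesystem
/opt/incomex/scripts/disk-monitor.sh # Auto-prunes at 85%
docker system prune -f # Remove stopped containers, unused networks, dangling images, and unused build cache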
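The 85% threshold can also be checked by hand before pruning. The sketch below is not the contents of disk-monitor.sh, which this runbook does not show; `parse_usage_pct` and `over_threshold` are illustrative names.

```shell
# Read `df -P` output on stdin and print the use percentage of the first
# filesystem row as a bare number, without the % sign.
parse_usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage ($1) is at or above the threshold ($2), both in percent.
over_threshold() {
  [ "$1" -ge "$2" ]
}
```

For example, `pct=$(df -P / | parse_usage_pct)` followed by `over_threshold "$pct" 85 && docker system prune -f` prunes only when the root filesystem is at least 85% full. The `-P` flag keeps each filesystem on a single line so the column positions stay stable.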
Scenario 7: MCP Transport Failure
Diagnosis: /opt/incomex/scripts/test-mcp-connectivity.sh
Common fixes:
- Nginx config corrupt: docker restart incomex-nginx
- Agent Data unresponsive: docker restart incomex-agent-data (wait 45s)
- DNS issue: dig vps.incomexsaigoncorp.vn
- API key mismatch: grep AGENT_DATA_API_KEY /opt/incomex/docker/.env
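For the API key mismatch case, printing the key with `grep` exposes it in the terminal. A comparison that only reports match or mismatch is sketched below. `env_value` and `key_matches` are hypothetical helpers, and the sketch assumes unquoted `KEY=value` lines in .env.

```shell
# Print the value of variable $1 from env file $2 (last definition wins).
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Succeed when the key in $1 equals AGENT_DATA_API_KEY in env file $2.
key_matches() {
  [ "$1" = "$(env_value AGENT_DATA_API_KEY "$2")" ]
}
```

For example, `key_matches "$API_KEY" /opt/incomex/docker/.env && echo match || echo mismatch` compares the key a client is using with the one configured on the server without echoing either value.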
Scenario 8: Directus Sync Not Working
Diagnosis: Check the event system listener count via the Agent Data /info endpoint
Fix (if listeners=0):
- Verify DIRECTUS_ADMIN_TOKEN and DIRECTUS_URL in docker-compose.yml
- Verify .env has DIRECTUS_ADMIN_TOKEN value
- Recreate: docker compose up -d agent-data
Key Files
| Path | Purpose |
|---|---|
| /opt/incomex/docker/docker-compose.yml | Stack definition |
| /opt/incomex/docker/.env | Environment variables (secrets) |
| /opt/incomex/scripts/ | Monitoring and backup scripts |
| /opt/incomex/backups/ | MySQL and Qdrant backups |
| /opt/incomex/.checksums/ | Config integrity baselines |
| /opt/incomex/.uptime-kuma-admin-pass | Uptime Kuma admin password |
Monitoring
- Uptime Kuma: 4 monitors (MCP, Agent Data, Directus, OPS Proxy) at http://38.242.240.89:3001
- Cron MCP test: Every 5 min -> /var/log/mcp-health.log
- Cron config check: Hourly -> /var/log/config-integrity.log
- Backups: MySQL daily 2AM, Qdrant daily 3AM, 7-day retention
- Docker logs: max 50MB x 3 files per container