Let's Talk Login

KB-66FE

Qdrant Investigation Report 2026-03-31

2 min read Revision 1

reportqdrantinvestigationgpt-actionsdegraded

Qdrant Investigation Report — 2026-03-31

Mission: S135H — Investigate Qdrant degraded + Runtime action path Status: INVESTIGATION COMPLETE — NO FIXES APPLIED

1. QDRANT STATUS: HEALTHY

Container: Up 13 days, CPU 0.27%, RAM 103.7MiB/1GiB (10%)
Collection: production_documents, 1172 points
Local latency: 0.37ms. No errors in logs. Disk 52%.

2. AGENT DATA HEALTH: DEGRADED (STALE CACHE)

Health shows degraded, Qdrant Connection error, latency 753.4ms
THIS IS CACHED from startup probe 2026-03-29. NOT realtime.
Direct Python test from container: OK, 473.8ms
Docker network test: OK, 2.85ms

3. SEARCH TESTS: WORKING BUT SLOW

/chat via HTTPS: 5/5 OK, qdrant_hits=5, but latency 8-27s
Latency breakdown: embedding (OpenAI) + vector search + TLS overhead
GPT Actions timeout ~30s, total latency ~20-27s = near timeout

4. ROOT CAUSE

Health degraded = stale cached value from startup, not current state
GPT intermittent because /chat latency (8-27s) near GPT 30s timeout
MCP stable because: local-to-cloud failover + no hard timeout
No circuit breaker: when Qdrant slow, retries 3x15s = 45s worst case

5. ENDPOINT TABLE

/chat: Qdrant YES (vector search), PG fallback, GPT uses
/documents GET: Qdrant optional (related docs), PG YES, GPT uses
/kb/list, /kb/get: PG only, GPT uses
/health: Probe only, not used by GPT

6. CODE CONFIG

Qdrant timeout: 15s hardcoded (vector_store.py:137)
Retry: 3x exponential 1-4s (resilient_client.py:36-38)
Health cache TTL: 30s, no periodic re-probe
Circuit breaker: NONE

7. RECOMMENDATIONS (NOT APPLIED)

Periodic health re-probe (background task, not just startup)
Reduce Qdrant timeout 15s to 3-5s for local Docker
Add circuit breaker: skip Qdrant when DOWN, use PG fallback immediately
Cache embeddings to reduce /chat total latency
Auto noop_qdrant when health degraded for faster PG-only responses