KB-27E3 rev 6

Data Sync Architecture v3.0

architecture, data-sync, directus, agent-data, flows, s126

Data Sync Architecture

v4.0 | 2026-03-17 S128-C: Updated after a full audit. TD-219✅ TD-223✅ TD-220✅. Registry sync 7/7. Vector orphan=0. Knowledge Sync INACTIVE by design. search_knowledge("data sync architecture") Overall architecture: search_knowledge("vision 3 streams matrix")


Overview

The system uses 5 types of data flow between Directus CMS, the Agent Data KB, and the codebase. Each collection syncs in EXACTLY ONE DIRECTION to avoid circular loops.

Total Directus Flows: ~127 (123 active, 4 inactive). Total PG Triggers: ~85.


Type 1: Agent Data → Directus (Knowledge Documents)

  • Collection: knowledge_documents
  • Direction: Agent Data is the SSOT
  • Mechanism: directus_sync.py (prefix filter knowledge/)
  • Status: ✅ ACTIVE
  • Note: The dot-knowledge-sync-agentdata tool fails on its token (TD-221, P3). directus_sync.py still works.
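The prefix filter mentioned for directus_sync.py can be pictured roughly as follows. This is only a sketch under assumptions: the `AgentDoc` type and the `select_for_directus` function are hypothetical names, and the real script's internals are not shown in this document.

```python
from dataclasses import dataclass

# The "knowledge/" prefix comes from the Mechanism bullet above.
KNOWLEDGE_PREFIX = "knowledge/"


@dataclass(frozen=True)
class AgentDoc:
    """Hypothetical shape of an Agent Data document (illustration only)."""
    doc_id: str
    content: str


def select_for_directus(docs: list[AgentDoc], prefix: str = KNOWLEDGE_PREFIX) -> list[AgentDoc]:
    """Keep only documents whose ID starts with the sync prefix."""
    return [d for d in docs if d.doc_id.startswith(prefix)]
```

With a filter like this, documents under operations/ or registries/ would never reach the knowledge_documents collection.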

Type 2: Directus → Agent Data (Tasks + Comments)

  • Collections: tasks, task_comments
  • Direction: Directus is the SSOT
  • Mechanism: 6 Directus Flows [DOT] (create/update/delete × 2 collections)
  • Status: ✅ ACTIVE — Verified ≤15s (S127-D: 6/6 operations pass)
  • Token: Flows use {{$env.AGENT_DATA_API_KEY}}, an env var that does not expire (S127-D confirmed)
  • 15 orphan comments → ✅ Cleaned up in S127 (TD-219)
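The "6 flows" figure is simply 3 events × 2 collections. A minimal sketch of that matrix, plus a check against the 15-second bound, might look like this. The function names and the shape of the latency dictionary are assumptions for illustration, not the actual verification script.

```python
from itertools import product

COLLECTIONS = ("tasks", "task_comments")
EVENTS = ("create", "update", "delete")
MAX_SYNC_SECONDS = 15  # taken from the "Verified ≤15s" status line


def expected_flows() -> list[tuple[str, str]]:
    """One (collection, event) pair per [DOT] flow."""
    return list(product(COLLECTIONS, EVENTS))


def all_within_limit(latencies: dict[tuple[str, str], float]) -> bool:
    """True only if every expected flow reported a latency within the limit.

    A flow with no measurement counts as a failure.
    """
    return all(
        latencies.get(flow, float("inf")) <= MAX_SYNC_SECONDS
        for flow in expected_flows()
    )
```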

Type 3: Directus → Agent Data (Registry Sync)

  • Collections: table_registry, workflows, workflow_steps, WCR, workflow_categories, workflow_step_relations, meta_catalog (7 collections)
  • Direction: Directus is the SSOT
  • Mechanism: 22 Directus Flows [DOT-REG] (S128-C verified: 22 active, 0 inactive)
  • Status: ✅ ACTIVE — 7/7 ALL MATCH (S127-E: 99 records synced, 3 orphans cleaned)
  • Missing initial sync → ✅ Synced in S127-E (TD-223)
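A "7/7 ALL MATCH" result implies some per-collection comparison between Directus and Agent Data. The sketch below assumes the comparison is by record count, which this document does not state. Function names are hypothetical.

```python
def compare_counts(source: dict[str, int], target: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {collection: (source_count, target_count)} for every mismatch.

    A collection missing on one side is treated as having 0 records there.
    """
    mismatches: dict[str, tuple[int, int]] = {}
    for name in sorted(set(source) | set(target)):
        s, t = source.get(name, 0), target.get(name, 0)
        if s != t:
            mismatches[name] = (s, t)
    return mismatches


def match_summary(source: dict[str, int], target: dict[str, int]) -> str:
    """Format the result as 'matched/total', e.g. '7/7'."""
    total = len(set(source) | set(target))
    matched = total - len(compare_counts(source, target))
    return f"{matched}/{total}"
```

Any collection returned by `compare_counts` would point at missing initial sync or leftover orphans on the target side.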

Type 4: Auto-ID + Counting (PG Triggers = SSOT)

  • Counting: 100% PG triggers (21 trg_count_*). PG is the sole SSOT for record_count + active_count.
  • Auto-code: PG triggers (19+ trg_auto_code_*) assign PREFIX-NNN.
  • Nuxt refresh: ONLY refreshes record_count for each individual collection. It does NOT compute CAT-ALL (S128-A fix: the PG trigger fn_refresh_virtual_summaries is the SSOT for CAT-ALL).
  • Lesson from S128-A: multiple update paths = conflicts. Pick one SSOT (PG); every other path is READ-ONLY.
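To make the PREFIX-NNN format concrete, here is a sketch of how a next code could be derived. The real logic lives in the trg_auto_code_* PG triggers and is not shown here. The "highest existing number + 1" policy and the 3-digit zero padding are assumptions.

```python
import re

# Matches codes such as "WF-012"; non-numeric suffixes are ignored.
CODE_RE = re.compile(r"^(?P<prefix>[A-Z]+)-(?P<num>\d+)$")


def next_code(prefix: str, existing: list[str], width: int = 3) -> str:
    """Return the next PREFIX-NNN code after the highest one already in use."""
    highest = 0
    for code in existing:
        m = CODE_RE.match(code)
        if m and m.group("prefix") == prefix:
            highest = max(highest, int(m.group("num")))
    return f"{prefix}-{highest + 1:0{width}d}"
```

Doing this in a single database trigger, rather than in application code as sketched, is what keeps PG the only writer and avoids the competing update paths described in the S128-A lesson.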

Type 5: Agent Data Only (Operations/Sessions/Reports)

  • Direction: Agent Data only
  • Mechanism: Direct upload via MCP tools / API
  • Status: ✅ ACTIVE
  • Patterns: knowledge/*, operations/*, registries/*

Vector Integrity (S128-C)

| Metric | Value |
| --- | --- |
| Documents | ~526 |
| Vectors | ~727 |
| Ghost docs | 2 (knowledge/dev, knowledge/dev/blueprints: folder nodes, not a bug) |
| Orphan vectors | 0 (S127-B fix) |
| Pipeline CRUD→Vector | 3/3 verified (create/update/delete) |
| Qdrant status | OK (HC 10b monitor) |
| cleanup-orphans | Batch optimized (S127-C) |
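The orphan and ghost metrics can be read as two set differences. The sketch below works under two assumptions: an "orphan vector" is a vector whose parent document no longer exists, and a "ghost doc" is a document with no vectors at all. The second reading is an interpretation, not something this document defines. All names are hypothetical.

```python
def find_orphan_vectors(document_ids: set[str], vector_doc_ids: list[str]) -> list[str]:
    """Parent IDs referenced by vectors but absent from the document store."""
    return sorted({v for v in vector_doc_ids if v not in document_ids})


def find_ghost_docs(document_ids: set[str], vector_doc_ids: list[str]) -> list[str]:
    """Documents that have no vector at all (e.g. folder nodes)."""
    return sorted(document_ids - set(vector_doc_ids))
```

`vector_doc_ids` is a list rather than a set because one document may map to several vectors, which would also be consistent with the vector count exceeding the document count.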

Anti-Circular Sync Principles

  1. One direction per collection: Each collection syncs in exactly ONE direction
  2. No reverse flows: If A→B exists, B→A must NOT exist for same collection
  3. Source of truth: Source side owns data; target is read-only
  4. Knowledge Sync/Delete flows: ✅ INACTIVE BY DESIGN (S128-B confirmed as anti-circular; consensus of 3 agents in the S126 audit)
  5. Origin marker: registries/* prefix prevents Agent Data → Directus reverse fire
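Principles 1, 2, and 5 lend themselves to a mechanical check. The following is a sketch only: the route tuples, system names, and function names are illustrative and do not reflect the project's actual configuration format.

```python
from collections import defaultdict

# (collection, source_system, target_system); illustrative shape only.
SyncRoute = tuple[str, str, str]


def find_circular(routes: list[SyncRoute]) -> list[str]:
    """Return collections that sync in more than one direction."""
    directions: dict[str, set[tuple[str, str]]] = defaultdict(set)
    for collection, source, target in routes:
        directions[collection].add((source, target))
    return sorted(c for c, d in directions.items() if len(d) > 1)


def should_skip_reverse_sync(doc_id: str, origin_prefix: str = "registries/") -> bool:
    """Origin marker: documents under this prefix came from Directus,
    so they must not be pushed back to it."""
    return doc_id.startswith(origin_prefix)
```

An empty result from `find_circular` means every collection has exactly one direction. A reverse flow for the same collection shows up as a second direction and is reported.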

Statistics (S128-C)

| Type | Count | Status |
| --- | --- | --- |
| [DOT] Task/Comment sync | 6 | ✅ Active |
| [DOT-REG] Registry sync | 22 | ✅ Active (S128-C verified) |
| [AUTO-ID] Auto-ID assign | ~13 | ✅ Active |
| Knowledge Sync/Delete | 2 | ❌ INACTIVE (by design) |
| PG Triggers counting | 21 | ✅ Active |
| PG Triggers auto_code | 19+ | ✅ Active |
| PG Triggers validate_origin | 21 | ✅ Active |
| PG Triggers label_assign | 20+ | ✅ Active |
| Total Directus Flows | ~123 active | |
| Total PG Triggers | ~85 | |

CI/CD Deploy (S127-E)

| Repo | Trigger | Mechanism |
| --- | --- | --- |
| web-test | Merge to main (any file) | GH Actions → SSH → VPS docker compose |
| agent-data-test | Merge to main (agent_data/ + dot/ + Dockerfile + requirements) | GH Actions → rsync → VPS docker build |

S127-E fix: Added dot/** to the paths filter → DOT tools now auto-deploy.
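The effect of the paths filter can be approximated locally with the sketch below. Only dot/** is quoted in this document. The other glob strings are assumptions derived from the trigger description, and Python's `fnmatchcase` only approximates GitHub Actions glob semantics.

```python
from fnmatch import fnmatchcase

# Only "dot/**" appears verbatim in the source; the rest are assumed globs.
PATH_FILTERS = ["agent_data/**", "dot/**", "Dockerfile", "requirements*"]


def triggers_deploy(changed_files: list[str], filters: list[str] = PATH_FILTERS) -> bool:
    """True if any changed file matches any path filter."""
    return any(fnmatchcase(path, pattern) for path in changed_files for pattern in filters)
```

Before the S127-E fix, a commit touching only files under dot/ would have matched nothing in a list like this, which is consistent with DOT tools not being deployed automatically.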