KB-4F36

Dieu 37 P1: Health Performance Fix Report

6 min read Revision 1

Dieu 37 P1: Health Performance Fix Report

Date: 2026-04-02 Script: dot/bin/dot-collection-health Version: v2.0.0 -> v2.1.1 PRs: #662 (H14-H17 PG-only), #663 (Checks 1,2,11 bulk SQL)


Changes Summary

Check Before (v2.0.0) After (v2.1.1) Method
Check 1 (Tier-1 fields) Per-collection loop (133+ SSH queries) 1 bulk SQL CROSS JOIN + NOT EXISTS
Check 2 (Tier-2 fields) Per-collection loop 1 bulk SQL Same pattern
Check 9 (Group invalid) Hardcoded English list (WRONG) Cross-reference via collection_registry Dynamic subquery
Check 11 (sync fields) Per-collection loop (80+ SSH queries) 1 bulk SQL CROSS JOIN + NOT EXISTS
H14 (FK constraints) 133+ Directus API calls (curl) 2 SQL queries (directus_fields + information_schema) PG-only, 0 API
H15 (Permissions) N*M individual SQL (5 policies x 133 cols) 3 bulk INSERT...WHERE NOT EXISTS CTE with RETURNING
H16 (Species mapping) Per-collection loop 2 bulk queries INSERT...ON CONFLICT
H17 (No DELETE) Per-collection loop 1 SELECT + 1 bulk DELETE Retroactive all governed
Governance_role Not validated Unknown roles flagged governed/observed/excluded recognized

Key Design Decisions

  1. Assembly First (NT8+NT11): directus_fields, directus_permissions, directus_policies are PG tables. Read via SQL, never API.
  2. Luat Bao toan retroactive: H17 applies to ALL governed collections from collection_registry, not just 6 new tables.
  3. Check 9 fix: Old hardcoded list ('CMS','LOG','REGISTRY',...) was English; actual data is Vietnamese (cau_truc, cong_cu...). No SSOT table for atom_group exists. Fixed via cross-reference: atom_group is valid if its meta_catalog collection also exists in collection_registry.
  4. governance_role validation: 3 stable roles recognized (governed, observed, excluded). Any new role = governance law change = flagged + system_issues.

Production Verification Evidence (VPS 38.242.240.89)

V4: Full health < 60s

real	0m18.910s
user	0m2.020s
sys	0m4.289s

PASS - 18.9s (was 4m37s with v2.1.0, which was already improved from v2.0.0). 14.7x improvement.

V5: system_issues logged

SELECT id, title, issue_type, source, detected_at FROM system_issues
WHERE source='dot-collection-health' ORDER BY detected_at DESC LIMIT 5;

4053|DELETE permission found: pivot_definitions|collection_onboarding_gap|dot-collection-health|2026-04-02 04:44:17.959
4052|DELETE permission found: birth_registry|collection_onboarding_gap|dot-collection-health|2026-04-02 04:44:17.758
4051|DELETE permission found: entity_species|collection_onboarding_gap|dot-collection-health|2026-04-02 04:44:17.606
4050|DELETE permission found: entity_species|collection_onboarding_gap|dot-collection-health|2026-04-02 04:44:17.414
4049|DELETE permission found: entity_species|collection_onboarding_gap|dot-collection-health|2026-04-02 04:44:17.199

PASS - system_issues logged for DELETE removals, permissions gaps, species gaps.

V6: DELETE permissions on ALL governed = 0

SELECT count(*) FROM directus_permissions dp
WHERE dp.action='delete'
  AND dp.collection IN (
    SELECT collection_name FROM collection_registry
    WHERE governance_role='governed' AND status IS DISTINCT FROM 'retired'
  );

0

PASS - Zero DELETE permissions on governed collections.

V7: Test DELETE -> health auto-removes

-- Inserted test DELETE permission on dot_tools (id=1227)
INSERT INTO directus_permissions ... VALUES ('dot_tools','delete',...) RETURNING id;
1227|dot_tools|delete

-- Ran health -> output:
[WARN]   dot_tools: DELETE permission (policy: AI Agent Policy) — removing...
DELETE 1
[OK]     FIXED: removed 1 DELETE permissions
[WARN] CHECK 17: 1 governed collections had DELETE permissions (removed)

-- system_issues logged:
DELETE permission found: dot_tools|2026-04-02 04:49:35.703

PASS - Self-healing confirmed: auto-removed + logged.

V8: Idempotent (second run = 0 changes)

17 checks, 6 issue(s), 0 auto-fix(es), compliance 95.9%
[OK]   CHECK 14 PASS
[OK]   CHECK 15 PASS
[OK]   CHECK 17 PASS

PASS - 0 auto-fixes on second run. H14/H15/H17 all PASS.

V9: No hardcoded numbers

grep -n 'CHECKS=\|COUNT=\|TOTAL=' dot/bin/dot-collection-health
# Only =0 initializations and $((CHECKS+1)) increments

PASS - No magic numbers.


Known Issues (not in scope)

  • Check 3: 15 active in meta_catalog but missing in PG (pre-existing)
  • Check 4: 20 orphans in table_registry (pre-existing)
  • Check 8: 139 field_type conflicts (pre-existing)
  • Check 10: 142 NULL atom_group (pre-existing)
  • Check 16: 25 governed collections with NULL species_code (data gap, not fixable by script)

Anti-patterns Documented

  • AP: Directus API does NOT create PG FK constraints -> DOT must ALTER TABLE
  • AP: Script using Directus API when PG table has same data = violates NT8 Assembly First
  • AP: Verify must check positive (sufficient) + negative (forbidden)
  • AP: New governance rules must be retroactive for existing data
  • AP: Hardcoded English enum list broke when data migrated to Vietnamese