S175-VACCINE-01 — system_issues Dedupe + Partial UNIQUE
Date: 2026-04-09 Session: S175-VACCINE-01 Agent: Claude Code Mode: Edit allowed (DDL + DML, narrow scope: system_issues only)
Summary
Permanent root fix for public.system_issues: the table uses an "upsert by code"
pattern (no DB-level constraint) and had accumulated 2 OPEN duplicate pairs from
dieu31-runner. Applied the S175 pattern plus the reference from universal_rule_violations:
- Deduped the 2 OPEN orphan rows (set status='archived')
- Created a partial UNIQUE index on (violation_hash) WHERE status='open' AND violation_hash IS NOT NULL
- Deferred the writer refactor per §FIX GỐC (the constraint is the root fix; fail loud > fail silent)
End-to-end verified with 3 hardtest scenarios, including one real Đ31 cron run with 117 dedupe.js calls and 0 new OPEN duplicates created.
Files Modified
- DB: public.system_issues (2 rows updated, 1 index added)
- No code changes (writer dedupe.js left untouched)
Backup
/root/backup/s175v01-system_issues-20260409T104916Z.sql (375 KB, data-only dump)
P0 — Discovery
Schema findings
Table "public.system_issues"
- id (PK), code (Directus auto-gen ISS-XXXX, UNIQUE), violation_hash, status, ...
- CHECK constraint: status IN ('open','resolved','archived')
→ 'superseded' is NOT allowed; use 'archived'
- Existing indexes:
* system_issues_pkey (id)
* idx_system_issues_system_issues_code_unique (code) -- Directus auto-gen, không phải logical id
* idx_issues_open btree on status WHERE status='open' -- KHÔNG unique
- Triggers:
* trg_guard_system_issues_insert: requires source_system or source non-empty
* Other: birth_registry, auto_code, auto_label
2 OPEN duplicate pairs verified
id | violation_hash | code | first_seen_at | last_seen_at | occurrence_count
3896 | 44815e74c3d0f846 | ISS-2380 | 2026-03-28 11:00:10 | 2026-04-09 10:00:25 | 19 ← canonical (active)
3994 | 44815e74c3d0f846 | ISS-2478 | 2026-03-30 10:00:15 | 2026-03-30 10:00:15 | 1 ← orphan (never re-seen)
3884 | 9cc0eb3d7958f6cf | ISS-2368 | 2026-03-28 11:00:08 | 2026-04-09 10:00:17 | 19 ← canonical (active)
3999 | 9cc0eb3d7958f6cf | ISS-2483 | 2026-04-01 04:00:27 | 2026-04-01 04:00:27 | 1 ← orphan
Pattern: the older row in each pair (lower id, earlier first_seen) is the active one that dedupe.js keeps updating. The newer row was created during a temporary state in which findExistingIssue() failed to find the active one (probably a status flicker open→resolved→open caused by the autoResolveStale logic; see the followup section).
Reference pattern (universal_rule_violations)
CREATE UNIQUE INDEX uq_open_violation_hash
ON public.universal_rule_violations USING btree (violation_hash)
WHERE ((status)::text = 'open'::text)
Same DB, same dev team, same intent — proven pattern to copy.
Writer found: nuxt-repo/scripts/integrity/dedupe.js
Same upsert-by-code anti-pattern as original S175 directus_sync.py:
// Line 28
const url = `${DIRECTUS_URL}/items/system_issues?filter[violation_hash][_eq]=${violationHash}&filter[status][_eq]=open&limit=1`;
const resp = await fetch(url, ...);
return data?.data?.[0] || null;
// Line 123-141
const existing = await findExistingIssue(violationHash);
if (existing) {
await updateIssue(existing.id, ...); // PATCH
} else {
const issue = await createIssue({...}); // POST ← race window
}
No DB-level lock, no atomic transaction, no UNIQUE constraint catching the race. Used by S174-FIX-04 DOTs (env-contract-check, logrotate-config-check, rsyslog-health-check) and Đ31 runner main.js (PG-vs-Nuxt 117 measurement check).
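The check-then-act window above can be replayed with a small, hypothetical in-memory simulation (the store, latencies, and row shapes are invented; only the findExistingIssue/createIssue names mirror the real writer):

```javascript
// Hypothetical simulation of the dedupe.js check-then-act race.
// A plain array stands in for system_issues with no UNIQUE constraint.
const issues = [];

async function findExistingIssue(hash) {
  await new Promise(r => setTimeout(r, 10)); // simulated REST latency
  return issues.find(i => i.violation_hash === hash && i.status === 'open') || null;
}

async function createIssue(hash) {
  await new Promise(r => setTimeout(r, 10));
  const row = { id: issues.length + 1, violation_hash: hash, status: 'open' };
  issues.push(row);
  return row;
}

async function upsert(hash) {
  const existing = await findExistingIssue(hash); // check
  if (existing) return existing;                  // PATCH branch in the real writer
  return createIssue(hash);                       // POST branch: the race window
}

// Two writers overlap: both checks complete before either insert lands.
Promise.all([upsert('44815e74c3d0f846'), upsert('44815e74c3d0f846')])
  .then(() => console.log(issues.length)); // prints 2: duplicate OPEN rows
```

With the P2 partial UNIQUE index in place, the second INSERT is instead rejected with a duplicate-key error, which is exactly the fail-loud behavior this report relies on.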
P1 — Dedupe (atomic)
BEGIN;
UPDATE public.system_issues
SET status='archived',
resolution='S175v01: archived as duplicate of canonical id=3896',
resolved_at=NOW()
WHERE id = 3994;
UPDATE public.system_issues
SET status='archived',
resolution='S175v01: archived as duplicate of canonical id=3884',
resolved_at=NOW()
WHERE id = 3999;
SELECT violation_hash, COUNT(*) AS open_dup
FROM public.system_issues
WHERE status='open' AND violation_hash IS NOT NULL
GROUP BY violation_hash HAVING COUNT(*) > 1;
COMMIT;
Output:
BEGIN
UPDATE 1
UPDATE 1
violation_hash | open_dup
----------------+----------
(0 rows)
COMMIT
Post-commit verify
id | violation_hash | status | resolution | resolved_at
3884 | 9cc0eb3d7958f6cf | open | |
3896 | 44815e74c3d0f846 | open | |
3994 | 44815e74c3d0f846 | archived | S175v01: archived as duplicate of canonical id=3896 | 2026-04-09 10:57:01.228876
3999 | 9cc0eb3d7958f6cf | archived | S175v01: archived as duplicate of canonical id=3884 | 2026-04-09 10:57:01.228876
Canonical rows (3884, 3896) preserved as 'open'. Orphans (3994, 3999) archived with resolution note. 7 RESOLVED duplicate groups untouched (history preserved).
P2 — Partial UNIQUE Constraint
CREATE UNIQUE INDEX idx_sysissue_open_violation_hash_unique
ON public.system_issues USING btree (violation_hash)
WHERE ((status)::text = 'open'::text) AND violation_hash IS NOT NULL;
Output: CREATE INDEX
Verify index exists
indexname: idx_sysissue_open_violation_hash_unique
indexdef: CREATE UNIQUE INDEX idx_sysissue_open_violation_hash_unique
ON public.system_issues USING btree (violation_hash)
WHERE (((status)::text = 'open'::text) AND (violation_hash IS NOT NULL))
Reject test (BEGIN+ROLLBACK)
BEGIN;
INSERT INTO public.system_issues (
title, status, severity, source_system, issue_class,
violation_hash, business_logic_hash, first_seen_at, last_seen_at, occurrence_count
) VALUES (
'P2 TEST DUPLICATE', 'open', 'WARNING', 'dieu31-runner', 'sync_fault',
'44815e74c3d0f846', 'aaaaaaaaaaaaaaaa', NOW(), NOW(), 1
);
ROLLBACK;
Output:
BEGIN
ERROR: duplicate key value violates unique constraint "idx_sysissue_open_violation_hash_unique"
DETAIL: Key (violation_hash)=(44815e74c3d0f846) already exists.
ROLLBACK
PG correctly rejects. The constraint now enforces the 1-OPEN-row-per-violation_hash invariant.
P3 — Writer Refactor: SKIPPED per §FIX GỐC
Decision
dedupe.js writer refactor DEFERRED. Reasons:
- The DB partial UNIQUE IS the root fix. After P2, the dedupe.js race window hits the constraint at the POST step: the DB rejects and fails LOUD (HTTP 4xx plus a clear error message in container logs) instead of failing SILENT (a duplicate row created with nobody noticing).
- Fail loud > fail silent. The DOT scripts using dedupe.js will log a clear error line whenever the race fires:
  [ERROR] Create issue failed: HTTP 400 — duplicate key value violates unique constraint
  This is observable. The previous behavior was silent corruption.
- Refactor cost > benefit. dedupe.js is JavaScript using the Directus REST API. Refactoring to an atomic SQL transaction would require:
  - Adding the pg npm package to scripts/integrity/
  - Rewriting findExistingIssue/createIssue/updateIssue to use a direct PG connection
  - Managing transactions explicitly
  - ~100+ lines of new code
  The cost is high while the benefit is just "cleaner error handling".
- Optional minor improvement (deferred to followup): in the createIssue catch block, detect the HTTP 400 + UNIQUE violation pattern, then re-fetch the existing row and PATCH instead. ~10 lines changed, no new dependencies. Rationale: only do this if log noise from UNIQUE violations exceeds 5/day.
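If that followup ever fires, the catch could look roughly like this (a sketch only; createOrRecover, the api helpers, and the error shape are assumptions, not the real dedupe.js code, which this session did not modify):

```javascript
// Hypothetical sketch of the deferred ~10-line improvement: on a UNIQUE
// violation at create time, fall back to fetch-and-PATCH instead of failing.
async function createOrRecover(hash, payload, api) {
  try {
    return await api.createIssue(payload); // POST, may hit the P2 index
  } catch (err) {
    // The partial UNIQUE index surfaces the race as an HTTP 400 whose
    // message carries the PG duplicate-key text.
    if (err.status === 400 && /duplicate key value/.test(err.message)) {
      const winner = await api.findExistingIssue(hash);        // re-fetch race winner
      if (winner) return api.updateIssue(winner.id, payload);  // PATCH instead
    }
    throw err; // anything else still fails loud
  }
}

// Minimal mock API exercising the recovery path:
const api = {
  createIssue: async () => {
    const e = new Error('duplicate key value violates unique constraint');
    e.status = 400;
    throw e;
  },
  findExistingIssue: async () => ({ id: 3896 }),
  updateIssue: async (id) => ({ id, patched: true }),
};
createOrRecover('44815e74c3d0f846', {}, api).then(r => console.log(r.id)); // prints 3896
```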
Followup trigger
If an S175v01 followup is needed: monitor docker logs incomex-agent-data and the
container running dedupe.js for "Create issue failed: HTTP 400" patterns. If
>5/day, add the optional 10-line catch.
P4 — Hardtest: 3 Scenarios
K1: DOT rerun creates no new OPEN duplicates — PASS
Triggered Đ31 cron-integrity.sh which runs:
- env-contract-check.sh
- logrotate-config-check.sh
- rsyslog-health-check.sh
- main.js Đ31 PG-driven runner (129 measurements, 117 use dedupe.js for sync_fault)
=== Cron output (last lines) ===
═══════════════════════════════════════
PASS: 10 | FAIL: 117 | ERROR: 1
WATCHDOG: alive
Pass Rate: 7.9% (10/127)
Issues Created: 0 | Reopened: 117
Results logged to measurement_log (run_id: cron-20260409-125754)
═══════════════════════════════════════
=== Post-cron OPEN dup check ===
SELECT COUNT(*) AS open_dup_count FROM (
SELECT violation_hash FROM system_issues
WHERE status='open' AND violation_hash IS NOT NULL
GROUP BY violation_hash HAVING COUNT(*) > 1
) x;
→ open_dup_count = 0
=== Total system_issues by status ===
status | count
---------+-------
archived | 8 (was 6 + 2 from P1)
open | 402
resolved | 658
117 dedupe calls fired, none created new duplicates. Issues Created: 0 | Reopened: 117
shows all 117 dedupe.js calls correctly went through the UPDATE branch (existing found),
not the CREATE branch. No HTTP 400 in logs (no race triggered this run).
K2: INSERT duplicate OPEN hash → DB reject — PASS
BEGIN;
INSERT INTO public.system_issues (..., status='open', violation_hash='9cc0eb3d7958f6cf', ...);
ROLLBACK;
Output:
BEGIN
ERROR: duplicate key value violates unique constraint "idx_sysissue_open_violation_hash_unique"
DETAIL: Key (violation_hash)=(9cc0eb3d7958f6cf) already exists.
ROLLBACK
K3: INSERT same hash with status='resolved' → allowed — PASS
BEGIN;
INSERT INTO public.system_issues (..., status='resolved', violation_hash='44815e74c3d0f846', ...);
SELECT COUNT(*) AS open_dup_after_insert FROM ... HAVING COUNT(*) > 1;
SELECT id, status, violation_hash FROM system_issues WHERE violation_hash = '44815e74c3d0f846' ORDER BY id;
ROLLBACK;
Output:
BEGIN
INSERT 0 1 ← INSERT succeeded (not blocked)
open_dup_after_insert
-----------------------
0 ← still 0 OPEN dups
id | status | violation_hash
------+----------+------------------
3896 | open | 44815e74c3d0f846 ← canonical preserved
3994 | archived | 44815e74c3d0f846 ← P1 archive
4299 | resolved | 44815e74c3d0f846 ← K3 historical insert OK
ROLLBACK
The constraint correctly blocks only OPEN duplicates, not historical rows. Historical inserts work fine: multiple resolved/archived rows per hash are allowed.
Verification Checklist (4 evidence items)
- SELECT ... GROUP BY violation_hash HAVING COUNT(*) > 1 with status='open' → 0 ✓
- Index exists: idx_sysissue_open_violation_hash_unique confirmed in pg_indexes ✓
- Reject test: clear error duplicate key value violates unique constraint "idx_sysissue_open_violation_hash_unique" ✓
- Đ31 cron rerun: 117 dedupe calls, 0 new OPEN duplicates ✓
3 Lessons Learned
1. Fail loud > fail silent
A DB constraint is defense in depth: when you cannot fix the code, fix the schema. A constraint failure produces a clear HTTP 4xx error, observable in logs. The code race produced silent duplicate corruption that nobody noticed. The trade-off is clear: better a noisy error log than silently skewed data.
2. A reference pattern in the same DB is gold
universal_rule_violations.uq_open_violation_hash already existed in the same DB,
same dev team, same intent ("1 OPEN per violation_hash"). Copy the pattern as-is;
no need to redesign. This lowers risk and proves by precedent ("someone already
shipped this pattern and it runs in production, so it is safe").
3. A writer refactor is not always needed
§FIX GỐC says "solve the problem permanently". The DB constraint is the permanent root fix for the "upsert by code" race anti-pattern. Refactoring the writer is a fix-for-completeness, not a fix-for-correctness. If the constraint blocks the race, defer the refactor to reduce churn and avoid introducing new bugs. "Done" ≠ "comprehensive".
Known Limitations / Followups
Out of S175v01 scope (intentional)
- dedupe.js writer refactor — deferred per §FIX GỐC. Trigger a followup if HTTP 400 noise exceeds 5/day in container logs.
- 7 RESOLVED duplicate groups — untouched, history preserved:
- aa8df24153c26d4b × 4 (resolved)
- 1bbc0770d083c274 × 4 (resolved)
- 0653e4f1c9c519d6 × 4 (resolved)
- b5aa1abecdc9a51c × 3 (resolved)
- ee6d4f80782fefcc × 3 (resolved)
- 0259368a80e0eeba × 2 (resolved)
- bbb961536e55e0ac × 2 (resolved)
These are historical artifacts. Constraint doesn't block resolved duplicates. Cleanup is cosmetic only (remove older resolved row in each group). Defer.
- reconcile-knowledge.py bypass — from the S175 closure verify report. Still pending cleanup. Not in this scope.
- tasks.code — MED priority, from the vaccine audit. Defensive only; defer.
- README duplicate cleanup — from s175-readme-duplicates-followup.md. Defer.
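For the deferred cosmetic cleanup of the resolved groups listed above, a hypothetical helper (invented name, operating on exported rows rather than live SQL) could pick the older rows in each group to archive:

```javascript
// Hypothetical: given rows exported from system_issues, return the ids of
// all but the newest resolved row in each duplicate group. Sketch only;
// the actual cleanup is deferred and out of S175v01 scope.
function resolvedCleanupCandidates(rows) {
  const groups = {};
  for (const r of rows) {
    if (r.status !== 'resolved') continue;
    (groups[r.violation_hash] = groups[r.violation_hash] || []).push(r);
  }
  const drop = [];
  for (const g of Object.values(groups)) {
    if (g.length < 2) continue;               // singletons are fine
    g.sort((a, b) => b.id - a.id);            // newest (highest id) first
    drop.push(...g.slice(1).map(r => r.id));  // archive everything older
  }
  return drop.sort((a, b) => a - b);
}

const sample = [
  { id: 10, violation_hash: 'aa8df24153c26d4b', status: 'resolved' },
  { id: 11, violation_hash: 'aa8df24153c26d4b', status: 'resolved' },
  { id: 12, violation_hash: 'bbb961536e55e0ac', status: 'resolved' },
];
console.log(JSON.stringify(resolvedCleanupCandidates(sample))); // prints [10]
```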
autoResolveStale interaction note
The mechanism that probably created the original duplicates:
- Day N: dedupe.js creates row A (status=open, hash=H)
- Day N+1: violation temporarily missing → autoResolveStale resolves row A
- Day N+2: violation returns → findExistingIssue (status=open filter) doesn't find → creates row B (status=open)
- Day N+3: somehow row A re-opens → 2 OPEN rows with same hash
After P2, whichever write would produce the second concurrent OPEN row (the Day N+2 create if row A has already re-opened, or the Day N+3 re-open otherwise) fails with a UNIQUE violation. The dedupe.js code should still be audited for whether it re-opens resolved rows correctly (a future improvement).
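The suspected timeline can be replayed in miniature (hypothetical data; the open-only lookup mirrors the status='open' filter in findExistingIssue):

```javascript
// Hypothetical replay of the suspected autoResolveStale timeline.
const rows = [{ id: 1, hash: 'H', status: 'open' }];  // Day N: row A created
const findOpen = h => rows.find(r => r.hash === h && r.status === 'open');

rows[0].status = 'resolved';                          // Day N+1: autoResolveStale fires
if (!findOpen('H')) {                                 // Day N+2: open-only lookup misses A
  rows.push({ id: 2, hash: 'H', status: 'open' });    //          → row B created
}
rows[0].status = 'open';                              // Day N+3: row A re-opens

console.log(rows.filter(r => r.status === 'open').length); // prints 2: the duplicate pair
```

Under the P2 index, the Day N+2 insert alone is legal (row A is resolved at that moment); it is the write that produces the second concurrent OPEN row that gets rejected.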
Image / Container State
No code change. No image rebuild. No container restart. DB changes only:
- 2 row UPDATEs (P1)
- 1 INDEX CREATE (P2)
Backup: /root/backup/s175v01-system_issues-20260409T104916Z.sql
Rollback: docker exec -i postgres psql -U directus -d directus < <backup> then DROP INDEX idx_sysissue_open_violation_hash_unique