S175 Vaccine Schema Audit
S175 Vaccine — Schema Audit Report
Date: 2026-04-09 Mode: READ-ONLY Agent: Claude Code Goal: Find tables with same "upsert by code" anti-pattern as knowledge_documents
Summary
Audited 168 tables in public schema (Directus DB).
- 2 tables HIGH priority (have data + writer + missing UNIQUE on logical_id) — needs vaccine
- 5 tables MEDIUM priority (have logical_id col + missing UNIQUE, low data or no active writer)
- Rest LOW (proper UNIQUE in place or no logical_id)
The biggest finding: system_issues.violation_hash — used by S174-FIX-04 DOTs (env-contract-check, logrotate-config-check, rsyslog-health-check) and original S175 reportConfigError — already has 2 OPEN duplicate pairs from dieu31-runner. Same class of bug as the original S175 root cause but in a different table.
Inventory: Tables Audited
public schema, excluding directus_* system tables: 168 tables.
Tables with logical_id columns: 65 (path, slug, code, source_id, hash, key, file_path, dimension)
Tables WITH UNIQUE on logical_id (good): 35
agents, agent_views, approval_requests, binding_registry, block_library, checkpoint_sets, checkpoint_types, collection_groups, collection_registry, comments, derived_objects_registry, design_templates, dot_domains, dot_tools, entity_dependencies, entity_species, governance_registry, kg_thresholds, knowledge_documents (S175 fix), languages, meta_catalog, modules, normative_registry, pivot_definitions, system_issues (code only), table_proposals (only PK), taxonomy, taxonomy_facets, trigger_registry, ui_pages, universal_edges, universal_rule_violations, workflow_categories, workflow_change_requests, workflow_steps
Tables MISSING UNIQUE on logical_id: 9 candidates examined
| Table | Logical col | Has data? | Has writer? | Distinct/Total | Priority |
|---|---|---|---|---|---|
| system_issues | violation_hash | 165 (2 dup pairs OPEN) | YES (S174 DOTs) | 148/165 | HIGH |
| system_issues | business_logic_hash | 165 | YES | 12/165 | low (collision intentional) |
| categories | slug | 1 | unclear | 1/1 | LOW (1 row) |
| posts | slug | 2 | unclear | 2/2 | LOW |
| forms | key | 0 | unclear | 0/0 | LOW (empty) |
| tasks | code | 10 | YES (Directus Flow) | 10/10 | MED |
| workflow_change_requests | code | 3 | unclear | 3/3 | LOW |
| registry_changelog | code | 0 (15317 rows but code col mostly NULL) | unclear | 0/0 | LOW (audit log) |
| help_articles | slug | 3 | unclear | -/3 | LOW |
| help_collections | slug | -/- | unclear | -/- | LOW |
HIGH Priority Findings
H1. system_issues.violation_hash — Same root cause as S175
Used by:
- S174-FIX-04:
env-contract-check.sh,logrotate-config-check.sh,rsyslog-health-check.sh - S175 P1:
directus_sync.pyoriginalreportConfigErrorfor crash reporting - Internal Directus dedupe via
dedupe.js
Schema gap:
SELECT indexdef FROM pg_indexes WHERE tablename = 'system_issues';
→ system_issues_pkey (id)
→ idx_system_issues_system_issues_code_unique (code) -- auto-gen by Directus, not logical id
→ idx_issues_open (status WHERE status='open')
No unique on violation_hash. code is Directus auto-gen (ISS-XXXX), not the dedupe key.
Evidence of bug already manifest:
SELECT violation_hash, status, COUNT(*) FROM system_issues
WHERE violation_hash IS NOT NULL
GROUP BY violation_hash, status HAVING COUNT(*) > 1;
violation_hash | status | count
-----------------+----------+------
44815e74c3d0f846 | open | 2 ← active duplicate
9cc0eb3d7958f6cf | open | 2 ← active duplicate
0653e4f1c9c519d6 | resolved | 4
1bbc0770d083c274 | resolved | 4
aa8df24153c26d4b | resolved | 4
... (5 more resolved with 2-3x dup)
The 2 OPEN dup pairs:
id | violation_hash | issue_class | source_system | last_seen_at
-----+------------------+-------------+----------------+---------------------------
3896 | 44815e74c3d0f846 | sync_fault | dieu31-runner | 2026-04-09 10:00:25.802+00
3994 | 44815e74c3d0f846 | sync_fault | dieu31-runner | 2026-03-30 10:00:15.042+00
3884 | 9cc0eb3d7958f6cf | sync_fault | dieu31-runner | 2026-04-09 10:00:17.869+00
3999 | 9cc0eb3d7958f6cf | sync_fault | dieu31-runner | 2026-04-01 04:00:27.476+00
Mechanism (same as S175 root cause):
dedupe.jscallsfindExistingIssue(violationHash)withlimit=1- If 2 simultaneous calls don't find each other → both POST → 2 rows
- Or: previous resolved row counted as "not found" → new row created while old still pending re-open
- No DB-level UNIQUE catches it
Reference: universal_rule_violations already has the right pattern:
CREATE UNIQUE INDEX uq_open_violation_hash
ON public.universal_rule_violations USING btree (violation_hash)
WHERE ((status)::text = 'open'::text)
Proposed vaccine (not applied):
-- Step 1: dedupe existing OPEN duplicates (manual review needed)
-- Step 2:
CREATE UNIQUE INDEX CONCURRENTLY idx_system_issues_open_violation_hash_unique
ON public.system_issues USING btree (violation_hash)
WHERE status = 'open' AND violation_hash IS NOT NULL;
Effort: ~30 min (dedupe 2 OPEN pairs + create index). LOW risk if done in maintenance window.
H2. knowledge_documents.source_id — Already fixed in S175
For completeness: this is the table fixed in S175 P2 with idx_kd_current_source_id_unique. No further action needed.
MEDIUM Priority Findings
M1. tasks.code
- 10 rows, all distinct codes (no current duplicate)
codelooks likeTSK-XXXXDirectus auto-gen pattern- Found in writer search:
items/tasks(agent-data writes via REST) - No unique constraint on
code(only PK onid)
Risk: If 2 simultaneous task creates collide on the auto-gen sequence, duplicate codes possible. Currently no evidence of duplicates.
Vaccine (deferred):
CREATE UNIQUE INDEX idx_tasks_code_unique ON tasks (code) WHERE code IS NOT NULL;
M2. Other low-data tables
categories.slug(1 row),posts.slug(2 rows),help_articles.slug(3 rows),help_collections.slug- Sit at "could become an issue" stage but not currently corrupted
- Defer until data growth or active writer identified
LOW Priority
L1. Tables with data but logical_id well-protected
35 tables have proper UNIQUE constraints on their logical_id columns. No vaccine needed.
L2. Log/audit tables with no logical_id semantics
birth_registry(34084 rows) — trigger-managed, no logical_idregistry_changelog(15317 rows) — audit log, no logical_idmeasurement_log(5050 rows) — log, dedupe by composite (run_id, measurement_id)lifecycle_log(827 rows) — append-only loguniversal_rule_violations(1741 rows) — hasuq_open_violation_hashpartial unique ✓
No vaccine needed.
Matrix Summary
| Priority | Table | Logical col | Issue | Vaccine effort |
|---|---|---|---|---|
| HIGH | system_issues | violation_hash | 2 OPEN duplicates already | ~30 min |
| MED | tasks | code | No constraint, no current dup | ~15 min |
| LOW | categories,posts,forms,help_*,workflow_change_requests,registry_changelog | slug/key/code | Low data, no current dup | defer |
| LOW | 35 other tables | various | UNIQUE in place | none |
Recommendations for Next Session
Priority 1: system_issues.violation_hash vaccine
Why now: Already has 2 active OPEN duplicates from dieu31-runner. The S174-FIX-04 DOTs we shipped this week WILL create more duplicates over time without this constraint. Same root cause as S175.
Steps:
- Manual review of 2 OPEN dup pairs (3896/3994, 3884/3999) — likely both in each pair are stale/equivalent
- Resolve the older row in each pair (set
status='resolved') - Apply partial UNIQUE:
CREATE UNIQUE INDEX ... WHERE status='open' AND violation_hash IS NOT NULL - Update
dedupe.jsand S174 DOT scripts to handle UNIQUE violation gracefully (catch and reuse existing)
Priority 2: tasks.code vaccine
Defensive only. Not blocking, but low effort.
Priority 3: reconcile-knowledge.py cleanup
From C3 closure verify: this script bypasses the atomic writer. Either:
- Refactor to call
_sync_document_atomicdirectly - Or document as deprecated and add a warning at top of file
- Or remove if no longer needed
Unknowns / Need clarification
-
tasks.codewriter source: Found in agent-data grep but unclear if it's actively writing tasks or just reading. Need to check if Directus Flows are also creating tasks (likely yes per memory). -
categories,posts, etc.: Only 1-3 rows each. Unclear if these are seeds or actively used. If not used, can defer indefinitely. -
registry_changelog: 15317 rows butcodemostly NULL — likely an audit log wherecodereferences entity_codes from other tables. Schema audit can't determine semantics without code reading.
Constraints Honored
- READ-ONLY: 0 INSERT/UPDATE/DELETE/DDL executed (except inside ROLLBACK transaction)
- No fix attempted, even on HIGH priority
- Scope limited to
directusDB - All claims backed by SQL output or grep result