S175 Followup — README Duplicate Groups
Status: OPEN (out of scope for S175, deferred to next session)
Tag: s175-followup
Date: 2026-04-09
Discovery
During the S175 P3 backfill of 15 rows with NULL source_id, Claude Code discovered that
3 of the 15 rows are the NULL side of 3 README duplicate pairs. Backfilling
them would have violated the partial UNIQUE index idx_kd_current_source_id_unique
and rolled back the entire backfill transaction.
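The failure mode described above can be reproduced in miniature. The following is a minimal sketch, not the production schema: it uses an in-memory SQLite database as a stand-in for PostgreSQL, the index definition (unique source_id among current rows) is an assumption since the real DDL is not shown here, and the backfill rule `'agentdata:' || file_path` is inferred from the set-side values in the table below.

```python
import sqlite3


def demo_backfill_rollback():
    """Run an all-or-nothing backfill and return (error, null_rows_left)."""
    # isolation_level=None: transactions are controlled by explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        CREATE TABLE knowledge_documents (
            id INTEGER PRIMARY KEY,
            file_path TEXT NOT NULL,
            source_id TEXT,
            is_current_version INTEGER NOT NULL
        );
        -- ASSUMED index definition: source_id unique among current rows only.
        CREATE UNIQUE INDEX idx_kd_current_source_id_unique
            ON knowledge_documents (source_id)
            WHERE is_current_version = 1;
        INSERT INTO knowledge_documents VALUES
            (298, 'knowledge/current-state/README.md', NULL, 1),
            (300, 'knowledge/dev/README.md', NULL, 1),
            (345, 'knowledge/current-state/README.md',
                  'agentdata:knowledge/current-state/README.md', 1);
        """
    )

    error = None
    conn.execute("BEGIN")
    try:
        # id=300 is a clean row; id=298 collides with the set side (id=345).
        for row_id in (300, 298):
            conn.execute(
                "UPDATE knowledge_documents "
                "SET source_id = 'agentdata:' || file_path WHERE id = ?",
                (row_id,),
            )
        conn.execute("COMMIT")
    except sqlite3.IntegrityError as exc:
        # One colliding row undoes the whole batch, including the clean row.
        conn.execute("ROLLBACK")
        error = str(exc)

    null_left = conn.execute(
        "SELECT COUNT(*) FROM knowledge_documents WHERE source_id IS NULL"
    ).fetchone()[0]
    conn.close()
    return error, null_left
```

In this sketch the clean row (id=300) is updated first and still ends up NULL after the rollback, which is why the duplicate-side rows had to be excluded from the backfill batch.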
6 Rows in 3 Duplicate Groups
| file_path | id (NULL) | id (set) | source_id (set side) |
|---|---|---|---|
| knowledge/current-state/README.md | 298 | 345 | agentdata:knowledge/current-state/README.md |
| knowledge/current-tasks/README.md | 299 | 346 | agentdata:knowledge/current-tasks/README.md |
| knowledge/other/README.md | 301 | 347 | agentdata:knowledge/other/README.md |
Note: knowledge/dev/README.md (id=300) was NOT a duplicate; it was
backfilled successfully in S175 P3 Task 2 (VIỆC 2).
Evidence (P0 snapshot)
Snapshot: /root/backup/s175-knowledge_documents-20260409T072635Z.sql
SELECT id, file_path, source_id, is_current_version, version_number
FROM public.knowledge_documents
WHERE id IN (298, 299, 301, 345, 346, 347)
ORDER BY file_path, id;
All 6 rows have is_current_version=true. Both rows in each pair pass the
sidebar query filter (status='published' AND is_current_version=true),
so Nuxt sidebar renders 2 entries per README.
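A grouped query over the same filter is one way to confirm which paths render twice. This is a hedged sketch: the helper name `find_duplicate_groups` is hypothetical, it assumes a DB-API connection to a table with the columns named in this note, and the boolean is written as `1` for SQLite (PostgreSQL would use `true`).

```python
import sqlite3

# Same filter as the sidebar query, grouped by path to expose duplicates.
# MIN/MAX are used instead of an aggregate list so the SQL stays portable.
DUPLICATE_GROUPS_SQL = """
SELECT file_path, MIN(id) AS first_id, MAX(id) AS last_id
FROM knowledge_documents
WHERE status = 'published' AND is_current_version = 1
GROUP BY file_path
HAVING COUNT(*) > 1
ORDER BY file_path
"""


def find_duplicate_groups(conn):
    """Return (file_path, lowest_id, highest_id) for every path with >1 current row."""
    return conn.execute(DUPLICATE_GROUPS_SQL).fetchall()
```

Run against the six rows listed above, this should return one tuple per README path; a path with a single current row, such as knowledge/dev/README.md, does not appear.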
Why Out of Scope for S175
S175 scope: fix the Directus drift root cause (commit bug + backfill of 12 clean rows). Cleaning up the README duplicates requires:
- Read content of BOTH rows in each pair (NULL side may have different content)
- Decide canonical row (newest? content-richer? source-marked?)
- Possibly merge content from one to other
- Archive non-canonical row (set is_current_version=false)
- Verify partial UNIQUE constraint passes after archive
This is data-loss-risky and requires content-level decisions, not a mechanical fix.
Next Session Plan
- SELECT id, content FROM knowledge_documents WHERE id IN (298,299,301,345,346,347);
- diff content of each pair
- Decide canonical per pair
- BEGIN; UPDATE ... SET is_current_version=false WHERE id=<non_canonical>; COMMIT;
- Verify only 3 current rows remain for the 3 README paths
- Consider adding source_id NOT NULL constraint after all NULL rows resolved
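The diff, archive, and verify steps of this plan could be scripted roughly as follows. This is a sketch under assumptions: `diff_pair` and `archive_non_canonical` are hypothetical helper names, the connection is a Python sqlite3 connection standing in for the real database, and the choice of canonical row stays a manual decision passed in by the caller.

```python
import difflib
import sqlite3


def diff_pair(conn, id_a, id_b):
    """Return unified-diff lines between the content of two rows (empty if identical)."""
    rows = dict(
        conn.execute(
            "SELECT id, content FROM knowledge_documents WHERE id IN (?, ?)",
            (id_a, id_b),
        )
    )
    a = (rows[id_a] or "").splitlines()
    b = (rows[id_b] or "").splitlines()
    return list(difflib.unified_diff(a, b, f"id={id_a}", f"id={id_b}", lineterm=""))


def archive_non_canonical(conn, non_canonical_ids, readme_paths):
    """Archive the given rows in one transaction.

    Rolls back unless exactly one current row remains for each path.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        for row_id in non_canonical_ids:
            conn.execute(
                "UPDATE knowledge_documents SET is_current_version = 0 WHERE id = ?",
                (row_id,),
            )
        placeholders = ",".join("?" * len(readme_paths))
        counts = conn.execute(
            "SELECT file_path, COUNT(*) FROM knowledge_documents "
            f"WHERE is_current_version = 1 AND file_path IN ({placeholders}) "
            "GROUP BY file_path",
            list(readme_paths),
        ).fetchall()
        # A path missing from counts has zero current rows left.
        if len(counts) != len(readme_paths) or any(n != 1 for _, n in counts):
            raise RuntimeError(f"unexpected current-row counts: {counts}")
    return len(non_canonical_ids)
```

Putting the verification inside the same transaction as the UPDATE means a wrong choice (for example, archiving both sides of a pair) is undone rather than committed.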
Constraint Consideration
After all NULL source_id rows are resolved, the schema should add:
ALTER TABLE knowledge_documents
ALTER COLUMN source_id SET NOT NULL;
This prevents future writes from creating NULL source_id rows that bypass the partial UNIQUE constraint.
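The bypass works because a UNIQUE index treats NULL values as distinct from one another (the default in PostgreSQL, and also the behavior in SQLite). A small sketch illustrates both halves of the argument; the table name `kd`, the index name, and both function names are hypothetical, and SQLite is used only as a stand-in.

```python
import sqlite3


def null_source_ids_coexist():
    """Count current rows with NULL source_id accepted under a partial UNIQUE index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE kd (
            id INTEGER PRIMARY KEY,
            source_id TEXT,
            is_current_version INTEGER NOT NULL
        );
        CREATE UNIQUE INDEX kd_current_source_id_unique
            ON kd (source_id) WHERE is_current_version = 1;
        -- Two current rows with NULL source_id: the index does not reject them.
        INSERT INTO kd VALUES (1, NULL, 1), (2, NULL, 1);
        """
    )
    n = conn.execute("SELECT COUNT(*) FROM kd WHERE source_id IS NULL").fetchone()[0]
    conn.close()
    return n


def not_null_rejects_null():
    """Return True if a NOT NULL column refuses a NULL source_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kd (id INTEGER PRIMARY KEY, source_id TEXT NOT NULL)")
    try:
        conn.execute("INSERT INTO kd VALUES (1, NULL)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

On PostgreSQL, SET NOT NULL fails while any NULL row still exists, so the constraint can only be added once every remaining NULL source_id row has been backfilled or archived.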
Related Files
- /root/backup/s175-knowledge_documents-20260409T072635Z.sql: P0 snapshot
- /opt/incomex/docker/agent-data-repo/agent_data/directus_sync.py: atomic writer (fixed in S175 P3)
- knowledge/current-state/reports/s175-fix-execution-v3.md: S175 P3 report