02-registrar-path-normalization-defect-and-mitigation-proof-2026-06-22.md
02 — G1: Registrar duplicate defect — root cause + mitigation, PROVEN by simulation over real data
SSOT note (see 00a): the registrar mechanism is grounded in the VPS SSOT /opt/incomex/dot/bin/dot-dot-register (sha 31d5cf15…), not local. The VPS↔local diff is only 3 PG-env-default lines; the matcher/write-site lines (121/128/135/157/184) are byte-identical, so the line citations below are authoritative for the VPS runtime. The patched artifact is LOCAL_STAGING_NOT_SSOT (operator-applied to the VPS via patch_ops_code).
Claim hardened: the prior "287 duplicate" finding is real but under-diagnosed. Fresh probe shows it is a multi-format path-join failure, and — critically — even a correct matcher is still unsafe for surgical C1 registration without an include filter. Both are proven below with the actual patched-registrar logic run over real live inputs.
1. Exact stored vs scanned values (live, read-only)
- Scanned (disk): ls -1 /opt/incomex/dot/bin/dot-* → 287 absolute paths, e.g. /opt/incomex/dot/bin/dot-accuracy-verify. 76 are .bak* / .stage0-frozen* backups caught by the dot-* glob.
- Stored (dot_tools.file_path): 228 non-null, in three prefixes: bin/dot/dot-accuracy-verify (163), opt/incomex/dot/bin/… (63), dot/… (2). None has a leading slash; one whole convention (bin/dot/…) is a different path entirely.
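The three-prefix split above can be tallied with a small classifier. This is a minimal sketch, not part of the registrar: `classify_prefix` is a hypothetical helper, and the sample paths other than `bin/dot/dot-accuracy-verify` are illustrative placeholders rather than real stored rows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bucket a stored file_path by the convention it follows.
classify_prefix() {
  case "$1" in
    /*)                    echo "absolute" ;;            # leading slash (none observed in the stored data)
    bin/dot/*)             echo "bin/dot" ;;
    opt/incomex/dot/bin/*) echo "opt/incomex/dot/bin" ;;
    dot/*)                 echo "dot" ;;
    *)                     echo "other" ;;
  esac
}

# Illustrative input; dot-X is a placeholder tool name.
printf '%s\n' \
  bin/dot/dot-accuracy-verify \
  opt/incomex/dot/bin/dot-X \
  dot/dot-X |
  while IFS= read -r p; do classify_prefix "$p"; done | sort | uniq -c
```

Assuming the snapshot holds one path per line, the same loop could read from evidence/registered.txt instead of the sample input.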
2. The exact matching bug (source-read, dot-dot-register v1.0.0 line 135)
if echo "$REGISTERED" | grep -qF "$filepath"; then SKIPPED ; fi # $filepath = /opt/incomex/dot/bin/dot-X
grep -qF "/opt/incomex/dot/bin/dot-X" does a fixed-string substring search of the stored list for that absolute path. No stored line contains it: bin/dot/dot-X is a different path altogether, and opt/incomex/… lacks the leading slash, so the absolute form is never a substring of a stored value. ⇒ every one of the 287 disk files is classed "NEW." On a real run (--dry-run removed), line 156 would POST /items/dot_tools 287 times, re-inserting the whole bin, including the 76 backups.
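The false-new classification can be reproduced in isolation. The sketch below is not the registrar: `old_match` is a hypothetical wrapper around the same `grep -qF` test, and the stored values are illustrative samples in the three conventions from section 1, with `dot-X` as a placeholder name.

```shell
#!/usr/bin/env bash
# Illustrative stored values (no leading slash), one per line.
REGISTERED='bin/dot/dot-X
opt/incomex/dot/bin/dot-X
dot/dot-X'

# Hypothetical wrapper with the same shape as the line-135 test:
# succeed only if the scanned path occurs as a fixed substring of the stored list.
old_match() {
  echo "$REGISTERED" | grep -qF "$1"
}

filepath=/opt/incomex/dot/bin/dot-X   # scanned form: absolute path
if old_match "$filepath"; then
  echo "SKIPPED $filepath"
else
  echo "NEW $filepath"                # this branch runs: the absolute form never matches
fi
```

The same tool is present three times in the sample list, yet the scanned path is still reported as new, which is the behaviour observed across all 287 disk files.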
3. Reproduction — OLD matcher over real live inputs (exit-checked)
staged-artifacts/evidence/runA-old-matcher-repro.txt:
OLD-matcher GENUINE NEW = 287 (== live --dry-run "287 new" from report 02 of prior pkg)
Inputs are the real live snapshots evidence/disk.txt (287), evidence/registered.txt (228), evidence/codes.txt (309).
4. Mitigation (DOT-100%, reuse-first) — patch the governed registrar
Staged artifact: staged-artifacts/scripts/dot-dot-register-c1-hardened (v1.1.0-c1-hardened). It is a patch to the existing governed DOT-REGISTER, not a new system. Route for the patch = patch_ops_code (apr_action_types handler_ref=dot-apr-execute:patch_ops). Three fixes:
- D1 fix: join by BASENAME (dot-X), the only stable key across /opt/…, bin/dot/…, opt/…, dot/….
- D2 fix: second idempotency guard by derived CODE (basename|upper|tr - _) vs existing dot_tools.code.
- D3 fix: blast-radius control: mandatory --only-prefix dot-c1- include filter for surgical runs; backup denylist (.bak*, .stage0-frozen*, .orig, ~, .swp, .tmp); and a --max-new (default 10) abort guard that refuses a bulk real-run (exit 3) unless --allow-bulk is given.
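The three fixes can be sketched as small shell predicates. This is a minimal illustration under stated assumptions, not the staged script: all function names are hypothetical, and the exact comparison used by the `--max-new` guard (strictly greater than the limit) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating D1-D3; names are not taken from the staged script.

# D1: reduce any path convention to its basename, the join key.
join_key() { basename "$1"; }

# D2: derive the code from the basename: upper-case, then '-' to '_'.
derive_code() { basename "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_'; }

# D3a: backup denylist (.bak*, .stage0-frozen*, .orig, ~, .swp, .tmp).
is_backup() {
  case "$(basename "$1")" in
    *.bak*|*.stage0-frozen*|*.orig|*~|*.swp|*.tmp) return 0 ;;
    *) return 1 ;;
  esac
}

# D3b: include filter; $2 is the required basename prefix, e.g. dot-c1-.
has_prefix() {
  case "$(basename "$1")" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# D3c: abort guard. Args: new_count max_new allow_bulk(yes|no).
# Assumption: the guard trips when new_count is strictly greater than max_new.
guard_max_new() {
  if [ "$1" -gt "$2" ] && [ "$3" != "yes" ]; then
    return 3
  fi
}

derive_code /opt/incomex/dot/bin/dot-c1-preflight   # prints DOT_C1_PREFLIGHT
```

Because both the scanned and the stored path are reduced with `join_key` before comparison, the leading slash and the directory convention no longer influence the match.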
5. Mitigation PROVEN — patched logic run over real inputs (real exit codes)
| Run | Command (offline, no prod) | GENUINE NEW | exit | Meaning |
|---|---|---|---|---|
| A (old) | original grep logic | 287 | — | the bug, reproduced |
| B (new, no filter) | --simulate-from … --dry-run | 15 | 0 | normalization+idempotency alone still inserts 15 unrelated backlog scripts ⇒ not surgical |
| C1 (new, C1 filter, today) | --only-prefix dot-c1- --dry-run | 0 | 0 | no C1 on disk yet ⇒ nothing registered |
| C2 (new, C1 filter, post-W1) | C1 scripts staged on bin | 7 | 0 | diff = exactly the 7 named DOT_C1_* rows, nothing else |
Run B breakdown (real stdout, evidence/runB.txt): disk 287 → skipped backup 78, skipped-by-basename 180, skipped-by-code 14, GENUINE NEW 15. Run C2 stdout lists exactly:
DOT_C1_VOCAB_BUILD, DOT_C1_VOCAB_VERIFY, DOT_C1_PREFLIGHT, DOT_C1_BAD_INPUT_HARNESS,
DOT_C1_EVIDENCE_READBACK, DOT_C1_ROLLBACK_CHECK, DOT_C1_CONTRACT_REGISTER
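The four Run B buckets should partition the scanned set. A quick arithmetic check, using only the counts quoted in the Run B breakdown above (the variable names are illustrative):

```shell
#!/usr/bin/env bash
# Counts copied from the Run B breakdown (evidence/runB.txt).
backup=78
by_basename=180
by_code=14
genuine_new=15

total=$((backup + by_basename + by_code + genuine_new))
echo "$total"   # 287, equal to the number of scanned disk paths
```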
6. Hardening conclusion (the key new finding)
The prior plan assumed a normalization fix would make a bare registrar safe and otherwise used a manual POST. Run B disproves that: a corrected matcher still inserts 15 unrelated rows. Therefore:
- Bare/normalized dot-dot-register is FORBIDDEN for C1 (blast radius 287 old / 15 even when fixed).
- The only safe registration is the patched registrar with --only-prefix dot-c1- (+ the --max-new abort guard as a second net). Expected diff is then provably limited to the named DOT_C1_* rows (Run C2).
7. Acceptance vs macro §3.1
- actual stored file_paths ✔ (§1) · actual scanned ✔ (§1) · exact matching logic causing false-new ✔ (§2) · why bare real-run forbidden ✔ (§6) · patch/mitigation design ✔ (§4) · dry-run after mitigation ✔ (§5, exit 0) · expected diff only DOT_C1 named rows ✔ (Run C2) · exit code ✔ · stdout summary ✔ (evidence/).
- Not rejected: real-run would NOT insert 287 (filter), matcher no longer matches only raw file_path (basename+code), idempotency guard present (×2), mitigation does NOT depend on manual Directus edits (it IS the governed registrar). G1 = RESOLVED.
Caveat (honest): the mitigation dry-run is an offline run of the patched script's real matching logic over real live snapshots, not a live patched-registrar run — because running it live would require staging the patch + C1 scripts onto prod (a write, forbidden this turn). The logic, inputs, and exit codes are real; the final live confirmation is W1's dry-run at execution time.