KB-603B

02-registrar-path-normalization-defect-and-mitigation-proof-2026-06-22.md

Revision 1
c1-legoprewrite-gate

02 — G1: Registrar duplicate defect — root cause + mitigation, PROVEN by simulation over real data

SSOT note (see 00a): the registrar mechanism is grounded in the VPS SSOT /opt/incomex/dot/bin/dot-dot-register (sha 31d5cf15…), not the local copy. The VPS↔local diff is only 3 PG-env-default lines; the matcher/write-site lines (121/128/135/157/184) are byte-identical, so the line citations below are authoritative for the VPS runtime. The patched artifact is LOCAL_STAGING_NOT_SSOT (operator-applied to the VPS via patch_ops_code).

Claim hardened: the prior "287 duplicate" finding is real but under-diagnosed. A fresh probe shows it is a multi-format path-join failure and, critically, that even a correct matcher is still unsafe for surgical C1 registration without an include filter. Both points are proven below by running the actual patched-registrar logic over real live inputs.

1. Exact stored vs scanned values (live, read-only)

  • Scanned (disk): ls -1 /opt/incomex/dot/bin/dot-* → 287 absolute paths, e.g. /opt/incomex/dot/bin/dot-accuracy-verify. 76 are .bak*/.stage0-frozen* backups caught by the dot-* glob.
  • Stored (dot_tools.file_path): 228 non-null, in three prefixes — bin/dot/dot-accuracy-verify (163), opt/incomex/dot/bin/… (63), dot/… (2). None has a leading slash; one whole convention (bin/dot/…) is a different path entirely.
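The three stored conventions can be counted mechanically. The following is a minimal sketch, assuming one stored file_path per line on stdin; the function names and the two sample rows named dot-example-a and dot-example-b are hypothetical stand-ins, not values from the live snapshot.

```shell
#!/usr/bin/env bash
# Sketch: bucket stored file_path values by path convention.
# classify_prefix and prefix_census are illustrative names, not registrar code.

classify_prefix() {
  case "$1" in
    /*)                    echo "absolute" ;;              # what the disk scan produces
    bin/dot/*)             echo "bin/dot/" ;;
    opt/incomex/dot/bin/*) echo "opt/incomex/dot/bin/" ;;
    dot/*)                 echo "dot/" ;;
    *)                     echo "other" ;;
  esac
}

prefix_census() {
  # Reads one path per line on stdin; prints "<count> <convention>" per bucket.
  while IFS= read -r path; do
    [ -n "$path" ] && classify_prefix "$path"
  done | sort | uniq -c | awk '{print $1, $2}'
}

# Hypothetical sample rows, one per convention.
printf '%s\n' \
  'bin/dot/dot-accuracy-verify' \
  'opt/incomex/dot/bin/dot-example-a' \
  'dot/dot-example-b' | prefix_census
```

Run against the real evidence/registered.txt, a census of this shape should show no "absolute" bucket at all, which is the precondition for the matching bug in the next section.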

2. The exact matching bug (source-read, dot-dot-register v1.0.0 line 135)

if echo "$REGISTERED" | grep -qF "$filepath"; then SKIPPED ; fi   # $filepath = /opt/incomex/dot/bin/dot-X

grep -qF "/opt/incomex/dot/bin/dot-X" does a fixed-string substring search of the stored list for that absolute path. No stored line contains it: bin/dot/dot-X is a different path, and opt/incomex/… lacks the leading slash. ⇒ every one of the 287 disk files is classed "NEW." On a real run (--dry-run removed), line 156 would POST /items/dot_tools 287 times, duplicating the whole bin, 76 backups included.
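The miss is easy to reproduce in isolation. This is a minimal sketch of the same fixed-string check over two sample stored values; dot-X is the placeholder from the text, and dot-Y and the old_match wrapper are hypothetical additions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the false "NEW" classification with sample data.
# REGISTERED imitates the stored list; the real list has 228 rows.

REGISTERED='bin/dot/dot-X
opt/incomex/dot/bin/dot-Y'

old_match() {
  # Same shape as the cited check: substring search for the absolute disk path.
  if echo "$REGISTERED" | grep -qF "$1"; then echo SKIPPED; else echo NEW; fi
}

old_match /opt/incomex/dot/bin/dot-X   # NEW: stored under bin/dot/, a different path
old_match /opt/incomex/dot/bin/dot-Y   # NEW: stored form has no leading slash
```

Both calls print NEW even though both tools are already registered. The needle is longer than any stored value, so a substring search can never succeed in this direction.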

3. Reproduction — OLD matcher over real live inputs (exit-checked)

staged-artifacts/evidence/runA-old-matcher-repro.txt:

OLD-matcher GENUINE NEW = 287   (== live --dry-run "287 new" from report 02 of prior pkg)

Inputs are the real live snapshots evidence/disk.txt (287), evidence/registered.txt (228), evidence/codes.txt (309).

4. Mitigation (DOT-100%, reuse-first) — patch the governed registrar

Staged artifact: staged-artifacts/scripts/dot-dot-register-c1-hardened (v1.1.0-c1-hardened). It is a patch to the existing governed DOT-REGISTER, not a new system. Route for the patch = patch_ops_code (apr_action_types handler_ref=dot-apr-execute:patch_ops). Three fixes:

  • D1 fix — join by BASENAME (dot-X), the only stable key across /opt/…, bin/dot/…, opt/…, dot/….
  • D2 fix — second idempotency guard by derived CODE (basename|upper|tr - _) vs existing dot_tools.code.
  • D3 fix — blast-radius control: mandatory --only-prefix dot-c1- include filter for surgical runs; backup denylist (.bak*,.stage0-frozen*,.orig,~,.swp,.tmp); and a --max-new (default 10) abort guard that refuses a bulk real-run (exit 3) unless --allow-bulk is given.
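The three fixes can be sketched as a single classification function plus a count guard. This is not the staged artifact's code: the function names, the status strings, and the order in which the include filter is applied relative to the other guards are assumptions, and dot-c1-preflight is a hypothetical file name inferred from the DOT_C1_PREFLIGHT code.

```shell
#!/usr/bin/env bash
# Sketch of D1 (basename join), D2 (derived-code guard) and D3 (blast-radius control).

derive_code() {
  # basename -> upper case -> '-' to '_'   (dot-c1-preflight -> DOT_C1_PREFLIGHT)
  basename "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_'
}

is_backup() {
  # Backup denylist from D3.
  case "$(basename "$1")" in
    *.bak*|*.stage0-frozen*|*.orig|*~|*.swp|*.tmp) return 0 ;;
    *) return 1 ;;
  esac
}

# classify DISK_PATH STORED_PATHS CODES ONLY_PREFIX
# STORED_PATHS and CODES are newline-separated; ONLY_PREFIX may be empty.
classify() {
  local path=$1 stored=$2 codes=$3 only_prefix=$4 name
  name=$(basename "$path")
  if is_backup "$path"; then echo SKIP_BACKUP; return; fi
  # D3: include filter; a name that does not start with the prefix is out of scope.
  if [ -n "$only_prefix" ] && [ "${name#"$only_prefix"}" = "$name" ]; then
    echo SKIP_FILTER; return
  fi
  # D1: compare basenames, exact whole-line match, regardless of stored directory.
  if printf '%s\n' "$stored" | sed 's|.*/||' | grep -qxF -- "$name"; then
    echo SKIP_BASENAME; return
  fi
  # D2: second guard on the derived code.
  if printf '%s\n' "$codes" | grep -qxF -- "$(derive_code "$path")"; then
    echo SKIP_CODE; return
  fi
  echo NEW
}

# check_max_new NEW_COUNT MAX_NEW ALLOW_BULK: returns 3 when the bulk guard trips.
check_max_new() {
  if [ "$1" -gt "$2" ] && [ "$3" != "yes" ]; then return 3; fi
  return 0
}

STORED='bin/dot/dot-accuracy-verify'
CODES='DOT_ACCURACY_VERIFY'
classify /opt/incomex/dot/bin/dot-accuracy-verify      "$STORED" "$CODES" ""         # SKIP_BASENAME
classify /opt/incomex/dot/bin/dot-accuracy-verify.bak1 "$STORED" "$CODES" ""         # SKIP_BACKUP
classify /opt/incomex/dot/bin/dot-c1-preflight         "$STORED" "$CODES" "dot-c1-"  # NEW
```

Exact whole-line matching (grep -x) is used here so that one basename cannot match as a substring of a longer one. Whether the real patch does the same is not stated in this report.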

5. Mitigation PROVEN — patched logic run over real inputs (real exit codes)

| Run | Command (offline, no prod) | GENUINE NEW | exit | Meaning |
|---|---|---|---|---|
| A (old) | original grep logic | 287 | | the bug, reproduced |
| B (new, no filter) | --simulate-from … --dry-run | 15 | 0 | normalization+idempotency alone still inserts 15 unrelated backlog scripts ⇒ not surgical |
| C1 (new, C1 filter, today) | --only-prefix dot-c1- --dry-run | 0 | 0 | no C1 on disk yet ⇒ nothing registered |
| C2 (new, C1 filter, post-W1) | C1 scripts staged on bin | 7 | 0 | diff = exactly the 7 named DOT_C1_* rows, nothing else |

Run B breakdown (real stdout, evidence/runB.txt): disk 287 → skipped backup 78, skipped-by-basename 180, skipped-by-code 14, GENUINE NEW 15. Run C2 stdout lists exactly:

DOT_C1_VOCAB_BUILD, DOT_C1_VOCAB_VERIFY, DOT_C1_PREFLIGHT, DOT_C1_BAD_INPUT_HARNESS,
DOT_C1_EVIDENCE_READBACK, DOT_C1_ROLLBACK_CHECK, DOT_C1_CONTRACT_REGISTER
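The figures above can be cross-checked with a few lines of shell. This is a minimal sketch using only the numbers and codes quoted in this section; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: consistency checks on the Run B accounting and the Run C2 diff.

run_b_total() {
  # skipped backup + skipped-by-basename + skipped-by-code + GENUINE NEW
  echo $((78 + 180 + 14 + 15))
}

count_codes() {
  # Counts comma- or whitespace-separated codes.
  printf '%s' "$1" | tr ',' ' ' | wc -w | tr -d ' '
}

only_c1_rows() {
  # Succeeds only when every code starts with DOT_C1_.
  local code
  for code in $(printf '%s' "$1" | tr ',' ' '); do
    case "$code" in DOT_C1_*) ;; *) return 1 ;; esac
  done
}

C2_CODES='DOT_C1_VOCAB_BUILD, DOT_C1_VOCAB_VERIFY, DOT_C1_PREFLIGHT, DOT_C1_BAD_INPUT_HARNESS,
DOT_C1_EVIDENCE_READBACK, DOT_C1_ROLLBACK_CHECK, DOT_C1_CONTRACT_REGISTER'

echo "Run B buckets sum to $(run_b_total) (disk scan: 287)"
echo "Run C2 lists $(count_codes "$C2_CODES") codes"
only_c1_rows "$C2_CODES" && echo "all Run C2 codes are DOT_C1_*"
```

The four Run B buckets sum to the 287 scanned files, and the Run C2 list holds 7 codes, all with the DOT_C1_ prefix.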

6. Hardening conclusion (the key new finding)

The prior plan assumed that a normalization fix alone would make a bare registrar run safe, and otherwise fell back to a manual POST. Run B disproves that assumption: a corrected matcher still inserts 15 unrelated rows. Therefore:

  • Bare/normalized dot-dot-register is FORBIDDEN for C1 (blast radius 287 old / 15 even when fixed).
  • The only safe registration is the patched registrar with --only-prefix dot-c1- (+ the --max-new abort guard as a second net). Expected diff is then provably limited to the named DOT_C1_* rows (Run C2).

7. Acceptance vs macro §3.1

  • actual stored file_paths ✔ (§1) · actual scanned ✔ (§1) · exact matching logic causing false-new ✔ (§2) · why bare real-run forbidden ✔ (§6) · patch/mitigation design ✔ (§4) · dry-run after mitigation ✔ (§5, exit 0) · expected diff only DOT_C1 named rows ✔ (Run C2) · exit code ✔ · stdout summary ✔ (evidence/).
  • Not rejected: a real run would NOT insert 287 rows (include filter); the matcher no longer relies on the raw file_path alone (basename + code); two idempotency guards are present; the mitigation does NOT depend on manual Directus edits (it IS the governed registrar). G1 = RESOLVED.

Caveat (honest): the mitigation dry-run is an offline run of the patched script's real matching logic over real live snapshots, not a live patched-registrar run, because running it live would require staging the patch and the C1 scripts onto prod (a write, forbidden this turn). The logic, inputs, and exit codes are real; the final live confirmation is W1's dry-run at execution time.
