dot-iu-cutter v0.5 — Snapshot MARK Byte-safe Fixture Log (Option B: refimpl.r1 from live source, region bytes never through model context; region sha 17660443 EXACT; no hand-edit)
Phase: v0_5_snapshot_MARK_byte_safe_fixture_full_CI · Nature: fixture_provision__byte_safe__no_commit__no_dryrun · Date: 2026-05-18 · doc 1 of 4
fixture_generation_method: Option B: regenerated locally via refimpl.r1 from the LIVE source; region bytes flowed shell→file ONLY, never through model context; NO manual markdown copy; NO Unicode hand-edit.
result: region sha256 == 17660443… EXACT (len 17522, markers 19/1/1/1)
1. Why Option B (A and C unavailable)
Option A (base64 blob): NO blob exists in KB (consolidated-path Step 2 was never
authorized/executed). Reading the KB pinned artifact via MCP is the corrupting
channel (prior region 86d6aea7 ≠ 17660443). Option A is not possible.
Option C (byte-exact local copy): filesystem search under /Users/nmhuyen found
ONLY the prior NON-IDENTITY fixture (region 86d6aea7); no byte-exact copy
anywhere. Option C is not possible.
Option B (regenerate via refimpl.r1): CHOSEN. refimpl.r1 is the KB-ratified
reference parser proven to reproduce the canonical 17660443…/17522/19·1·1·1
deterministically 3/3 from the live source (KB nuxt-parser test-result doc).
2. Byte-safety design (the key control)
The corruption mechanism is same-width Unicode substitution when dense content passes through the model context as text. Defeated by ensuring the identity region bytes never enter the model context:
1 live page fetched by curl → /tmp scratch raw.html (bytes, not shown)
2 refimpl.r1 written to scratch (algorithm + normative step ORDER verbatim;
  the only non-ASCII codepoints in the script are runtime-identical ones:
  MARKERS ✅/⛔ + the "HIẾN PHÁP" H1-anchor regex; Python-equivalent,
  output sha-gated)
3 python3 refimpl.py raw.html region.txt → region.txt = normalized region
  (b_text, no trailing \n per D-TRAILNL); written by the script, shell→file
4 fixture assembled file→file: frontmatter+BEGIN > fixture ; cat region.txt
  >> fixture ; printf END >> fixture (region bytes never tokenized)
5 only sha256 / length / integer marker counts were ever surfaced to context
This is not "hand-editing Unicode to force a hash": the region is the deterministic output of the ratified parser over the live source, gated by an exact sha256. Worst case of any script transcription error = fail-closed (anchor miss / wrong markers / wrong sha) → STOP, never a wrong fixture.
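The file→file assembly in step 4 can be sketched in Python as below. This is a minimal illustration, not the commands actually run in this phase: the function name `assemble_fixture` and the header/footer byte strings passed to it are hypothetical, since the log records the frontmatter, BEGIN and END only by name. The point it shows is that the region is streamed as raw bytes and only a digest and a size are returned.

```python
import hashlib
import shutil
from pathlib import Path


def assemble_fixture(header: bytes, region_path: Path, footer: bytes, out_path: Path) -> dict:
    """Concatenate header + region + footer at the byte level, file to file.

    Hypothetical helper: the header and footer contents are supplied by the
    caller, because the log does not record their exact text.
    """
    with out_path.open("wb") as out:
        out.write(header)
        with region_path.open("rb") as region:
            # The region is copied as raw bytes; it is never decoded or printed.
            shutil.copyfileobj(region, out)
        out.write(footer)
    # Only a digest and a byte count leave this function.
    return {
        "fixture_sha256": hashlib.sha256(out_path.read_bytes()).hexdigest(),
        "fixture_bytes": out_path.stat().st_size,
    }
```

Because nothing but the digest and the size is returned, a caller that logs the result cannot accidentally re-encode the region text.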
3. Execution evidence
live_fetch: HTTP 200, redirects 0, raw_bytes 1,251,374,
raw_sha256 b7d04a43ec674b7a533d0d2aa982ed18b8408fd949f86f05c8882984b3f2aace
(raw is Nuxt-render-volatile / forensic-only by design)
parser_status: OK
parser_reported: checksum 17660443…cae80c · length 17522 ·
markers {enacted:19,controlled_draft:1,draft:1,obsolete:1} ·
A_minus_B 329 (span geometry consistent with KB 329 band)
independent_recompute (region.txt):
region_sha256 = 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c ✓ EXACT
region_codepoint_len = 17522 ✓
region_trailing_newline= False (D-TRAILNL honored)
region_is_NFC = True
region_markers = {enacted:19, controlled_draft:1, draft:1, obsolete:1} ✓
content_drift_since_pin: NONE — normalized region byte-identical to the pinned
canonical despite raw render noise (living-doc stable).
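An independent recompute of the kind listed above could look roughly like the following sketch. It assumes the region is UTF-8 encoded and that the caller supplies the marker strings to count; the function name `summarize_region` and the `markers` mapping are hypothetical, and the log does not state which marker glyph corresponds to which status.

```python
import hashlib
import unicodedata


def summarize_region(data: bytes, markers: dict[str, str]) -> dict:
    """Return only digest, length, flags and integer counts for a region.

    `markers` maps a status name to the literal marker string to count.
    The mapping is an assumption supplied by the caller, not taken from the log.
    """
    text = data.decode("utf-8")  # strict decode: invalid bytes raise instead of being replaced
    return {
        "region_sha256": hashlib.sha256(data).hexdigest(),
        "region_codepoint_len": len(text),  # Python str length counts code points
        "region_trailing_newline": text.endswith("\n"),
        "region_is_NFC": unicodedata.is_normalized("NFC", text),
        "region_markers": {name: text.count(token) for name, token in markers.items()},
    }
```

Hashing the raw bytes and measuring the decoded text separately keeps the digest independent of any decoding step.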
4. Fixture placement (+ flagged path-coupling discrepancy)
mandated_path (ruling): tests/fixtures/constitution-normalized-17660443e0f23e99.md → written
test_resolution_path: the RATIFIED, UNMODIFIED test file (sha 454d9fc8) resolves
ART = Path(__file__).resolve().parents[1] / "constitution-normalized-17660443e0f23e99.md"
= REPO ROOT, NOT tests/fixtures/. (verified on disk this phase)
conflict: ruling mandates tests/fixtures/ AND keep test unchanged AND full 21/21
— mutually inconsistent for the ratified test as written (it reads repo-root).
resolution_taken: provisioned the byte-exact fixture at BOTH paths, byte-identical
(file-level sha 5c76eedd… on both; copied file→file from the same region):
- tests/fixtures/constitution-normalized-17660443e0f23e99.md (canonical, per ruling)
- constitution-normalized-17660443e0f23e99.md (repo-root; the path the
ratified test actually consumes — required to satisfy "full 21/21")
test file NOT modified (454d9fc8 preserved); dryrun.py NOT modified.
Both untracked, no commit, fully reversible.
ROUTED TO GPT (pre-flagged in readiness-audit doc 3 §2.3): rule the canonical
coupling — (a) amend the test ART → tests/fixtures/ in a later gated phase
(changes test hash → re-ratify), or (b) declare repo-root canonical and keep
tests/fixtures/ as the stored copy. NOT self-decided here.
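The path coupling can be illustrated with a small sketch. It assumes the ratified test file sits directly under tests/ (its actual filename is not recorded here, so `test_snapshot.py` is hypothetical) and uses `PurePosixPath` instead of `Path(__file__).resolve()` so that it runs without touching the filesystem.

```python
from pathlib import PurePosixPath

FIXTURE_NAME = "constitution-normalized-17660443e0f23e99.md"


def resolve_art(test_file: PurePosixPath) -> PurePosixPath:
    """Mirror the shape of the test's ART expression.

    parents[0] is the directory containing the test file;
    parents[1] is that directory's parent.
    """
    return test_file.parents[1] / FIXTURE_NAME


# Hypothetical layout: /repo/tests/test_snapshot.py
art = resolve_art(PurePosixPath("/repo/tests/test_snapshot.py"))
mandated = PurePosixPath("/repo/tests/fixtures") / FIXTURE_NAME
print(art)       # /repo/constitution-normalized-17660443e0f23e99.md
print(mandated)  # /repo/tests/fixtures/constitution-normalized-17660443e0f23e99.md
```

Under that layout the two paths differ, which is why the unmodified test cannot read from tests/fixtures/.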
5. Gate verification (the REAL gate — dryrun.py)
D.extract_region + D.snapshot_gate(region, 17660443…, 17522,
{enacted:19,controlled_draft:1,draft:1,obsolete:1}) — raises FailClosed on
ANY mismatch — returned GATE_PASS for BOTH files:
tests/fixtures/… => 17660443e0f23e99 17522 {19,1,1,1}
repo-root … => 17660443e0f23e99 17522 {19,1,1,1}
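The fail-closed behaviour described above can be sketched as follows. This is not the dryrun.py implementation: `GateMismatch`, `check_snapshot` and the `count_markers` callable are hypothetical stand-ins, and hashing the UTF-8 encoding of the region is an assumption.

```python
import hashlib
from typing import Callable, Dict


class GateMismatch(Exception):
    """Hypothetical stand-in for the FailClosed error named in this log."""


def check_snapshot(
    region: str,
    want_sha256: str,
    want_len: int,
    want_markers: Dict[str, int],
    count_markers: Callable[[str], Dict[str, int]],
) -> str:
    """Raise on the first mismatch; return "GATE_PASS" only if every check holds."""
    got_sha = hashlib.sha256(region.encode("utf-8")).hexdigest()  # assumes UTF-8
    if got_sha != want_sha256:
        raise GateMismatch(f"sha256 mismatch: got {got_sha[:8]}")
    if len(region) != want_len:
        raise GateMismatch(f"length mismatch: got {len(region)}, want {want_len}")
    got_markers = count_markers(region)
    if got_markers != want_markers:
        raise GateMismatch(f"marker mismatch: got {got_markers}")
    return "GATE_PASS"
```

There is no code path that returns a value other than "GATE_PASS", so a caller cannot mistake a partial match for success.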
Scratch (/tmp/iucut-bytesafe.*: raw.html, refimpl.py, region.txt, fixture.md)
shredded after assembly+verification. No secrets recorded.
doc 1 of 4. Self-advance PROHIBITED.