dot-iu-cutter v0.5 — Test+Fixture Mismatch Analysis (test file ≡ KB doc-2 verbatim ⇒ ratify 454d9fc8; 31143968 not recoverable; fixture corrupted by transport ⇒ strategy A base64 blob)
dot-iu-cutter v0.5 — Test File & Fixture Mismatch Analysis
Phase: …_implementation_readiness_audit · Nature: analysis_only · Date: 2026-05-18 · doc 3 of 5 · satisfies QG3 (test-hash decision) + QG4 (fixture strategy)
1. Test file mismatch (tests/test_dryrun_snapshot_mark.py)
current_sha256: 454d9fc84e940fdcf9da10bf29d12c5c420e21b1147ccc8da6a29a81f2843a4a
KB_declared: 31143968f322433cc5da62fa3ccf2a1fbe1905f461940c789a57cb0a116dc1b4
1.1 Is current 454d9fc8 semantically equivalent to KB doc-2?
YES. Prior phase performed a Read-level side-by-side of the repo file (219
lines) vs the KB doc-2 verbatim python block: line-for-line identical —
same docstring, imports, ART/SHA/EM constants, _Args, TestGate(5),
TestManifest(9), TestFailClosedSynthetic(4), TestNoDbImportIsolation(3);
identical assertions, synthetic fixtures, Vietnamese/em-dash/emoji literals. No
logic edit, no added/removed test. File hygiene: pure LF, no trailing ws, no
tabs, single final newline, UTF-8, is_NFC. Corroborated in this audit: the
same KB doc-2 also yields cutter_agent/dryrun.py byte-exact (f1f42e83…),
and 7/7 fixture-independent tests PASS against that byte-exact module, so the
test suite is the authored suite, exercising correct semantics.
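The file-hygiene properties listed above can be checked mechanically. The following is a minimal sketch, not the audit's actual tooling; the function name `hygiene_report` and its result keys are hypothetical, chosen only to mirror the wording of §1.1.

```python
import unicodedata


def hygiene_report(data: bytes) -> dict:
    """Check raw file bytes against the hygiene properties named in 1.1.

    Hypothetical helper: the name and the result keys are illustrative.
    """
    # Decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    lines = text.split("\n")
    return {
        "pure_lf": "\r" not in text,
        "no_trailing_ws": all(line == line.rstrip() for line in lines),
        "no_tabs": "\t" not in text,
        "single_final_newline": text.endswith("\n") and not text.endswith("\n\n"),
        "is_nfc": unicodedata.is_normalized("NFC", text),
    }
```

Usage: pass `Path("tests/test_dryrun_snapshot_mark.py").read_bytes()` and require every value to be True. Reading bytes rather than text avoids any newline translation by the I/O layer.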
1.2 Is 31143968 recoverable?
NO. Bounded deterministic experiment (no character guessing) over the
faithfully transcribed KB block — as-is, ±final newline, per-line rstrip,
CRLF, strip-all-trailing-newlines — produced 454d9fc8, 3c268b36,
0c237699, 6da558f7; none == 31143968. 31143968 was computed on a
pre-embed scratch copy whose exact blank-line layout was normalized when the
code-authoring agent embedded the block into KB markdown (same transport class
that left dryrun.py — ASCII/LF-dominant — unaffected but cost the test file its
declared hash). 31143968 is not byte-recoverable from the KB.
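The bounded experiment can be sketched as a fixed table of re-serializations, each hashed once. This is an illustration under assumptions: the variant names and their exact definitions are one interpretation of the list in §1.2, and `candidate_hashes` / `recovers` are hypothetical helpers, not the code that was run.

```python
import hashlib


def candidate_hashes(text: str) -> dict:
    """sha256 of each deterministic re-serialization of `text` (no guessing)."""
    variants = {
        "as_is": text,
        "plus_final_newline": text + "\n",
        "minus_final_newline": text[:-1] if text.endswith("\n") else text,
        "per_line_rstrip": "\n".join(line.rstrip() for line in text.split("\n")),
        # Normalize to LF first so existing CRLF pairs are not doubled.
        "crlf": text.replace("\r\n", "\n").replace("\n", "\r\n"),
        "strip_all_trailing_newlines": text.rstrip("\n"),
    }
    return {
        name: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for name, body in variants.items()
    }


def recovers(text: str, declared: str) -> list:
    """Names of the variants whose hash starts with the declared hex digest."""
    return [n for n, h in candidate_hashes(text).items() if h.startswith(declared)]
```

An empty list from `recovers(kb_block, "31143968f322…")` corresponds to the NO above. Variants that leave an already-clean file unchanged (for example per-line rstrip on a file with no trailing whitespace) hash identically to the as-is form, which is why the number of distinct hashes can be smaller than the number of variants.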
1.3 Recommendation — TEST HASH RULING
recommend: GPT/User RATIFY 454d9fc8…f2843a4a as the new canonical
hash-of-record for tests/test_dryrun_snapshot_mark.py.
basis: GPT's own "OPT_2 conditionally_allowed_after_side_by_side_review";
side-by-side DONE (≡ verbatim); dryrun.py byte-exact; 7/7 pass;
zero semantic divergence; 31143968 unrecoverable.
action_on_ratify: update KB hash-of-record 31143968 → 454d9fc8; thereafter
ALSO publish the test file as a base64 blob (doc 2 / standard S2)
so the ratified bytes are durably re-applyable without re-drift.
2. Fixture transport (pinned snapshot)
2.1 Why markdown/MCP transport changes the snapshot bytes
The pinned artifact was byte-correct when written (capture-phase CP-7 rehash PASSED: region sha == 17660443…). Corruption is in the read-back / re-author channel, not the stored identity:
mechanism: MCP delivers KB markdown as text → context → Write. The normalized
region is dense unicode (U+2192 ×65, U+2014 ×42, U+2705 ×19, U+1F4CB/U+1F4DD/
U+26D4, §, ·). At least 3× U+2013 EN-DASH appear where the pinned identity
implies a different equal-width character. Because each substitution swaps one
codepoint for exactly one other, region_length (17522) and marker census
(19/1/1/1) stay invariant while sha256 changes (86d6aea7 ≠ 17660443).
not_the_cause: NOT Unicode normalization (NFC/NFD/NFKC/NFKD all fail to
recover 17660443; region already is_NFC). NOT a source/markers/span drift.
NOT a code defect (dryrun.extract_region is correct; proven on synthetic).
class: lossy same-width codepoint substitution by the text transport channel
for whitespace-/codepoint-significant content (doc 2 root cause).
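A toy reproduction of this failure class, as a sketch only. The source does not identify which character the EN-DASH replaced; U+2014 is used here purely for illustration, and the strings and helper names are hypothetical stand-ins for the real region.

```python
import hashlib
import unicodedata


def fingerprint(region: str) -> dict:
    """Length, one marker count, and sha256 of a region (toy census)."""
    return {
        "length": len(region),                # codepoint count
        "arrows": region.count("\u2192"),     # stand-in for the marker census
        "sha256": hashlib.sha256(region.encode("utf-8")).hexdigest(),
    }


def recovered_by_normalization(received: str, pinned_sha: str) -> list:
    """Unicode normalization forms that map `received` back to the pinned hash."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return [
        f for f in forms
        if hashlib.sha256(
            unicodedata.normalize(f, received).encode("utf-8")
        ).hexdigest() == pinned_sha
    ]


# Toy stand-in for the pinned region; the original character is an ASSUMPTION.
pinned = "A \u2192 B \u2014 ok \u2705\n"
# One-for-one substitution by the transport: U+2014 becomes U+2013.
received = pinned.replace("\u2014", "\u2013")

before, after = fingerprint(pinned), fingerprint(received)
```

Here `before` and `after` agree on length and marker count but differ in sha256, and `recovered_by_normalization(received, before["sha256"])` is empty because neither dash has a canonical or compatibility decomposition. Length and census checks therefore cannot stand in for the hash.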
2.2 Strategy options
A store fixture as base64 blob in KB, decode into repo (gated):
+ ASCII payload immune to ws-normalization & unicode substitution
+ deterministic; region rehash provable == 17660443 BEFORE any test trusts it
+ no network, no live-page dependency, no refimpl re-run
+ matches GPT R2/R3 and doc-2 transport standard exactly
- one-time: a byte-faithful base64 of the pinned artifact must be produced by
a byte-trusted path (the capture environment / a tool that base64-encodes
the on-disk pinned file, not retyped text)
B store fixture compressed+base64 (gzip|b64):
+ smaller payload
- same trust requirement as A + extra codec surface;
marginal benefit (artifact ~21 KB); A simpler/auditable. Not preferred.
C regenerate fixture locally from refimpl.r1, verify canonical hash:
+ reconstructs from first principles
- requires live Nuxt page fetch (network) + refimpl execution; page is a
LIVING doc with prior observed KB-revision drift; "v4.6.3" stable but raw
is render-volatile; reintroduces the exact divergence risk that 17660443
was pinned to freeze. Heaviest, most failure modes. Fallback only.
D do not store full fixture — use a generated/synthetic fixture in tests:
+ zero transport of the dense artifact
- the WHOLE POINT of TestGate/TestManifest is identity over the REAL pinned
region (sha==17660443, the 15/3/42 + Đ44 cascade). Synthetic already
covered by TestFailClosedSynthetic. D would delete real-snapshot coverage.
Rejected for the identity suite (acceptable only as an explicitly-labelled
interim, see doc 4 CI standard).
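For option A, the one-time byte-trusted production step could look like the sketch below. It assumes the pinned artifact is available on disk in the capture environment; `encode_blob`, the 76-column wrap, and the returned field names are illustrative choices, not part of the doc-2 standard.

```python
import base64
import hashlib
from pathlib import Path


def encode_blob(path: Path, width: int = 76) -> dict:
    """Base64-encode a file from its on-disk bytes, with a content address.

    Hypothetical helper. The bytes are read in binary mode and never pass
    through a text decode or a retyped copy.
    """
    raw = path.read_bytes()
    b64 = base64.b64encode(raw).decode("ascii")
    # Wrap to fixed-width ASCII lines so the payload survives text transport.
    wrapped = "\n".join(b64[i:i + width] for i in range(0, len(b64), width))
    return {
        "sha256": hashlib.sha256(raw).hexdigest(),  # whole-file hash
        "size": len(raw),
        "b64": wrapped,
    }
```

The whole-file sha256 recorded here is a different quantity from the region sha (17660443…); it addresses the blob itself, while the region rehash remains the gate that the tests rely on.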
2.3 Recommendation — FIXTURE STRATEGY RULING
choose: A — base64 content-addressed blob in KB, gated decode into the repo
test path, with mandatory post-decode region rehash == 17660443…
(len 17522, markers 19/1/1/1) else STOP_AND_REPORT.
tradeoff_accepted: requires one byte-trusted base64 production of the pinned
on-disk artifact (NOT retyped) — a one-time, auditable, deterministic
cost that permanently removes the recurring transport blocker. C kept
as documented fallback ONLY if a byte-trusted base64 cannot be produced.
fixture_repo_path (when later applied): the test resolves
ART = Path(__file__).resolve().parents[1] / "constitution-normalized-17660443e0f23e99.md"
⇒ either the byte-identity fixture lands at REPO ROOT (not tests/fixtures/), OR
the test file's ART path is changed to match and that change is included in
the ratified bytes (which changes the hash to ratify). Flag for GPT: decide
root-path vs tests/fixtures/ + matching ART (coupled to §1.3).
current_nonidentity_fixture: quarantined at tests/fixtures/ (region 86d6aea7);
NOT wired, root path absent ⇒ no false PASS. Leave as-is until ruling.
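The gated decode recommended in §2.3 can be sketched as follows. Several things here are assumptions: `extract_region` is passed in as a callable because the real signature of dryrun.extract_region is not shown in this document; the region hash is taken over the UTF-8 encoding of the region; region length is counted in codepoints; and the marker census check is omitted for brevity. `StopAndReport` and `gated_decode` are hypothetical names.

```python
import base64
import hashlib
from pathlib import Path
from typing import Callable


class StopAndReport(Exception):
    """Raised when the decoded fixture fails its identity gate."""


def gated_decode(
    b64_text: str,
    dest: Path,
    extract_region: Callable[[str], str],  # hypothetical stand-in
    want_sha: str,                          # full digest or a leading prefix
    want_len: int,
) -> str:
    """Decode a base64 blob and write it only if the region rehash matches."""
    # Drop line wraps, then decode strictly so stray characters are an error.
    raw = base64.b64decode("".join(b64_text.split()), validate=True)
    region = extract_region(raw.decode("utf-8"))
    got_sha = hashlib.sha256(region.encode("utf-8")).hexdigest()
    if not got_sha.startswith(want_sha) or len(region) != want_len:
        raise StopAndReport(
            f"region sha {got_sha[:8]} len {len(region)}; "
            f"expected {want_sha[:8]} len {want_len}"
        )
    # Nothing touches the repo path until the gate has passed.
    dest.write_bytes(raw)
    return got_sha
```

Writing only after verification means a failed gate leaves no file at the destination, so a corrupted blob cannot produce a false PASS by being present on disk.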
doc 3 of 5. Self-advance PROHIBITED.