KB-3863

FIX7 P0 Shaped-Clone CI-Gate — ci-seal-vs-bytes-gate-design.md

Revision 1

FIX7 P0 - CI seal-vs-bytes Gate Design (DESIGN ONLY, 2026-06-12)

Status: DESIGN ONLY. No CI was triggered. No production was wired. No secrets were touched. This document plus the reference checker ci_seal_vs_bytes_gate.py specify the gate that closes the design portion of blocker FIX7-P0-PROD-CI-SCOPE-1. Adoption and wiring are owner/operator decisions; any wiring is to be done in a throwaway non-production branch first.

Blocker addressed (design portion only; blocker stays OPEN): FIX7-P0-PROD-CI-SCOPE-1

  • The FIX7 seal-vs-bytes CI gate was previously classified UNRESOLVABLE_READ_ONLY_BECAUSE_NOT_YET_DESIGNED. It is now designed.

1. Problem this gate solves

The FIX7 engineering identity is anchored by a byte-exact P7 pin: the canonicalizer rev3 body must hash to 49c386a9..b734d0 and be exactly 38756 UTF-8 bytes. The operative blueprint deliberately does NOT modify that file, because any edit (even an in-document marker) would change its hash and break the pin. Today nothing in CI re-verifies that the on-disk bytes still match the pin, so silent drift (an editor reflow, a JSON re-serialization, an em-dash substitution) could cause the deployed bytes to diverge from the sealed identity without any gate failing. The earlier ensure_ascii=False vs True em-dash byte issue (seen when reconciling local KB JSON bytes) is exactly this failure mode.

2. What must be sealed (the seal set)

| # | sealed artifact | path (governed) | expected sha256 | expected bytes |
|---|---|---|---|---|
| 1 | canonicalizer rev3 body (P7-pinned) | knowledge/dev/reports/architecture/t1-fix7-existing-system-refactor-execution-blueprint-2026-06-08/canonicalizer-fix7-canon-v1-ssot.md | 49c386a9b9666c09786fc4f89bc79776b6046eaee6f4da6d8537d2c753b734d0 | 38756 |
| 2 | hardened rollback validator | hardened_dryrun_validator.py (this packet / its canonical home) | e6547e6935cb01aae5feb405899c97107f1990ff3e2f7e6b9157828a90956c47 | (length of that file) |
| 3 | operative birth blueprint (identity binder) | knowledge/dev/reports/architecture/fix7-p0-operative-birth-blueprint-2026-06-12.md | c1e23a30..84e9 (record on adoption) | (record on adoption) |

The seal set is a small, explicit allow-list. Adding/removing a sealed artifact is itself a reviewed change to the committed seal manifest.

3. Where the expected hash lives

A committed file seal-manifest.sha256 in the repo, one record per line:

<sha256>  <byte_len>  <relpath>

The file is plain ASCII, with # introducing comments. It is the single source of truth for the gate. Changing it requires a reviewed commit - that review IS the human approval for any intended identity change.
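A minimal sketch of how such a manifest could be parsed, not the reference checker's actual code. The names SealRecord and parse_manifest are hypothetical, and the sketch assumes details the format line above does not settle: comments occupy whole lines, the hash is lowercase hex, and fields are separated by runs of whitespace.

```python
import re
from typing import List, NamedTuple


class SealRecord(NamedTuple):
    """One manifest line: <sha256>  <byte_len>  <relpath> (hypothetical type)."""
    sha256: str
    byte_len: int
    relpath: str


# Assumption: 64 lowercase hex chars, a decimal length, then the path.
_RECORD = re.compile(r"^([0-9a-f]{64})\s+(\d+)\s+(\S.*)$")


def parse_manifest(data: bytes) -> List[SealRecord]:
    """Parse manifest bytes, raising ValueError on anything unexpected."""
    # Strict ASCII decode: a non-ASCII byte raises UnicodeDecodeError,
    # which is a subclass of ValueError.
    text = data.decode("ascii")
    records = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or whole-line comment
        match = _RECORD.match(line)
        if match is None:
            raise ValueError(f"malformed manifest line {lineno}: {raw!r}")
        records.append(
            SealRecord(match.group(1), int(match.group(2)), match.group(3))
        )
    if not records:
        # A manifest with no records must never lead to a pass.
        raise ValueError("manifest contains no seal records")
    return records
```

Raising on a malformed or empty manifest, rather than returning an empty list, keeps the caller from treating "nothing to check" as success.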

4. Table C - gate design

| gate | source bytes | sealed bytes | check command | fail-closed behavior |
|---|---|---|---|---|
| seal:canonicalizer | raw on-disk bytes of canonicalizer rev3 | manifest record #1 (49c386a9.., 38756) | `python3 ci_seal_vs_bytes_gate.py --manifest seal-manifest.sha256 --root <repo>` | exit 1 -> block merge/deploy if sha OR byte length differ |
| seal:validator | raw on-disk bytes of hardened validator | manifest record #2 (e6547e69..) | same command, same manifest | exit 1 -> block if validator bytes drift |
| seal:operative-blueprint | raw on-disk bytes of operative blueprint | manifest record #3 | same command | exit 1 -> block if identity binder drifts |

One command verifies the whole seal set; any single mismatch fails the whole job.

5. How the gate verifies (the rules that prevent drift)

  1. Read in binary mode. The gate calls open(path, "rb") and hashes the raw bytes. It never decodes, re-encodes, or re-serializes the content before hashing.
  2. Compare sha256 AND exact byte length. Length is a cheap second axis: it catches truncation/padding even in the cryptographically negligible case where the hash does not differ, and it makes failures legible.
  3. Equal-logical-value is NOT equal-bytes. A JSON file re-dumped with ensure_ascii flipped, an em-dash (U+2014) where an ASCII hyphen was, a BOM, or CRLF endings all change the bytes and MUST fail. The reference checker's selftest proves each of these fails closed.
  4. Missing file fails. A sealed path that does not exist is a failure, not a skip.
  5. Empty/malformed manifest fails. An empty or malformed manifest is itself a fail-closed condition (the gate must never "pass" because it checked nothing).
  6. No normalization knob. The gate exposes no --normalize/--ignore-whitespace option. Normalization is the hole; it is intentionally absent.
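The rules above can be sketched as a single function. This is an illustration under stated assumptions, not the contents of ci_seal_vs_bytes_gate.py: the name verify_seals, the tuple shape of a record, and the failure message wording are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Tuple


def verify_seals(root: Path, records: Iterable[Tuple[str, int, str]]) -> List[str]:
    """Return a list of failure messages; an empty list means every seal held.

    Each record is (expected_sha256_hex, expected_byte_len, relpath).
    """
    failures = []
    checked = 0
    for expected_sha, expected_len, relpath in records:
        checked += 1
        path = root / relpath
        if not path.is_file():
            # Rule 4: a missing sealed path is a failure, not a skip.
            failures.append(f"MISSING {relpath}")
            continue
        # Rule 1: raw bytes only, no decode/re-encode before hashing.
        data = path.read_bytes()
        actual_sha = hashlib.sha256(data).hexdigest()
        # Rule 2: compare both axes and report each mismatch separately.
        if len(data) != expected_len:
            failures.append(
                f"LENGTH {relpath}: expected {expected_len}, got {len(data)}"
            )
        if actual_sha != expected_sha:
            failures.append(
                f"SHA256 {relpath}: expected {expected_sha}, got {actual_sha}"
            )
    if checked == 0:
        # Rule 5: checking nothing must never count as a pass.
        failures.append("EMPTY manifest: nothing was checked")
    return failures
```

A caller would typically print the failures and exit with status 1 if the list is non-empty. The function takes no normalization parameter, in line with rule 6.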

6. Handling the Unicode / em-dash byte issue specifically

  • Sealed text artifacts are stored and compared as their exact UTF-8 bytes. If a file legitimately contains non-ASCII (em-dash, etc.), the sealed sha is computed over those exact bytes once, at seal time, and never re-derived from a re-serialization.
  • For JSON specifically: seal the on-disk file bytes, never the output of a fresh json.dumps call. The gate does not parse-then-reserialize JSON. This makes the ensure_ascii=False-vs-True distinction impossible to paper over.
  • Optional hardening (recommended at adoption): a pre-seal lint that flags non-ASCII bytes and BOM in files expected to be ASCII, so a stray em-dash is caught at authoring time, not only at gate time.
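To make the ensure_ascii point concrete, the sketch below serializes one logical value both ways and shows that the byte streams differ, then outlines what a pre-seal lint might look like. The helper names json_bytes and lint_expected_ascii are hypothetical; the lint is only one possible shape for the optional hardening and is not part of the packet.

```python
import json

EM_DASH = "\u2014"
UTF8_BOM = b"\xef\xbb\xbf"


def json_bytes(value, ensure_ascii: bool) -> bytes:
    """Serialize a value and return the UTF-8 bytes that would land on disk."""
    return json.dumps(value, ensure_ascii=ensure_ascii).encode("utf-8")


def lint_expected_ascii(data: bytes) -> list:
    """Flag a leading BOM and the first non-ASCII byte after it, if any."""
    findings = []
    start = 0
    if data.startswith(UTF8_BOM):
        findings.append("UTF-8 BOM at offset 0")
        start = len(UTF8_BOM)  # do not report the BOM bytes a second time
    for offset in range(start, len(data)):
        if data[offset] > 0x7F:
            findings.append(
                f"non-ASCII byte 0x{data[offset]:02x} at offset {offset}"
            )
            break  # the first hit is enough to fail the lint
    return findings


doc = {"title": "seal " + EM_DASH + " bytes"}
escaped = json_bytes(doc, ensure_ascii=True)   # em-dash as six ASCII bytes: \u2014
literal = json_bytes(doc, ensure_ascii=False)  # em-dash as three UTF-8 bytes

# Same logical value, different bytes, therefore different sha256.
same_value = json.loads(escaped) == json.loads(literal)
same_bytes = escaped == literal
```

Here same_value is True while same_bytes is False, which is why the seal has to be taken over the file on disk rather than over any re-serialized form.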

7. What failure blocks

The gate is a CI job that exits non-zero on any mismatch. Wired as a required status check, a non-zero exit blocks merge and blocks deploy. It never prints a PASS token on mismatch. It performs no mutation and contacts no production system - it only reads bytes and compares hashes.

8. Reference implementation and proof

ci_seal_vs_bytes_gate.py (this packet) implements the above. Its --selftest constructs a sealed file and proves the gate fails closed on: edited content, em-dash/unicode drift, ensure_ascii JSON re-encode drift, BOM prefix, CRLF drift, and missing file - and passes only on byte-identical content. Result is recorded in ci-seal-vs-bytes-gate-selftest-result.json.
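The drift cases listed above can be illustrated with an in-memory sketch. It is not the packet's --selftest: the names seal, matches_seal, DRIFTS, and run_selftest are hypothetical, the sealed sample is invented for the example, and the missing-file case is left out because the sketch works on byte strings rather than on files.

```python
import hashlib
import json

# Invented sample: JSON containing a literal em-dash, written with ensure_ascii=False.
SEALED = json.dumps({"note": "a \u2014 b"}, ensure_ascii=False).encode("utf-8") + b"\n"


def seal(data: bytes):
    """Compute the (sha256 hex, byte length) pair recorded at seal time."""
    return hashlib.sha256(data).hexdigest(), len(data)


def matches_seal(data: bytes, expected_sha: str, expected_len: int) -> bool:
    """True only when both the length and the sha256 match the seal."""
    return (
        len(data) == expected_len
        and hashlib.sha256(data).hexdigest() == expected_sha
    )


# Each entry is a mutation that preserves or nearly preserves the logical
# content but changes the bytes.
DRIFTS = {
    "edited content": SEALED.replace(b"note", b"Note"),
    "em-dash swapped for hyphen": SEALED.replace("\u2014".encode("utf-8"), b"-"),
    "ensure_ascii re-encode": json.dumps(
        json.loads(SEALED), ensure_ascii=True
    ).encode("utf-8") + b"\n",
    "BOM prefix": b"\xef\xbb\xbf" + SEALED,
    "CRLF endings": SEALED.replace(b"\n", b"\r\n"),
}


def run_selftest() -> bool:
    """Pass only if identical bytes match and every drifted variant is rejected."""
    sha, length = seal(SEALED)
    if not matches_seal(bytes(SEALED), sha, length):
        return False
    return all(not matches_seal(drifted, sha, length) for drifted in DRIFTS.values())
```

If any variant in DRIFTS were accepted, run_selftest would return False, which mirrors the fail-closed expectation stated for the reference checker.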

9. Boundary

This is a design + reference checker only. It does not wire CI, does not run in any production pipeline, and does not by itself resolve FIX7-P0-PROD-CI-SCOPE-1, which stays OPEN pending the owner's decision to adopt and the operator's off-production wiring + review.

knowledge/dev/reports/architecture/fix7-p0-production-shaped-clone-rehearsal-ci-gate-packet-2026-06-12/ci-seal-vs-bytes-gate-design.md