KB-6014

dot-iu-cutter v0.5 — Constitution Snapshot Capture Procedure Draft + Controlled-Execution Outline + Rollback/Compensation (NOT executed)

Revision 1
Tags: dot-iu-cutter, v0.5, constitution-fixture, source-snapshot-capture, capture-procedure, command-review-outline, rollback-compensation, design-only, no-execution, dieu44, 2026-05-18

Phase: v0_5_constitution_source_snapshot_capture_authoring · Date: 2026-05-18 · doc 3 of 5

nature: PROCEDURE DRAFT — NOT executed. No fetch-for-capture, no write, no seed.
rehash_before_seed/dryrun/cut: REQUIRED (GPT Q6) ; checksum_persisted: NONE
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED

1. Read-only capture procedure (authored; execution gated to a later phase)

CP-1 fetch_live (read-only):
  GET https://vps.incomexsaigoncorp.vn/knowledge/dev/laws/constitution  x3
  assert HTTP 200 ; record raw bytes + raw sha256 (FORENSIC ONLY) per fetch
  STOP (SC6) if any non-200 / unreachable
CP-2 extract+normalize (parser_profile nuxt-incomex-portal-constitution-v1):
  first server-rendered <article>; drop <script>/<style> subtrees;
  N1 HTML-unescape; N2 NFC; N3 CRLF/CR->LF; N4 collapse [ \t\f\v]+ -> space;
  N5 trim line ends; N6 collapse blank-line runs;
  authoritative_span = candidate_B (H1 "HIẾN PHÁP" -> end CHANGELOG, EXCL
  trailing "Back to Knowledge Hub" backlink); CHANGELOG INCLUDED (R-CL1)
CP-3 measure:
  normalized_content_checksum = sha256(normalized content bytes)
  normalized_content_length   = codepoint count
  marker_counts               = codepoint-exact ✅/📋/📝/⛔
  raw drift across the 3 fetches is EXPECTED & FORENSIC-ONLY (B6 CLOSED);
  normalized result MUST be identical across the 3 fetches else STOP (SC re-fetch)
CP-4 derive target path (no write):
  checksum_prefix = first16(normalized_content_checksum)
  target = knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/
           constitution-normalized-<checksum_prefix>.md
CP-5 collision precheck (read-only):
  path absent                    -> proceed to CP-6
  path present & same full cksum -> IDEMPOTENT: reuse; skip CP-6; go CP-7
  path present & different cksum -> STOP_AND_ESCALATE (SC1) — never overwrite
CP-6 write_artifact  [GATED — later approved execution ONLY; NOT now]:
  write the doc-2 format (metadata header + delimited normalized-content region)
  write-once; if the write tool would overwrite an existing path -> abort (SC1)
CP-7 rehash gate (immediately after write / on reuse):
  re-read the artifact; extract ONLY the bytes between the BEGIN/END sentinels;
  recompute sha256 + length + marker counts;
  assert recomputed_checksum == metadata.normalized_content_checksum
         == CP-3 computed checksum
  assert first16(recomputed_checksum) == filename checksum_prefix
  assert recomputed length & marker_counts == metadata
  mismatch -> BLOCKED (SC2) ; artifact untrustworthy ; do NOT proceed to seed
CP-8 report: PASS (artifact pinned, identity = checksum) | BLOCKED (reason)
  NO seed in the capture procedure. STOP -> route to seed command-review.
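
The normalize, measure, and derive-path steps (CP-2 N1..N6, CP-3, CP-4) can be sketched roughly as below. This is a minimal illustration, not the authored tool: it assumes the `<article>` extraction and span selection have already produced a text string, that "trim line ends" means stripping trailing spaces, that blank-line runs collapse to a single blank line, and that the checksum is taken over UTF-8 bytes. The function names are hypothetical.

```python
import hashlib
import html
import re
import unicodedata

MARKERS = ("✅", "📋", "📝", "⛔")
SNAPSHOT_DIR = "knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/"


def normalize(text: str) -> str:
    """Apply N1..N6 in order (the N5/N6 details are assumptions, see lead-in)."""
    text = html.unescape(text)                               # N1 HTML-unescape
    text = unicodedata.normalize("NFC", text)                # N2 NFC
    text = text.replace("\r\n", "\n").replace("\r", "\n")    # N3 CRLF/CR -> LF
    text = re.sub(r"[ \t\f\v]+", " ", text)                  # N4 collapse whitespace runs
    text = "\n".join(line.rstrip(" ") for line in text.split("\n"))  # N5 trim line ends
    return re.sub(r"\n{3,}", "\n\n", text)                   # N6 collapse blank-line runs


def measure(normalized: str) -> dict:
    """CP-3: checksum over UTF-8 bytes (assumed), length in code points, marker counts."""
    return {
        "normalized_content_checksum": hashlib.sha256(normalized.encode("utf-8")).hexdigest(),
        "normalized_content_length": len(normalized),  # len() of a str counts code points
        "marker_counts": {m: normalized.count(m) for m in MARKERS},
    }


def target_path(checksum: str) -> str:
    """CP-4: derive the checksum-addressed path; nothing is written here."""
    return f"{SNAPSHOT_DIR}constitution-normalized-{checksum[:16]}.md"


def stable_across_fetches(raw_texts: list) -> bool:
    """CP-3 stop condition: raw bytes may drift, normalized checksums must not."""
    checksums = {measure(normalize(t))["normalized_content_checksum"] for t in raw_texts}
    return len(checksums) == 1
```

Because the path is derived purely from the normalized checksum, two captures of identical normalized content always map to the same path, which is what makes the CP-5 idempotent-reuse branch possible.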

Not executed in this phase. CP-1..CP-8 are authored for a later, separately authorized capture execution. This phase performs none of them.
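
For review purposes, the CP-5 collision decision and the CP-7 rehash gate can be sketched as follows. The sentinel strings, the metadata dictionary shape, and the function names are hypothetical stand-ins (the real artifact format is defined in doc 2 and is not reproduced here); the sketch also works on decoded text rather than raw bytes and assumes UTF-8.

```python
import hashlib
from typing import Optional

MARKERS = ("✅", "📋", "📝", "⛔")
BEGIN = "<<<BEGIN-NORMALIZED-CONTENT>>>\n"  # hypothetical sentinel, not the doc-2 value
END = "\n<<<END-NORMALIZED-CONTENT>>>"      # hypothetical sentinel, not the doc-2 value


def collision_precheck(existing_checksum: Optional[str], new_checksum: str) -> str:
    """CP-5: existing_checksum is None when the target path is absent."""
    if existing_checksum is None:
        return "PROCEED_CP6"
    if existing_checksum == new_checksum:
        return "IDEMPOTENT_GO_CP7"
    return "STOP_AND_ESCALATE_SC1"  # never overwrite


def extract_region(artifact: str) -> str:
    """Return only the text between the sentinels; ValueError if either is missing."""
    start = artifact.index(BEGIN) + len(BEGIN)
    end = artifact.rindex(END, start)
    return artifact[start:end]


def rehash_gate(artifact: str, metadata: dict, filename: str, cp3_checksum: str) -> str:
    """CP-7: recompute everything from the stored region and compare."""
    try:
        region = extract_region(artifact)
    except ValueError:
        return "BLOCKED (SC2): sentinels missing"
    checksum = hashlib.sha256(region.encode("utf-8")).hexdigest()
    checks = {
        "metadata checksum": checksum == metadata["normalized_content_checksum"],
        "CP-3 checksum": checksum == cp3_checksum,
        "filename prefix": filename == f"constitution-normalized-{checksum[:16]}.md",
        "length": len(region) == metadata["normalized_content_length"],
        "marker_counts": {m: region.count(m) for m in MARKERS} == metadata["marker_counts"],
    }
    failed = [name for name, ok in checks.items() if not ok]
    return "PASS" if not failed else "BLOCKED (SC2): " + ", ".join(failed)
```

The gate deliberately trusts nothing but the region it re-reads: metadata, filename, and the earlier CP-3 value are all treated as claims to be checked against the recomputed checksum.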

2. Future controlled-execution outline (command-review package; not run)

Recommended: two separately gated executions, so that no seed DML is ever authored against an unknown checksum:

E1 capture+freeze (gated execution #1):
  C-01 read-only PRE: system_identifier match; cutter_governance + registry
       tables exist; source_family internal_incomex_constitution + grammar_profile
       incomex-architecture-constitution-v4 present; ZERO source_document/version
       rows for incomex-constitution (else SC5 STOP); NO snapshot path collision
  C-02 run CP-1..CP-3 (read-only capture+measure)
  C-03 run CP-4..CP-6 (derive path, collision precheck, write-once artifact)
  C-04 run CP-7 rehash gate ; BLOCKED on mismatch
  C-05 report; STOP -> the now-known snapshot checksum becomes the input to
       the seed DML authoring/command-review
E2 seed (gated execution #2, AFTER GPT command-review of the seed DML that is
   now bound to the known snapshot checksum):
  C-06 read-only PRE re-confirm (rows still 0/0; artifact still present;
       rehash CP-7 PASS again — REQUIRED before seed)
  C-07 execute atomic source_document + source_document_version seed
       (one BEGIN/COMMIT, INSERT-only, content_checksum := snapshot checksum)
  C-08 POST/NEG/CK verification (doc 4) ; on any FAIL -> rollback (§3)
gate_discipline: E1 and E2 each require explicit GPT/User authorization;
  no self-advance; report every phase.
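
The shape of the E2 seed (C-06 precondition, then one atomic INSERT-only transaction in C-07) might look like the sketch below. It uses an in-memory SQLite database purely as a stand-in for the real store, and the column names (`doc_key`, `source_document_id`) and the two-table schema are assumptions for illustration; only the table names and `content_checksum` come from this outline. It is not the seed DML under command-review.

```python
import sqlite3

# Hypothetical minimal schema; the real tables carry more columns.
SCHEMA = """
CREATE TABLE source_document (
    id INTEGER PRIMARY KEY,
    doc_key TEXT NOT NULL UNIQUE
);
CREATE TABLE source_document_version (
    id INTEGER PRIMARY KEY,
    source_document_id INTEGER NOT NULL REFERENCES source_document(id),
    content_checksum TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    # isolation_level=None leaves BEGIN/COMMIT/ROLLBACK fully under our control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def row_counts(conn: sqlite3.Connection, doc_key: str) -> tuple:
    docs = conn.execute(
        "SELECT COUNT(*) FROM source_document WHERE doc_key = ?", (doc_key,)
    ).fetchone()[0]
    versions = conn.execute(
        "SELECT COUNT(*) FROM source_document_version v "
        "JOIN source_document d ON d.id = v.source_document_id WHERE d.doc_key = ?",
        (doc_key,),
    ).fetchone()[0]
    return docs, versions


def seed(conn: sqlite3.Connection, doc_key: str, snapshot_checksum, rehash_passed: bool) -> None:
    if not rehash_passed:
        raise RuntimeError("BLOCKED (SC2): rehash gate must PASS before seed")
    if row_counts(conn, doc_key) != (0, 0):
        raise RuntimeError("STOP (SC5): seed rows already present")
    conn.execute("BEGIN")
    try:
        cur = conn.execute("INSERT INTO source_document (doc_key) VALUES (?)", (doc_key,))
        conn.execute(
            "INSERT INTO source_document_version (source_document_id, content_checksum) "
            "VALUES (?, ?)",
            (cur.lastrowid, snapshot_checksum),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # both rows land together or neither does
        raise
```

Binding `content_checksum` to a value that already passed the rehash gate is the reason E1 and E2 are separate gates: the seed statement never has to guess what it is pinning.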

3. Rollback / compensation design

R1 capture_fails_before_write (CP-1..CP-5 fail):
   nothing written -> NO rollback required ; report BLOCKED + reason
R2 artifact_written_but_seed_not_executed:
   artifact is a valid WRITE-ONCE orphan candidate; it is harmless & immutable.
   do NOT delete, do NOT overwrite. If it must be marked, add a SEPARATE KB
   metadata-note doc (e.g. status: orphan_candidate / retired_note) ONLY if KB
   supports a side note; the artifact body is NEVER mutated. Physical delete
   ONLY on explicit GPT/User approval.
R3 rehash_gate_fails (SC2):
   treat the just-written artifact as quarantined-untrusted; do NOT seed;
   do NOT overwrite it; escalate. A corrected capture writes a NEW
   checksum-addressed path (different content -> different name), never the same.
R4 seed_later_fails (in E2):
   the source-seed rollback is the SEPARATE child-first, no-CASCADE DELETE path
   inherited from the seed-authoring-restart package; it touches ONLY the seed
   rows. The snapshot artifact is independent and is left untouched.
R5 invariant: NEVER overwrite or delete a snapshot artifact as part of any
   rollback/compensation. Compensation = metadata note, not content mutation.
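
The R2/R5 rule (compensate with a side note, never touch the artifact) can be illustrated on a local filesystem as below. This is only a sketch under stated assumptions: the KB write tool is not a filesystem, the `.note.md` suffix and note fields are hypothetical, and the sample filename prefix is a placeholder. Python's exclusive-create mode `"x"` is used to model write-once behavior.

```python
import hashlib
import os
import tempfile


def write_once(path: str, data: bytes) -> None:
    # Mode "xb" raises FileExistsError if the path exists, so nothing is overwritten.
    with open(path, "xb") as handle:
        handle.write(data)


def file_sha256(path: str) -> str:
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def add_note(artifact_path: str, status: str, reason: str) -> str:
    """Write a SEPARATE note next to the artifact; verify the artifact is unchanged."""
    before = file_sha256(artifact_path)
    note_path = artifact_path + ".note.md"  # hypothetical naming convention
    body = f"status: {status}\nartifact: {os.path.basename(artifact_path)}\nreason: {reason}\n"
    write_once(note_path, body.encode("utf-8"))
    if file_sha256(artifact_path) != before:
        raise RuntimeError("artifact changed during compensation")
    return note_path


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as root:
        # Placeholder checksum prefix, for illustration only.
        artifact = os.path.join(root, "constitution-normalized-0123456789abcdef.md")
        write_once(artifact, b"normalized content")
        note = add_note(artifact, "orphan_candidate", "seed not executed")
        try:
            write_once(artifact, b"other content")
            overwritten = True
        except FileExistsError:
            overwritten = False
        with open(artifact, "rb") as handle:
            return overwritten, handle.read(), os.path.basename(note)
```

In this sketch a second write to the same path fails outright, which mirrors the SC1 abort in CP-6, and the note carries all status changes so the artifact body stays byte-identical.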

4. Statement

  • This doc authors the exact read-only capture procedure CP-1..CP-8, with the rehash gate REQUIRED before any seed/dry-run/cut (QG3); recommends a two-gate controlled-execution outline (QG6); and designs rollback/compensation that never overwrites or deletes the artifact. Direction: Option B (QG1). Nothing was fetched-for-capture, written, seeded, or mutated (QG5).
  • doc 3 of 5; STOP after 5 files → route GPT/User. Self-advance PROHIBITED.

Companions: operational-framing, artifact-spec, seed-strategy-and-verification-plan, capture-authoring-report.

knowledge/dev/laws/dieu44-trien-khai/v0.5-constitution-source-snapshot-capture-authoring/dot-iu-cutter-v0.5-constitution-source-snapshot-capture-procedure-draft-2026-05-18.md