dot-iu-cutter v0.5 — Constitution Nuxt Parser/Checksum: Operations-First Framing ("Cắt Hiến pháp" drift safety)
Phase: v0_5_constitution_nuxt_parser_checksum_ratification · Nature: read_only_grounding_plus_design_authoring__no_seed_no_dryrun · Date: 2026-05-18 · doc 1 of 5
design_order: operations_first (per current-operating-objectives-and-principles 2026-05-18 §3)
dml: none ; source_seed: none ; dry_run: none ; cut: none ; verify: none
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED
This package opens the focused phase GPT required in the source-document-seed-authoring review: prove deterministic capture/checksum of the Nuxt-rendered Constitution so B6 can be ruled. The operator workflow and exception model are stated first; the parser/checksum rules are derived from them in docs 3–4 rather than stated up front.
1. The single operational question this phase must answer
When the user says "Cắt Hiến pháp" ("cut the Constitution"), can the system reliably know whether the
source content changed since the registered version, without a human guessing?
content_checksum on source_document_version_registry is NOT NULL and is the only persisted version identity (no raw_checksum / parser_profile_ref columns live — MISMATCH-5). So the operator's safety depends entirely on whether that one value is deterministic and drift-sensitive. This is not a technical nicety; it is the drift-detection guarantee.
2. What the checksum/parser must guarantee — in operator terms
G1 same_content_same_id:
two fetches of the SAME authoritative content -> SAME content_checksum
(even if Nuxt re-renders the page, cache-control: no-cache, bytes churn)
G2 changed_content_new_id:
any real change to Constitution content -> DIFFERENT content_checksum
-> a NEW source_document_version row (visible, gated supersession) — never a silent re-cut
G3 render_noise_ignored:
Nuxt chrome/hydration/script/breadcrumb/footer volatility MUST NOT change content_checksum
(otherwise every "Cắt Hiến pháp" would false-alarm as drift)
G4 markers_preserved_exactly:
✅ 📋 📝 ⛔ survive normalization codepoint-exact — status classification (enacted_only)
must never be corrupted by the parser
G5 reproducible_offline:
the checksum is recomputable by any future run from a single controlled GET +
the ratified profile, with no human judgement
If G1–G5 hold, Cắt Hiến pháp can be a one-command, exception-only operation. If they do not, the system must STOP rather than cut against an unpinned/guessed version (operational hazard E2 from the seed operational-framing doc).
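A minimal sketch of how G1–G5 could translate into code, under stated assumptions. The ratified profile is defined in docs 3–4, not here: the skipped tag set, the use of Unicode NFC, the whitespace collapsing, SHA-256, and the names `normalize` and `content_checksum` are all hypothetical choices made for illustration only.

```python
import hashlib
import re
import unicodedata
from html.parser import HTMLParser

# Status markers that must survive normalization codepoint-exact (G4).
MARKERS = ("✅", "📋", "📝", "⛔")
# ASSUMPTION: tags treated as render noise (G3); the real list belongs to the ratified profile.
SKIP_TAGS = {"script", "style", "noscript", "nav", "footer"}


class _TextExtractor(HTMLParser):
    """Collects text outside SKIP_TAGS; everything inside them is dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def normalize(html: str) -> str:
    """Strip chrome, apply NFC, collapse whitespace; fail closed if a marker is lost."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    raw = " ".join(parser.chunks)
    text = re.sub(r"\s+", " ", unicodedata.normalize("NFC", raw)).strip()
    for marker in MARKERS:
        if raw.count(marker) != text.count(marker):
            raise ValueError(f"marker {marker!r} altered by normalization")
    return text


def content_checksum(html: str) -> str:
    """SHA-256 over the UTF-8 bytes of the normalized text (hash choice is an assumption)."""
    return hashlib.sha256(normalize(html).encode("utf-8")).hexdigest()
```

In this sketch, two renders that differ only in script payload, navigation, footer, or whitespace yield the same value (G1, G3), while a change to the article text yields a different one (G2). The function takes only the fetched HTML as input, which is what G5 asks for.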
3. PASS / FAIL / BLOCKED behaviour for a future "Cắt Hiến pháp"
PASS (normal, no exception):
re-fetch -> strip Nuxt chrome -> normalize -> content_checksum == registered
-> proceed to gated enacted_only cut -> "PASS: N IU cut, M deferred, 0 errors. Report: <KB link>"
human_attention: none
FAIL_DRIFT (content changed — expected over a living document):
recomputed content_checksum != registered checksum
-> system does NOT cut the old version; proposes a NEW source_document_version row
(supersedes chain) and routes to review
operator sees: "BLOCKED: Hiến pháp content changed since registered version
v<old>; new version proposed, cut withheld pending review."
BLOCKED_NONDETERMINISTIC (the current B6 condition):
the ratified normalization profile cannot produce a repeatable checksum
(no stable authoritative span / parser not ratified)
-> system MUST NOT cut and MUST NOT fabricate a checksum
operator sees: "BLOCKED: cannot pin Hiến pháp version deterministically —
parser/checksum profile not ratified. No cut performed."
The defining rule: a non-deterministic or guessed checksum is treated as BLOCKED, never as PASS. Fail-closed.
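The three outcomes and the fail-closed rule can be sketched as a small decision function. This is an illustration under assumptions, not the ratified logic: `classify`, the `Outcome` enum, the `profile_ratified` flag, and the idea of passing several independent recomputations to test repeatability are hypothetical names and mechanics.

```python
from enum import Enum
from typing import Optional, Sequence


class Outcome(Enum):
    PASS = "PASS"
    FAIL_DRIFT = "FAIL_DRIFT"
    BLOCKED_NONDETERMINISTIC = "BLOCKED_NONDETERMINISTIC"


def classify(
    registered: str,
    recomputed: Sequence[Optional[str]],
    profile_ratified: bool,
) -> Outcome:
    """Map recomputed checksums to an outcome; anything uncertain is BLOCKED.

    `recomputed` holds one checksum per independent fetch; None means the
    profile could not produce a checksum for that fetch.
    """
    # No ratified profile: never cut, never fabricate a checksum.
    if not profile_ratified:
        return Outcome.BLOCKED_NONDETERMINISTIC
    # No result, a failed computation, or disagreement between fetches
    # means the checksum is not repeatable.
    if not recomputed or any(c is None for c in recomputed):
        return Outcome.BLOCKED_NONDETERMINISTIC
    if len(set(recomputed)) != 1:
        return Outcome.BLOCKED_NONDETERMINISTIC
    # Repeatable checksum: compare against the registered version identity.
    if recomputed[0] == registered:
        return Outcome.PASS
    return Outcome.FAIL_DRIFT
```

The ordering is the point of the sketch: every BLOCKED check runs before the equality check, so a guessed or unstable value can never reach PASS.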
4. What must be reported to the operator if checksum drift is detected
operator_report_on_drift (concise, exception-only):
- one line: "Hiến pháp source changed since registered version"
- registered vs recomputed content_checksum (short form)
- retrieval timestamp + source_url
- the safe default taken: "cut withheld; new version row proposed; routed to review"
- KB link to the full technical diff/provenance (NOT forced inline)
never:
- silently re-cut on changed content
- silently overwrite the registered version
- surface raw Nuxt byte churn as if it were content drift (that is G3 noise, not drift)
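The report fields listed above could be rendered roughly as follows. This is a sketch only: the `DriftEvent` fields, the 12-character short form, and the exact line layout are assumptions for illustration, and the full diff/provenance stays behind the KB link rather than inline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftEvent:
    registered_checksum: str
    recomputed_checksum: str
    retrieved_at: str  # retrieval timestamp, e.g. ISO 8601 (format is an assumption)
    source_url: str
    kb_link: str


def short(checksum: str, length: int = 12) -> str:
    """Short display form of a checksum (length is an assumption)."""
    return checksum[:length]


def render_drift_report(event: DriftEvent) -> str:
    """Build the concise, exception-only operator report for detected drift."""
    if event.registered_checksum == event.recomputed_checksum:
        # Refuse to emit a drift report when there is no drift.
        raise ValueError("checksums match: no drift to report")
    lines = [
        "Hiến pháp source changed since registered version",
        f"registered: {short(event.registered_checksum)}  "
        f"recomputed: {short(event.recomputed_checksum)}",
        f"retrieved: {event.retrieved_at}  source: {event.source_url}",
        "cut withheld; new version row proposed; routed to review",
        f"details: {event.kb_link}",
    ]
    return "\n".join(lines)
```

Raw-byte hashes are deliberately absent from `DriftEvent`, so raw Nuxt churn cannot be surfaced to the operator as drift.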
5. Why raw bytes cannot be the operator's drift signal (grounded this phase)
A read-only triple GET this session (doc 2) showed the raw page is volatile: two fetches were identical, while a third differed by ~860 KB and had a different Content-Length, under cache-control: no-cache + x-powered-by: Nuxt. If the operator's drift signal were the raw hash, any Cắt Hiến pháp could false-fire G2/FAIL_DRIFT on render noise alone. Therefore the operator-facing identity must be the normalized authoritative-content checksum, and the raw hash is forensic-only. Doc 2 also shows the normalized content checksum was stable across all three fetches, which is the mechanism that makes G1/G3 achievable.
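The raw-versus-normalized comparison can be sketched as a repeatability check over several fetched bodies. The function name `repeatability`, the returned keys, and the pluggable `normalize` callable are hypothetical; the real normalization is whatever profile docs 3–4 ratify, and this sketch does not reproduce the doc 2 measurements.

```python
import hashlib
from typing import Callable, Dict, Sequence


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def repeatability(
    bodies: Sequence[bytes],
    normalize: Callable[[str], str],
) -> Dict[str, object]:
    """Count distinct raw and normalized hashes across fetches of one URL.

    The raw hashes are kept for forensics only; determinism is judged
    solely on the normalized hashes.
    """
    raw_hashes = {sha256_hex(body) for body in bodies}
    normalized_hashes = {
        sha256_hex(normalize(body.decode("utf-8")).encode("utf-8"))
        for body in bodies
    }
    return {
        "raw_distinct": len(raw_hashes),
        "normalized_distinct": len(normalized_hashes),
        # Exactly one normalized hash is required; zero fetches is not a pass.
        "deterministic": len(normalized_hashes) == 1,
    }
```

A result with `raw_distinct` greater than 1 and `normalized_distinct` equal to 1 corresponds to the situation described above: the bytes churn, but the content identity holds.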
6. Statement
- Operations-first (QG1): the parser/checksum exists solely to make Cắt Hiến pháp drift-safe (G1–G5); PASS/FAIL_DRIFT/BLOCKED defined; operator drift report defined.
- No DML, no source seed, no dry-run, no cut/verify. doc 1 of 5; STOP after package → route GPT/User. Self-advance PROHIBITED.
Companion: source-grounding-and-repeatability, authoritative-extraction-and-normalization-design, parser-profile-and-ruling-request, ratification-report.