KB-6998
dot-iu-cutter v0.5 — Constitution Snapshot Artifact Spec (checksum-addressed path, write-once, content/metadata format, no-overwrite)
Revision 1
Tags: dot-iu-cutter, v0.5, constitution-fixture, source-snapshot-capture, artifact-spec, checksum-addressed, write-once, design-only, no-execution, dieu44, 2026-05-18
dot-iu-cutter v0.5 — Constitution Snapshot Artifact Spec
Phase: v0_5_constitution_source_snapshot_capture_authoring · Date: 2026-05-18 · doc 2 of 5
authority: GPT Q6 APPROVE_CHECKSUM_ADDRESSED_WRITE_ONCE_CONVENTION_WITH_REHASH_GATE
spec_only: true ; no artifact written ; checksum_persisted_or_updated: NONE
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED
1. Path convention (checksum-addressed)
directory: knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/
filename: constitution-normalized-<checksum_prefix>.md
checksum_prefix: first 16 hex chars of the normalized_content_checksum (sha256)
full_checksum: the complete 64-hex sha256 is stored INSIDE the artifact metadata
example (capture-time value, KB-SSOT from Codex BLOCKED execution log):
normalized_content_checksum = 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
-> path: knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/constitution-normalized-17660443e0f23e99.md
rationale: prefix in filename = human-legible content address; full 64-hex in
metadata = exact identity; an accidental collision between distinct contents
at 16 hex chars (64 bits) is negligible for the small number of snapshots
involved AND is further guarded by the SC1 full-checksum compare.
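The path convention above can be sketched as a small pure function. This is an illustration only, not part of the spec: the names `SNAPSHOT_DIR`, `PREFIX_LEN`, and `snapshot_path` are hypothetical, and the requirement that the digest be lowercase hex is an assumption (the spec only shows a lowercase example).

```python
import re

# Directory and filename pattern are taken from section 1 of this spec.
SNAPSHOT_DIR = "knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/"
PREFIX_LEN = 16  # number of leading hex chars of the sha256 used in the filename


def snapshot_path(full_checksum: str) -> str:
    """Return the checksum-addressed path for a full 64-hex sha256 digest.

    Assumption: digests are lowercase hex; anything else is rejected.
    """
    if not re.fullmatch(r"[0-9a-f]{64}", full_checksum):
        raise ValueError("expected 64 lowercase hex characters")
    prefix = full_checksum[:PREFIX_LEN]
    return f"{SNAPSHOT_DIR}constitution-normalized-{prefix}.md"
```

Applied to the capture-time example value, this yields the path shown above. The function only computes a string; it performs no write.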
2. Write-once / no-overwrite rule
no_overwrite_policy: true
on_capture_target_resolution:
path_absent: -> OK to write (in the later gated phase)
path_present_same_full_checksum: -> IDEMPOTENT: do NOT rewrite; reuse existing
artifact as-is (PASS, already pinned)
path_present_different_content: -> STOP_AND_ESCALATE (SC1). Never overwrite,
never truncate, never append.
immutability_model: by CONVENTION + rehash gate (KB docs are revisioned, not
physically content-addressed). The artifact, once written, is NEVER edited.
Any required status change (e.g. orphan/retired) is a SEPARATE metadata-note
doc, never a mutation of the artifact body (see doc 3 rollback design).
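The three target-resolution outcomes can be sketched as a pure decision function. This is a minimal sketch under stated assumptions: `Resolution` and `resolve_capture_target` are hypothetical names, and `existing_full_checksum` is assumed to be the rehash of the content region of whatever already sits at the target path (`None` when the path is absent). The function decides only; it never writes.

```python
from enum import Enum
from typing import Optional


class Resolution(Enum):
    WRITE = "OK_TO_WRITE"                    # path_absent
    IDEMPOTENT = "IDEMPOTENT_REUSE"          # path_present_same_full_checksum
    STOP_AND_ESCALATE = "STOP_AND_ESCALATE"  # path_present_different_content (SC1)


def resolve_capture_target(
    existing_full_checksum: Optional[str],
    new_full_checksum: str,
) -> Resolution:
    """Map the state of the target path to one of the three outcomes."""
    if existing_full_checksum is None:
        return Resolution.WRITE
    if existing_full_checksum == new_full_checksum:
        # Already pinned: reuse the existing artifact as-is, do not rewrite.
        return Resolution.IDEMPOTENT
    # Same 16-hex prefix but different full checksum: never overwrite.
    return Resolution.STOP_AND_ESCALATE
```

Comparing the full 64-hex value rather than the filename prefix is what lets a prefix collision surface as STOP_AND_ESCALATE instead of a silent reuse.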
3. Artifact content format
The artifact is a single markdown file with a metadata header followed by a strictly delimited normalized-content region. The identity checksum is the sha256 of only the bytes between the delimiters (exactly the parser output), NOT of the whole file. The rehash gate is therefore unambiguous, and its result equals source_document_version.content_checksum.
---
artifact_kind: constitution_normalized_snapshot
source_document_ref: incomex-constitution
source_url: https://vps.incomexsaigoncorp.vn/knowledge/dev/laws/constitution
captured_from: live_nuxt_rendered_page
captured_from_live_url: https://vps.incomexsaigoncorp.vn/knowledge/dev/laws/constitution
captured_at: <UTC ISO-8601, set at gated capture>
parser_profile_ref: nuxt-incomex-portal-constitution-v1
authoritative_span: candidate_B (H1 "HIẾN PHÁP" -> end CHANGELOG, excl backlink)
changelog_included: true
source_title: "HIẾN PHÁP KIẾN TRÚC HỆ THỐNG INCOMEX — v4.6.3 BAN HÀNH"
source_version_label: "v4.6.3 BAN HÀNH" # semantic label, NOT identity
normalized_content_checksum: <sha256 hex 64> # == identity ; == filename prefix source
normalized_content_length: <int, codepoint length>
marker_counts: { enacted: <n>, controlled_draft: <n>, draft: <n>, obsolete: <n> }
raw_fetch_checksum: <sha256 hex | null> # FORENSIC ONLY, never identity
raw_fetch_bytes: <int | null> # forensic only
ad_revision_observed: <int | null> # audit/lineage only (e.g. 44)
secrets: none
---
<<<BEGIN-NORMALIZED-CONTENT-DO-NOT-EDIT
{the exact normalized authoritative content bytes — the parser output}
END-NORMALIZED-CONTENT-DO-NOT-EDIT>>>
field_rules:
- normalized_content_checksum = sha256( bytes strictly between the BEGIN/END
sentinels, sentinel lines themselves EXCLUDED, no trailing newline added )
- normalized_content_length = codepoint count of that same byte region
- marker_counts = codepoint-exact counts (U+2705 ✅, U+1F4CB 📋, U+1F4DD 📝,
U+26D4 ⛔) within that region
- raw_fetch_checksum: include ONLY as forensic; MUST NOT equal / be used as
content_checksum (NEG rule, version-policy)
- NO secrets, tokens, cookies, internal IPs, or auth headers anywhere
- the metadata header is NOT part of the identity hash (only the delimited
content region is)
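The field rules can be sketched as a rehash over the delimited region. This is a sketch, not the spec's parser, and it rests on three assumptions that the spec does not state explicitly: the region is UTF-8; the newline ending the BEGIN line and the newline immediately before the END line are delimiters rather than content; and the marker codepoints map to the `marker_counts` keys in the order both are listed. The name `rehash_region` is hypothetical.

```python
import hashlib

# Assumption: the newline after BEGIN and the one before END are not content.
BEGIN = b"<<<BEGIN-NORMALIZED-CONTENT-DO-NOT-EDIT\n"
END = b"\nEND-NORMALIZED-CONTENT-DO-NOT-EDIT>>>"

# Assumption: codepoints map to keys in the order both lists appear in the spec.
MARKERS = {
    "enacted": "\u2705",
    "controlled_draft": "\U0001F4CB",
    "draft": "\U0001F4DD",
    "obsolete": "\u26D4",
}


def rehash_region(artifact_bytes: bytes) -> dict:
    """Recompute identity fields from the delimited content region only."""
    start = artifact_bytes.index(BEGIN) + len(BEGIN)  # ValueError if missing
    end = artifact_bytes.rindex(END)                  # ValueError if missing
    if end < start:
        raise ValueError("content region is empty or sentinels are malformed")
    region = artifact_bytes[start:end]  # metadata header is never hashed
    text = region.decode("utf-8")       # assumption: region is UTF-8
    return {
        "normalized_content_checksum": hashlib.sha256(region).hexdigest(),
        "normalized_content_length": len(text),  # codepoints, not bytes
        "marker_counts": {name: text.count(cp) for name, cp in MARKERS.items()},
    }
```

Because only the sliced region is hashed, editing the metadata header leaves the recomputed checksum unchanged, which is the property the last field rule relies on.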
capture_time_expected (KB-SSOT, from Codex BLOCKED execution log — to be
RE-CONFIRMED at gated capture, not asserted as final here):
normalized_content_checksum ≈ 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
normalized_content_length ≈ 17522
marker_counts = { enacted:19, controlled_draft:1, draft:1, obsolete:1 }
raw_fetch (forensic) ≈ 8d551f47… / 1205114 bytes
NOTE: the prior identity f9d22d05…/17791 is LOST and unrecoverable, and was
never persisted; this snapshot therefore legitimately pins the CURRENT content
(a new identity, with no prior row to supersede).
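Re-confirming the expected values at gated capture could look like the comparison below. It is a hedged sketch: `EXPECTED` copies the capture-time values listed above, `reconfirm` is a hypothetical helper, and what a mismatch should trigger is not defined in this document. The forensic raw_fetch values are left out because they are never identity.

```python
# Capture-time expectations copied from this spec; to be re-confirmed, not assumed.
EXPECTED = {
    "normalized_content_checksum":
        "17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c",
    "normalized_content_length": 17522,
    "marker_counts": {"enacted": 19, "controlled_draft": 1, "draft": 1, "obsolete": 1},
}


def reconfirm(observed: dict, expected: dict = EXPECTED) -> list[str]:
    """Return the names of identity fields whose observed value differs."""
    return [key for key in expected if observed.get(key) != expected[key]]
```

An empty list means the gated capture reproduced the logged values; any returned field name means the expected values must not be treated as final.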
4. Statement
- Checksum-addressed path + write-once/no-overwrite + STOP_AND_ESCALATE-on-collision defined (QG2); strict content-region delimiter makes the rehash gate deterministic and equal to the version identity (supports QG3); document_version_id rule untouched here (QG4); no artifact written, nothing mutated (QG5).
- doc 2 of 5; STOP after 5 files → route GPT/User. Self-advance PROHIBITED.
Companions: operational-framing, capture-procedure-draft, seed-strategy-and-verification-plan, capture-authoring-report.