KB-6998

dot-iu-cutter v0.5 — Constitution Snapshot Artifact Spec (checksum-addressed path, write-once, content/metadata format, no-overwrite)

Revision 1
Tags: dot-iu-cutter, v0.5, constitution-fixture, source-snapshot-capture, artifact-spec, checksum-addressed, write-once, design-only, no-execution, dieu44, 2026-05-18

Phase: v0_5_constitution_source_snapshot_capture_authoring · Date: 2026-05-18 · doc 2 of 5

authority: GPT Q6 APPROVE_CHECKSUM_ADDRESSED_WRITE_ONCE_CONVENTION_WITH_REHASH_GATE
spec_only: true ; no artifact written ; checksum_persisted_or_updated: NONE
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED

1. Path convention (checksum-addressed)

directory: knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/
filename:  constitution-normalized-<checksum_prefix>.md
checksum_prefix: first 16 hex chars of the normalized_content_checksum (sha256)
full_checksum: the complete 64-hex sha256 is stored INSIDE the artifact metadata
example (capture-time value, KB-SSOT from Codex BLOCKED execution log):
  normalized_content_checksum = 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
  -> path: knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/
           constitution-normalized-17660443e0f23e99.md
rationale: prefix in filename = human-legible content address; full 64-hex in
  metadata = exact identity; an accidental collision between distinct contents
  at 16 hex chars (64 bits) is negligible at this artifact count AND is further
  guarded by the SC1 full-checksum compare.
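
The path convention above can be sketched as a small pure function. This is only an illustration: the names `SNAPSHOT_DIR`, `PREFIX_LEN`, and `snapshot_path` are hypothetical, and the sketch assumes the checksum is supplied as lowercase hex, as in the example value.

```python
import re

# Directory and prefix length taken from the path convention in this section.
SNAPSHOT_DIR = "knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/"
PREFIX_LEN = 16  # first 16 hex chars of the sha256


def snapshot_path(normalized_content_checksum: str) -> str:
    """Derive the checksum-addressed path from a full 64-hex sha256.

    Assumption: lowercase hex only, matching the example in the spec.
    Nothing is read from or written to disk.
    """
    if not re.fullmatch(r"[0-9a-f]{64}", normalized_content_checksum):
        raise ValueError("expected a 64-char lowercase hex sha256")
    prefix = normalized_content_checksum[:PREFIX_LEN]
    return f"{SNAPSHOT_DIR}constitution-normalized-{prefix}.md"
```

Applied to the capture-time checksum quoted above, the function yields the example path ending in `constitution-normalized-17660443e0f23e99.md`.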

2. Write-once / no-overwrite rule

no_overwrite_policy: true
on_capture_target_resolution:
  path_absent:                     -> OK to write (in the later gated phase)
  path_present_same_full_checksum: -> IDEMPOTENT: do NOT rewrite; reuse existing
                                      artifact as-is (PASS, already pinned)
  path_present_different_content:  -> STOP_AND_ESCALATE (SC1). Never overwrite,
                                      never truncate, never append.
immutability_model: by CONVENTION + rehash gate (KB docs are revisioned, not
  physically content-addressed). The artifact, once written, is NEVER edited.
  Any required status change (e.g. orphan/retired) is a SEPARATE metadata-note
  doc, never a mutation of the artifact body (see doc 3 rollback design).
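
The three-way target resolution above could be expressed as a pure decision function. This is a minimal sketch under stated assumptions: the names `Resolution` and `resolve_capture_target` are hypothetical, and the caller is assumed to pass `None` when nothing exists at the target path, or otherwise the full sha256 recomputed from the existing artifact's content region.

```python
from enum import Enum
from typing import Optional


class Resolution(Enum):
    # Values mirror the case labels used in on_capture_target_resolution.
    OK_TO_WRITE = "path_absent"
    IDEMPOTENT = "path_present_same_full_checksum"
    STOP_AND_ESCALATE = "path_present_different_content"


def resolve_capture_target(new_full_checksum: str,
                           existing_full_checksum: Optional[str]) -> Resolution:
    """Decide what the gated capture may do; performs no I/O itself.

    existing_full_checksum is None when the target path is absent
    (an assumption about how the caller reports absence).
    """
    if existing_full_checksum is None:
        return Resolution.OK_TO_WRITE
    if existing_full_checksum == new_full_checksum:
        # Already pinned: reuse the existing artifact, never rewrite it.
        return Resolution.IDEMPOTENT
    # Same 16-hex prefix but different content: SC1, never overwrite.
    return Resolution.STOP_AND_ESCALATE
```

Keeping the decision separate from any file access means the rule can be reviewed and tested in this design-only phase without writing an artifact.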

3. Artifact content format

The artifact is a single markdown file with a metadata header and a strictly delimited normalized-content region. The identity checksum is the sha256 of only the bytes between the delimiters (exactly the parser output), NOT of the whole file. The rehash gate is therefore unambiguous: rehashing that region must reproduce source_document_version.content_checksum.

---
artifact_kind: constitution_normalized_snapshot
source_document_ref: incomex-constitution
source_url: https://vps.incomexsaigoncorp.vn/knowledge/dev/laws/constitution
captured_from: live_nuxt_rendered_page
captured_from_live_url: https://vps.incomexsaigoncorp.vn/knowledge/dev/laws/constitution
captured_at: <UTC ISO-8601, set at gated capture>
parser_profile_ref: nuxt-incomex-portal-constitution-v1
authoritative_span: candidate_B (H1 "HIẾN PHÁP" -> end CHANGELOG, excl backlink)
changelog_included: true
source_title: "HIẾN PHÁP KIẾN TRÚC HỆ THỐNG INCOMEX — v4.6.3 BAN HÀNH"
source_version_label: "v4.6.3 BAN HÀNH"          # semantic label, NOT identity
normalized_content_checksum: <sha256 hex 64>     # == identity ; == filename prefix source
normalized_content_length: <int, codepoint length>
marker_counts: { enacted: <n>, controlled_draft: <n>, draft: <n>, obsolete: <n> }
raw_fetch_checksum: <sha256 hex | null>          # FORENSIC ONLY, never identity
raw_fetch_bytes: <int | null>                    # forensic only
ad_revision_observed: <int | null>               # audit/lineage only (e.g. 44)
secrets: none
---

<<<BEGIN-NORMALIZED-CONTENT-DO-NOT-EDIT
{the exact normalized authoritative content bytes — the parser output}
END-NORMALIZED-CONTENT-DO-NOT-EDIT>>>

field_rules:
  - normalized_content_checksum = sha256( bytes strictly between the BEGIN/END
    sentinels, sentinel lines themselves EXCLUDED, no trailing newline added )
  - normalized_content_length = codepoint count of that same region once its
    bytes are decoded to text (NOT the byte count)
  - marker_counts = codepoint-exact counts (U+2705 ✅, U+1F4CB 📋, U+1F4DD 📝,
    U+26D4 ⛔) within that region
  - raw_fetch_checksum: include ONLY as forensic; MUST NOT equal / be used as
    content_checksum (NEG rule, version-policy)
  - NO secrets, tokens, cookies, internal IPs, or auth headers anywhere
  - the metadata header is NOT part of the identity hash (only the delimited
    content region is)
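
The field rules above can be illustrated with a short sketch that derives all three identity fields from the delimited region only. Several points are assumptions, not statements of the spec: the function name `content_identity` is hypothetical; the content is assumed to be UTF-8; the newline that ends the BEGIN line and the newline that precedes the END line are treated as delimiters and excluded from the hash; each sentinel is assumed to occur exactly once; and the marker names are mapped to codepoints positionally, in the order both lists appear in this document.

```python
import hashlib

BEGIN = b"<<<BEGIN-NORMALIZED-CONTENT-DO-NOT-EDIT"
END = b"END-NORMALIZED-CONTENT-DO-NOT-EDIT>>>"

# Assumption: names and codepoints are paired in the order listed in the spec.
MARKERS = {
    "enacted": "\u2705",
    "controlled_draft": "\U0001F4CB",
    "draft": "\U0001F4DD",
    "obsolete": "\u26D4",
}


def content_identity(artifact: bytes) -> dict:
    """Compute checksum, codepoint length, and marker counts for the region.

    Only the bytes between the sentinel lines are used; the metadata header
    and the sentinel lines themselves never enter the hash.
    """
    begin_token = BEGIN + b"\n"   # assumption: separator newline is excluded
    end_token = b"\n" + END       # assumption: separator newline is excluded
    start = artifact.index(begin_token) + len(begin_token)
    stop = artifact.index(end_token, start)
    region = artifact[start:stop]
    text = region.decode("utf-8")  # assumption: content is UTF-8
    return {
        "normalized_content_checksum": hashlib.sha256(region).hexdigest(),
        "normalized_content_length": len(text),  # codepoints, not bytes
        "marker_counts": {name: text.count(cp) for name, cp in MARKERS.items()},
    }
```

Because the header is outside the hashed region, fields such as `captured_at` can differ between two captures of identical content without changing the identity.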

capture_time_expected (KB-SSOT, from Codex BLOCKED execution log; expected
  values to be RE-CONFIRMED at gated capture, not asserted as final here):
  normalized_content_checksum = 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
  normalized_content_length   = 17522
  marker_counts               = { enacted:19, controlled_draft:1, draft:1, obsolete:1 }
  raw_fetch (forensic)        = 8d551f47…  / 1205114 bytes
  NOTE: prior f9d22d05…/17791 is LOST/unrecoverable and was never persisted; this
  snapshot legitimately pins CURRENT content (new identity, no prior row to
  supersede).
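
Re-confirmation of the expected values at gated capture could look like the comparison below. It is a sketch only: `EXPECTED` and `rehash_mismatches` are hypothetical names, the forensic raw_fetch values are deliberately left out because they are never identity, and the sketch reports differences without deciding what happens next.

```python
# Expected capture-time values, copied from capture_time_expected above.
EXPECTED = {
    "normalized_content_checksum":
        "17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c",
    "normalized_content_length": 17522,
    "marker_counts": {"enacted": 19, "controlled_draft": 1, "draft": 1, "obsolete": 1},
}

IDENTITY_FIELDS = (
    "normalized_content_checksum",
    "normalized_content_length",
    "marker_counts",
)


def rehash_mismatches(observed: dict, expected: dict = EXPECTED) -> list:
    """Return the identity field names whose observed value differs.

    An empty list means every expected value was re-confirmed.
    """
    return [f for f in IDENTITY_FIELDS if observed.get(f) != expected.get(f)]
```

A non-empty result names exactly which field drifted, which gives the reviewer a concrete item to route to GPT/User.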

4. Statement

  • Checksum-addressed path + write-once/no-overwrite + STOP_AND_ESCALATE-on-collision defined (QG2); strict content-region delimiter makes the rehash gate deterministic and equal to the version identity (supports QG3); document_version_id rule untouched here (QG4); no artifact written, nothing mutated (QG5).
  • doc 2 of 5; STOP after 5 files → route GPT/User. Self-advance PROHIBITED.

Companions: operational-framing, capture-procedure-draft, seed-strategy-and-verification-plan, capture-authoring-report.
