KB-5E6F
dot-iu-cutter v0.5 — Constitution Snapshot-source MARK: Manifest Contract + Allowed Outputs (leaf=DIEU; enacted-only candidates; excluded itemised; artifact-only)
Revision 1
Tags: dot-iu-cutter, v0.5, constitution-fixture, snapshot-source-mark, manifest-contract, canonical-address, allowed-outputs, artifact-only, design-only, dieu44, 2026-05-18
Phase: v0_5_constitution_snapshot_source_MARK_dryrun_entrypoint_design · Nature: manifest_schema_and_output_contract_design_only__no_DB_write · Date: 2026-05-18 · doc 3 of 5
nothing_executed: true ; no manifest produced ; this is the SCHEMA the future authorized run must emit, not a manifest
leaf_floor: DIEU (OD-G2) ; scope: enacted_only ; output: ARTIFACT-ONLY (no DB write)
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED
1. Manifest object model (QG5 — artifact-only)
The entrypoint emits ONE manifest: a header plus an ordered list of unit records. Each record is exactly one of three kinds: a candidate (effective_status=enacted, in scope), an excluded record (controlled_draft / draft / obsolete), or a non-content classification (boilerplate / container heading). Every byte of the snapshot region belongs to exactly one record (coverage closure, doc 2 §3.3).
1.1 Manifest header
manifest_header:
  generated_for: incomex-constitution
  source_document_version_id: icxconst-008a06ace23a96ea6cd456146e805c97
  snapshot_artifact_path: knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/constitution-normalized-17660443e0f23e99.md
  snapshot_region_sha256: 17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
  snapshot_region_length: 17522
  marker_census_observed: { enacted:19, controlled_draft:1, draft:1, obsolete:1 }
  grammar_profile: incomex-architecture-constitution-v4
  address_template: at.icx.const.v4
  parser_reference_implementation: nuxt-incomex-portal-constitution-v1.refimpl.r1
  docprefix: ICX-CONST
  scope_policy: enacted_only
  mode: mark-manifest-only
  db_write: NONE
  candidate_count: <integer, MUST fall in the [55,78] guardrail; see doc 4 V-11>
  excluded_count: <integer>
  noncontent_count: <integer>
  manifest_digest_sha256: <hash of the canonicalized unit list, for determinism>
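The fixed header values and the candidate_count guardrail lend themselves to a mechanical pre-check. The following is a minimal sketch, not the entrypoint's actual validator: the function name `check_manifest_header` and the idea of passing the header as a plain dict are assumptions made for illustration, and only the constants are taken from the header above.

```python
from typing import Any, Dict, List

# Bounds quoted from the candidate_count comment in the header (doc 4 V-11).
CANDIDATE_GUARDRAIL = (55, 78)

# Fixed values the header above declares for this run mode.
EXPECTED_FIXED = {
    "docprefix": "ICX-CONST",
    "scope_policy": "enacted_only",
    "mode": "mark-manifest-only",
    "db_write": "NONE",
}


def check_manifest_header(header: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return violations; an empty list means no issue found."""
    problems: List[str] = []
    for key, want in EXPECTED_FIXED.items():
        if header.get(key) != want:
            problems.append(f"{key} must be {want!r}, got {header.get(key)!r}")

    count = header.get("candidate_count")
    low, high = CANDIDATE_GUARDRAIL
    is_int = isinstance(count, int) and not isinstance(count, bool)
    if not is_int or not (low <= count <= high):
        problems.append(f"candidate_count {count!r} outside guardrail [{low},{high}]")

    digest = header.get("snapshot_region_sha256", "")
    is_hex64 = (
        isinstance(digest, str)
        and len(digest) == 64
        and all(c in "0123456789abcdef" for c in digest)
    )
    if not is_hex64:
        problems.append("snapshot_region_sha256 must be 64 lowercase hex characters")
    return problems
```

Returning a list of violations rather than raising on the first one keeps the check usable for a report that has to itemise every failure.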
1.2 Candidate unit record (effective_status=enacted ONLY)
candidate_unit:
  manifest_id: stable within-run id (e.g. CU-0001); assignment is deterministic
      by document order
  level: NGUYEN_TAC | KIEN_TRUC_SECTION | DIEU (3 ratified levels)
  unit_kind: principle | architecture_section | dieu_law | pointer_row
  number: arabic/letter/compound id token (e.g. 12 | A | 0-S/M/L | 33),
      or "—" for the terminology pointer row ; null for unnumbered
  canonical_address: "ICX-CONST/<path>" (§2; status NEVER encoded)
  heading: the heading/label line as it appears in the snapshot
  title: the unit's human title (principle name / section title / Tên, i.e. the catalog name column)
  parent: canonical_address of the container (or null for top-level)
  normalized_text: the exact normalized substring of the unit body (verbatim
      from the snapshot region; no re-rendering)
  source_span: { byte_start, byte_end, char_start, char_end,
      span_sha256 = sha256(exact substring) }
  status_marker_observed: ✅ | (inherited:✅ via tier_0/tier_1) | none
  effective_status: enacted (candidates are enacted by definition)
  status_basis: tier_0_document_promulgation | tier_1_group_header
      | tier_2_explicit_row_marker (doc 2 §3.1; auditable)
  cut_reason: heading_boundary | independent_principle
      | independent_section | catalog_law_entry | pointer_reference
  provenance: { source_document_version_id, snapshot_artifact_path,
      snapshot_region_sha256, parser_reference_implementation,
      grammar_profile }
  confidence: high | lower (pointer_row / container-vs-leaf REVIEW flag)
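To make the source_span fields concrete, here is a small sketch of how one span could be derived from the pinned region bytes. It assumes the region is UTF-8, that offsets are half-open (`byte_start` inclusive, `byte_end` exclusive), and that span_sha256 is taken over the raw bytes of the substring; none of these three points is stated in this document, and `make_source_span` is a hypothetical name.

```python
import hashlib
from typing import Dict, Union


def make_source_span(region: bytes, byte_start: int, byte_end: int) -> Dict[str, Union[int, str]]:
    """Hypothetical helper: build the source_span record for one unit.

    Assumes UTF-8 text and half-open byte offsets. Decoding raises
    UnicodeDecodeError if an offset falls inside a multi-byte character,
    which is a useful early signal for a bad cut.
    """
    if not 0 <= byte_start <= byte_end <= len(region):
        raise ValueError("span offsets fall outside the snapshot region")
    chunk = region[byte_start:byte_end]
    # Character offsets are derived by decoding the prefix and the span itself.
    char_start = len(region[:byte_start].decode("utf-8"))
    char_end = char_start + len(chunk.decode("utf-8"))
    return {
        "byte_start": byte_start,
        "byte_end": byte_end,
        "char_start": char_start,
        "char_end": char_end,
        "span_sha256": hashlib.sha256(chunk).hexdigest(),
    }
```

Byte and character offsets differ as soon as the snapshot contains Vietnamese text, which is why the record carries both pairs.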
1.3 Excluded record (controlled_draft / draft / obsolete — itemised, never dropped)
excluded_unit:
  level: DIEU (or group container)
  number / heading / title / source_span / span_sha256 (same span discipline as candidate_unit)
  status_marker_observed: 📋 | 📝 | ⛔ (or inherited group marker)
  effective_status: controlled_draft | draft | obsolete
  status_basis: tier_1_group_header | tier_2_explicit_row_marker
  exclusion_reason: controlled_draft_deferred (Điều 44)
      | draft_excluded_by_enacted_only (Điều 34)
      | obsolete_excluded (Luật Luồng DL v1.1 ; Hiến pháp v3.9)
  emitted_as: EXCLUDED (appears in manifest; not a candidate; not silently dropped)
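The split between candidate and excluded records can be pictured as a lookup on the resolved status marker. The sketch below is illustrative only: it takes an already-resolved marker (tier inheritance is defined in doc 2 and not modelled here), `route_unit` is a hypothetical name, and the `CANDIDATE` value for `emitted_as` is an assumption, since this document only defines `EXCLUDED`.

```python
from typing import Dict

# Marker-to-status pairs as they appear in the candidate and excluded records above.
MARKER_TO_STATUS = {
    "✅": "enacted",
    "📋": "controlled_draft",
    "📝": "draft",
    "⛔": "obsolete",
}

# exclusion_reason values quoted from the excluded_unit record.
EXCLUSION_REASON = {
    "controlled_draft": "controlled_draft_deferred",
    "draft": "draft_excluded_by_enacted_only",
    "obsolete": "obsolete_excluded",
}


def route_unit(resolved_marker: str) -> Dict[str, str]:
    """Hypothetical helper: decide how a unit is emitted from its resolved marker.

    An unknown marker raises KeyError, so nothing is silently dropped.
    """
    status = MARKER_TO_STATUS[resolved_marker]
    if status == "enacted":
        # "CANDIDATE" is an assumed label; the document only names EXCLUDED.
        return {"effective_status": status, "emitted_as": "CANDIDATE"}
    return {
        "effective_status": status,
        "emitted_as": "EXCLUDED",
        "exclusion_reason": EXCLUSION_REASON[status],
    }
```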
1.4 Non-content classification record
noncontent_unit:
  body_source_policy: EXCLUDED_BOILERPLATE | CONTAINER_HEADING
  zone: Z1 | Z2-colhdr | Z3-lesson | Z4 | Z5-colhdr | Z5-group-hdr | Z6
  source_span / span_sha256
  examples_from_snapshot:
    EXCLUDED_BOILERPLATE : H1 title, "Văn bản tối cao…", "Ban hành: S148…",
        "v4.6.3 BAN HÀNH. Giữ nguyên…", Z4 "TUYÊN NGÔN | … | HẠ TẦNG"
        & "THUẬT NGỮ" pointers, the two "→ " NGUYEN_TAC pointers,
        the entire CHANGELOG block + final "HP v4.6.3 BAN HÀNH | …"
    CONTAINER_HEADING : zone-entry headers (Z2/Z3/Z5/Z6), catalog group headers
        ("Nền tảng — ✅" …), column-header lines
        (#/Nguyên tắc/Nghĩa/Hệ quả ; Điều/Tên/File ; Version/Nội dung),
        the 4-DB table scaffold inside section A
  note: container headings may parent candidate units; they are themselves NOT IUs.
2. canonical_address format (BR-A1; status excluded from address)
scheme (locked, address_template at.icx.const.v4):
  "<DOCPREFIX>/<path>" with docprefix_separator='/' level_separator='-'
  encodes_status = false (status marker NEVER enters the address; metadata only)
docprefix: ICX-CONST (UNIQUE in source_document_registry; cannot collide with
    legacy D38-DIEU28/32/35 addresses)
path_rules (deterministic):
  NGUYEN_TAC #n            -> ICX-CONST/NT-<n>       e.g. ICX-CONST/NT-12
  KIEN_TRUC_SECTION <X>    -> ICX-CONST/KT-<X>       e.g. ICX-CONST/KT-A
  DIEU <id>                -> ICX-CONST/DIEU-<id>    e.g. ICX-CONST/DIEU-33
  DIEU compound id         -> ICX-CONST/DIEU-0-B , ICX-CONST/DIEU-0-S-M-L
      (the '/' in "0-S/M/L" is normalized to '-' for the
      address; the original id token is preserved in `number`)
  terminology pointer "—"  -> ICX-CONST/DIEU-TERMINOLOGY (deterministic synthetic
      slug for the unnumbered pointer row; flagged confidence=lower)
illustrative_addresses (NOT generated/written; design only):
  Nguyên tắc 12               -> ICX-CONST/NT-12
  Kiến trúc section A         -> ICX-CONST/KT-A
  Điều 33 (PostgreSQL)        -> ICX-CONST/DIEU-33 (status ✅ NOT in address)
  Điều 44 (controlled_draft)  -> ICX-CONST/DIEU-44 (EXCLUDED; address still well-formed,
      status 📋 is metadata only)
uniqueness: every emitted unit (candidate AND excluded) MUST have a unique
    canonical_address; a duplicate => BLOCKED address-collision (doc 2 §5).
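The path rules above are simple enough to express as a pure function. This is a sketch under assumptions, not the reference implementation: the function names are hypothetical, and the input is assumed to be the `level` and `number` fields exactly as they appear in a unit record.

```python
from collections import Counter
from typing import Iterable, List, Optional

DOCPREFIX = "ICX-CONST"
LEVEL_SLUG = {"NGUYEN_TAC": "NT", "KIEN_TRUC_SECTION": "KT", "DIEU": "DIEU"}
TERMINOLOGY_TOKEN = "\u2014"  # the "—" number token of the terminology pointer row


def canonical_address(level: str, number: Optional[str]) -> str:
    """Hypothetical helper: build ICX-CONST/<path>; status is never an input."""
    slug = LEVEL_SLUG[level]  # KeyError for a level outside the 3 ratified ones
    if level == "DIEU" and number == TERMINOLOGY_TOKEN:
        return f"{DOCPREFIX}/DIEU-TERMINOLOGY"
    if not number:
        raise ValueError("no path rule is defined here for an unnumbered unit")
    # '/' inside a compound id is normalized to the level separator '-'.
    token = number.replace("/", "-")
    return f"{DOCPREFIX}/{slug}-{token}"


def find_collisions(addresses: Iterable[str]) -> List[str]:
    """Return every address that occurs more than once (would be BLOCKED)."""
    counts = Counter(addresses)
    return sorted(addr for addr, n in counts.items() if n > 1)
```

Because status is not a parameter, an excluded unit such as Điều 44 gets the same well-formed address it would have as a candidate, which matches the `encodes_status = false` rule.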
3. Determinism contract
deterministic_inputs: (pinned snapshot region bytes) + (grammar_profile
    incomex-architecture-constitution-v4) + (refimpl.r1 semantics) + (4-marker map)
determinism_requirement:
  - manifest_id assignment = pure function of document order
  - unit ordering = document order of source_span.byte_start
  - manifest_digest_sha256 = sha256 of the canonicalized (sorted, normalized) unit
    list; re-running on the SAME inputs MUST yield a byte-identical digest
  - NO timestamps, NO RNG, NO environment-dependent value inside the hashed manifest
    body (capture/run timestamps live OUTSIDE the digested region, in the report)
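One way to obtain a byte-identical digest is to serialize the unit list in a fixed canonical form before hashing. The sketch below assumes a specific canonicalization that this document does not define: units ordered by `source_span.byte_start`, JSON with sorted keys, compact separators, and UTF-8 encoding. `manifest_digest` is a hypothetical name.

```python
import hashlib
import json
from typing import Any, Dict, List


def manifest_digest(units: List[Dict[str, Any]]) -> str:
    """Hypothetical helper: sha256 over an assumed canonical JSON form of the unit list."""
    # Document order: sort by the byte offset where each unit starts.
    ordered = sorted(units, key=lambda u: u["source_span"]["byte_start"])
    canonical = json.dumps(
        ordered,
        sort_keys=True,          # key order must not depend on construction order
        ensure_ascii=False,      # keep Vietnamese text as-is, encoded once as UTF-8
        separators=(",", ":"),   # no whitespace that could vary between writers
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Nothing time-dependent or random enters the hashed value, so two runs over the same units produce the same digest regardless of the order in which the records or their keys were built.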
4. Allowed outputs (QG5; task §6)
primary_output_folder (KB):
  knowledge/dev/laws/dieu44-trien-khai/v0.5-constitution-first-dryrun/
allowed_files (exactly these five artifact classes, produced ONLY by a FUTURE
    separately-authorized run; NOT produced in this design phase):
  - …-manifest-2026-..md|.json          # the candidate+excluded+noncontent manifest
  - …-review-evaluation-2026-..md       # coverage / no-overlap / vocab / hierarchy eval
  - …-coverage-proof-2026-..md          # span reconciliation over the snapshot region
  - …-determinism-digest-2026-..md      # manifest_digest_sha256 + re-run equality proof
  - …-dryrun-report-2026-..md           # PASS / FAIL / BLOCKED + one-line operator view
large_output_handling:
  if the manifest is too large for KB, the future run writes it to ephemeral
  throwaway scratch ($WD/manifest, mode 0700, never git-added, shredded at teardown)
  and uploads ONLY a redacted summary + digests + the report to the KB folder above
forbidden_output_targets (db_write_policy = NONE; uncertainty => no DB write):
  - any production row: tac_logical_unit, cutter_governance.* (incl.
    source_document_registry, source_document_version_registry, manifest_envelope,
    cut_change_set, cut_change_set_affected_row, verify_result,
    canonical_address_alias, dot_pair_signature, decision_backlog_*)
  - any DR/isolated governance table (no approved isolated dry-run IU table exists)
  - source_document / source_document_version mutation
  - Directus / Qdrant-vector / NoSQL / git
db_backed_dryrun_table: OUT OF SCOPE. Would require a SEPARATE design + GPT approval
    + explicit rollback plan BEFORE proposal. This package forbids DB writes outright.
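A future run could guard its writes with an allow-list check on the target path. The sketch below is only an illustration of that idea: the full file-name pattern is elided in this document, so the check matches on the artifact-class token alone, and `is_allowed_output` together with any concrete file names used with it are hypothetical.

```python
from pathlib import PurePosixPath

ALLOWED_ROOT = PurePosixPath(
    "knowledge/dev/laws/dieu44-trien-khai/v0.5-constitution-first-dryrun"
)
# The five artifact classes listed above.
ALLOWED_CLASSES = (
    "manifest",
    "review-evaluation",
    "coverage-proof",
    "determinism-digest",
    "dryrun-report",
)


def is_allowed_output(path: str) -> bool:
    """Hypothetical guard: accept only files directly in the KB folder whose
    name carries one of the five class tokens. Anything else is refused,
    following the 'uncertainty => no write' stance of this contract."""
    p = PurePosixPath(path)
    if p.parent != ALLOWED_ROOT:
        return False
    for cls in ALLOWED_CLASSES:
        if f"-{cls}-" in p.name:
            # Only the manifest class may also be written as .json.
            return p.suffix == ".md" or (cls == "manifest" and p.suffix == ".json")
    return False
```

The check is deliberately deny-by-default: a path that does not positively match an allowed class is rejected rather than inspected further.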
5. Reconstruction property (verifiable; feeds doc 4 V-13)
reconstruction_contract:
  concatenating every candidate unit's normalized_text in canonical document order
  MUST reproduce the enacted-only subset of the snapshot region exactly; the excluded
  records (📋 Điều 44, 📝 Điều 34, ⛔ obsolete ×2) + non-content records account for
  the remainder, so:
    enacted_text ⊕ excluded_text ⊕ noncontent_text == full snapshot region
  with no gap and no overlap. This makes "no silent drop" mechanically checkable.
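The no-gap, no-overlap property can be checked directly from the byte spans of all records (candidate, excluded, and non-content together). This is a minimal sketch assuming half-open `(byte_start, byte_end)` pairs; the function names are hypothetical and it is not the V-13 verifier itself.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # (byte_start, byte_end), end exclusive (assumed convention)


def coverage_problems(region: bytes, spans: Iterable[Span]) -> List[str]:
    """Hypothetical check: list every gap or overlap; empty list means exact tiling."""
    problems: List[str] = []
    cursor = 0  # first byte not yet covered by any record
    for start, end in sorted(spans):
        if start > cursor:
            problems.append(f"gap: bytes {cursor}..{start} belong to no record")
        elif start < cursor:
            problems.append(
                f"overlap: bytes {start}..{min(cursor, end)} belong to more than one record"
            )
        cursor = max(cursor, end)
    if cursor < len(region):
        problems.append(f"gap: bytes {cursor}..{len(region)} belong to no record")
    elif cursor > len(region):
        problems.append("span extends past the end of the snapshot region")
    return problems


def reconstruct(region: bytes, spans: Iterable[Span]) -> bytes:
    """Concatenate the spans in document order; equals region when tiling is exact."""
    return b"".join(region[start:end] for start, end in sorted(spans))
```

When `coverage_problems` returns an empty list, `reconstruct` yields the full region byte for byte, which is the equality stated in the contract above.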
6. Statement
- QG5 satisfied: manifest is artifact-only; leaf/segmentation floor = DIEU; candidates are effective_status=enacted only; excluded nodes itemised with reason; canonical address is ICX-CONST/<path> with status excluded; source_span + heading + title + normalized_text + provenance specified; reconstruction property defined.
- Allowed-output folder + 5 file classes defined; ALL DB writes forbidden.
- Nothing executed; no manifest produced — this is the emission schema only.
- doc 3 of 5; STOP after 5 docs → route GPT/User. Self-advance PROHIBITED.
Companion docs: operational-framing (1), matcher-and-status-design (2), command-and-verification-plan (4), entrypoint-design-report (5).