KB-5E6F

dot-iu-cutter v0.5 — Constitution Snapshot-source MARK: Manifest Contract + Allowed Outputs (leaf=DIEU; enacted-only candidates; excluded itemised; artifact-only)

Revision 1

Phase: v0_5_constitution_snapshot_source_MARK_dryrun_entrypoint_design · Nature: manifest_schema_and_output_contract_design_only__no_DB_write · Date: 2026-05-18 · doc 3 of 5

nothing_executed: true ; no manifest produced ; this is the SCHEMA that the manifest
  emitted by a future authorized run must follow, not a manifest itself
leaf_floor: DIEU (OD-G2) ; scope: enacted_only ; output: ARTIFACT-ONLY (no DB write)
decision_authority: GPT / User ONLY ; self_advance: PROHIBITED

1. Manifest object model (QG5 — artifact-only)

The entrypoint emits ONE manifest: a header plus an ordered list of unit records. Each record is exactly one of three kinds: a candidate (effective_status=enacted, in scope), an excluded record (controlled_draft / draft / obsolete), or a non-content classification record (boilerplate / container heading). Every byte of the snapshot region belongs to exactly one record (coverage closure, doc 2 §3.3).

1.1 Manifest header

manifest_header:
  generated_for:                 incomex-constitution
  source_document_version_id:    icxconst-008a06ace23a96ea6cd456146e805c97
  snapshot_artifact_path:        knowledge/dev/laws/dieu44-trien-khai/snapshots/constitution/constitution-normalized-17660443e0f23e99.md
  snapshot_region_sha256:        17660443e0f23e994e1807cf8e22920951a9e70c598956dbd0e752f4f5cae80c
  snapshot_region_length:        17522
  marker_census_observed:        { enacted:19, controlled_draft:1, draft:1, obsolete:1 }
  grammar_profile:               incomex-architecture-constitution-v4
  address_template:              at.icx.const.v4
  parser_reference_implementation: nuxt-incomex-portal-constitution-v1.refimpl.r1
  docprefix:                     ICX-CONST
  scope_policy:                  enacted_only
  mode:                          mark-manifest-only
  db_write:                      NONE
  candidate_count:               <integer, MUST fall in [55,78] guardrail — doc 4 V-11>
  excluded_count:                <integer>
  noncontent_count:              <integer>
  manifest_digest_sha256:        <hash of the canonicalized unit list — determinism>
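
The header constraints above can be checked mechanically. The following is a minimal sketch, not part of the ratified design: the function name `check_header` and the dict representation of the header are assumptions made for illustration, while the checked values (db_write NONE, scope_policy enacted_only, mode mark-manifest-only, the [55,78] guardrail) are taken from the header fields listed above.

```python
from typing import Any

# Inclusive guardrail bounds quoted in the header spec (doc 4 V-11).
CANDIDATE_GUARDRAIL = (55, 78)


def check_header(header: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means these checks pass.

    Hypothetical helper: the design only fixes the field names and values,
    not how a validator is written.
    """
    problems: list[str] = []
    if header.get("db_write") != "NONE":
        problems.append("db_write must be NONE")
    if header.get("scope_policy") != "enacted_only":
        problems.append("scope_policy must be enacted_only")
    if header.get("mode") != "mark-manifest-only":
        problems.append("mode must be mark-manifest-only")

    count = header.get("candidate_count")
    lo, hi = CANDIDATE_GUARDRAIL
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(count, int) or isinstance(count, bool) or not lo <= count <= hi:
        problems.append(f"candidate_count must be an integer in [{lo},{hi}]")

    sha = header.get("snapshot_region_sha256", "")
    if len(sha) != 64 or any(c not in "0123456789abcdef" for c in sha):
        problems.append("snapshot_region_sha256 must be 64 lowercase hex chars")
    return problems
```

Returning a list of problems instead of raising on the first one lets a report show every header violation in a single pass.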

1.2 Candidate unit record (effective_status=enacted ONLY)

candidate_unit:
  manifest_id:            stable within-run id (e.g. CU-0001), assignment deterministic
                          by document order
  level:                  NGUYEN_TAC | KIEN_TRUC_SECTION | DIEU   (3 ratified levels)
  unit_kind:              principle | architecture_section | dieu_law | pointer_row
  number:                 arabic/letter/compound id token (e.g. 12 | A | 0-S/M/L | 33)
                          or "—" for the terminology pointer row ; null for unnumbered
  canonical_address:      "ICX-CONST/<path>"  (§2 — status NEVER encoded)
  heading:                the heading/label line as it appears in the snapshot
  title:                  the unit's human title (principle name / section title / Tên)
  parent:                 canonical_address of the container (or null for top-level)
  normalized_text:        the exact normalized substring of the unit body (verbatim
                          from the snapshot region; no re-rendering)
  source_span:            { byte_start, byte_end, char_start, char_end,
                            span_sha256 = sha256(exact substring) }
  status_marker_observed: ✅ | (inherited:✅ via tier_0/tier_1) | none
  effective_status:       enacted   (candidates are enacted by definition)
  status_basis:           tier_0_document_promulgation | tier_1_group_header
                          | tier_2_explicit_row_marker   (doc 2 §3.1 — auditable)
  cut_reason:             heading_boundary | independent_principle
                          | independent_section | catalog_law_entry | pointer_reference
  provenance:             { source_document_version_id, snapshot_artifact_path,
                            snapshot_region_sha256, parser_reference_implementation,
                            grammar_profile }
  confidence:             high | lower(pointer_row / container-vs-leaf REVIEW flag)
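
The source_span fields can be derived from the pinned region bytes alone. The sketch below assumes UTF-8 encoding and an exclusive byte_end / char_end; neither convention is stated in this document, and `make_source_span` is a hypothetical helper name.

```python
import hashlib


def make_source_span(region: bytes, byte_start: int, byte_end: int) -> dict:
    """Build a source_span record for region[byte_start:byte_end].

    Assumptions (not fixed by the design): the region is UTF-8 and the
    end offsets are exclusive.
    """
    if not 0 <= byte_start <= byte_end <= len(region):
        raise ValueError("span lies outside the snapshot region")
    substring = region[byte_start:byte_end]
    # Decoding the prefix yields the character offset. decode() raises
    # UnicodeDecodeError if a boundary splits a multi-byte sequence,
    # which is the desired failure for a mis-cut span.
    char_start = len(region[:byte_start].decode("utf-8"))
    char_end = char_start + len(substring.decode("utf-8"))
    return {
        "byte_start": byte_start,
        "byte_end": byte_end,
        "char_start": char_start,
        "char_end": char_end,
        # span_sha256 = sha256(exact substring), as specified above.
        "span_sha256": hashlib.sha256(substring).hexdigest(),
    }
```

Because Vietnamese text uses multi-byte characters, byte and character offsets diverge quickly; recording both, plus the hash of the exact substring, lets a verifier detect any re-rendering of normalized_text.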

1.3 Excluded record (controlled_draft / draft / obsolete — itemised, never dropped)

excluded_unit:
  level:                  DIEU (or group container)
  number / heading / title / source_span / span_sha256   (same span discipline)
  status_marker_observed: 📋 | 📝 | ⛔  (or inherited group marker)
  effective_status:       controlled_draft | draft | obsolete
  status_basis:           tier_1_group_header | tier_2_explicit_row_marker
  exclusion_reason:       controlled_draft_deferred (Điều 44)
                          | draft_excluded_by_enacted_only (Điều 34)
                          | obsolete_excluded (Luật Luồng DL v1.1 ; Hiến pháp v3.9)
  emitted_as:             EXCLUDED  (appears in manifest; not a candidate; not silently dropped)
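
The routing between candidate and excluded records can be sketched as a lookup. The marker-to-status pairing follows the order in which the markers and statuses are listed in this document; the `CANDIDATE` label, the function name `route_unit`, and the choice to raise on an unknown status are assumptions for illustration.

```python
# Marker map as listed in this document (4 markers).
MARKER_MAP = {
    "✅": "enacted",
    "📋": "controlled_draft",
    "📝": "draft",
    "⛔": "obsolete",
}

# exclusion_reason values from section 1.3.
EXCLUSION_REASON = {
    "controlled_draft": "controlled_draft_deferred",
    "draft": "draft_excluded_by_enacted_only",
    "obsolete": "obsolete_excluded",
}


def route_unit(effective_status: str) -> dict:
    """Decide how a unit is emitted, given its resolved effective_status.

    "CANDIDATE" is a hypothetical label; the design only names EXCLUDED.
    """
    if effective_status == "enacted":
        return {"emitted_as": "CANDIDATE", "exclusion_reason": None}
    if effective_status in EXCLUSION_REASON:
        return {
            "emitted_as": "EXCLUDED",
            "exclusion_reason": EXCLUSION_REASON[effective_status],
        }
    # An unknown status must never be dropped silently, so fail loudly.
    raise ValueError(f"unknown effective_status: {effective_status!r}")
```

Resolving an inherited status (tier_0 / tier_1 / tier_2) is deliberately left out here, since that precedence is defined in doc 2 §3.1 and not in this document.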

1.4 Non-content classification record

noncontent_unit:
  body_source_policy:     EXCLUDED_BOILERPLATE | CONTAINER_HEADING
  zone:                   Z1 | Z2-colhdr | Z3-lesson | Z4 | Z5-colhdr | Z5-group-hdr | Z6
  source_span / span_sha256
  examples_from_snapshot:
    EXCLUDED_BOILERPLATE : H1 title, "Văn bản tối cao…", "Ban hành: S148…",
                           "v4.6.3 BAN HÀNH. Giữ nguyên…", Z4 "TUYÊN NGÔN | … | HẠ TẦNG"
                           & "THUẬT NGỮ" pointers, the two "→ " NGUYEN_TAC pointers,
                           the entire CHANGELOG block + final "HP v4.6.3 BAN HÀNH | …"
    CONTAINER_HEADING    : zone-entry headers (Z2/Z3/Z5/Z6), catalog group headers
                           ("Nền tảng — ✅" …), column-header lines
                           (#/Nguyên tắc/Nghĩa/Hệ quả ; Điều/Tên/File ; Version/Nội dung),
                           the 4-DB table scaffold inside section A
  note: container headings may parent candidate units; they are themselves NOT IUs.

2. canonical_address format (BR-A1; status excluded from address)

scheme (locked, address_template at.icx.const.v4):
  "<DOCPREFIX>/<path>"  with docprefix_separator='/'  level_separator='-'
  encodes_status = false  (status marker NEVER enters the address — metadata only)
docprefix: ICX-CONST  (UNIQUE in source_document_registry; cannot collide with
  legacy D38-DIEU28/32/35 addresses)
path_rules (deterministic):
  NGUYEN_TAC #n            -> ICX-CONST/NT-<n>          e.g. ICX-CONST/NT-12
  KIEN_TRUC_SECTION <X>    -> ICX-CONST/KT-<X>          e.g. ICX-CONST/KT-A
  DIEU <id>                -> ICX-CONST/DIEU-<id>       e.g. ICX-CONST/DIEU-33
  DIEU compound id         -> ICX-CONST/DIEU-0-B , ICX-CONST/DIEU-0-S-M-L
                              (the '/' in "0-S/M/L" is normalized to '-' for the
                              address; the original id token is preserved in `number`)
  terminology pointer "—"  -> ICX-CONST/DIEU-TERMINOLOGY  (deterministic synthetic
                              slug for the unnumbered pointer row; flagged confidence=lower)
illustrative_addresses (NOT generated/written — design only):
  Nguyên tắc 12              -> ICX-CONST/NT-12
  Kiến trúc section A        -> ICX-CONST/KT-A
  Điều 33 (PostgreSQL)       -> ICX-CONST/DIEU-33   (status ✅ NOT in address)
  Điều 44 (controlled_draft) -> ICX-CONST/DIEU-44   (EXCLUDED; address still well-formed,
                                                     status 📋 is metadata only)
uniqueness: every emitted unit (candidate AND excluded) MUST have a unique
  canonical_address; a duplicate => BLOCKED address-collision (doc 2 §5).
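
The path rules above are simple enough to express as a pure function. This is a minimal sketch under the rules as written; the function names `canonical_address` and `find_collisions` are hypothetical, and the status marker is deliberately not a parameter, since it never enters the address.

```python
from collections import Counter
from typing import Optional

DOCPREFIX = "ICX-CONST"
LEVEL_TAG = {"NGUYEN_TAC": "NT", "KIEN_TRUC_SECTION": "KT", "DIEU": "DIEU"}
POINTER_TOKEN = "\u2014"  # the dash token used for the terminology pointer row


def canonical_address(level: str, number: Optional[str]) -> str:
    """Build "<DOCPREFIX>/<path>" from a level and its id token."""
    tag = LEVEL_TAG[level]  # KeyError for a level outside the 3 ratified ones
    if level == "DIEU" and number == POINTER_TOKEN:
        return f"{DOCPREFIX}/DIEU-TERMINOLOGY"
    if not number:
        raise ValueError("an unnumbered unit has no address rule in this sketch")
    # '/' inside a compound id is normalized to '-' for the address only;
    # the original token stays untouched in the `number` field.
    return f"{DOCPREFIX}/{tag}-{number.replace('/', '-')}"


def find_collisions(addresses: list[str]) -> list[str]:
    """Return every address that occurs more than once (=> BLOCKED)."""
    return sorted(a for a, n in Counter(addresses).items() if n > 1)
```

For example, `canonical_address("DIEU", "0-S/M/L")` yields `ICX-CONST/DIEU-0-S-M-L`, matching the compound-id rule above. Candidate and excluded units would both be passed to `find_collisions`, because the uniqueness rule covers every emitted unit.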

3. Determinism contract

deterministic_inputs: (pinned snapshot region bytes) + (grammar_profile
  incomex-architecture-constitution-v4) + (refimpl.r1 semantics) + (4-marker map)
determinism_requirement:
  - manifest_id assignment = pure function of document order
  - unit ordering = document order of source_span.byte_start
  - manifest_digest_sha256 = sha256 of the canonicalized (sorted, normalized) unit
    list; re-running on the SAME inputs MUST yield a byte-identical digest
  - NO timestamps, NO RNG, NO environment-dependent value inside the hashed manifest
    body (capture/run timestamps live OUTSIDE the digested region, in the report)
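
One way to obtain a byte-identical digest across re-runs is to hash a canonical JSON serialization of the unit list. The canonical form below (units ordered by source_span.byte_start, keys sorted, compact separators, UTF-8) is an assumption; the design requires a canonicalized list but does not fix the serialization.

```python
import hashlib
import json


def manifest_digest(units: list[dict]) -> str:
    """sha256 over a canonical JSON form of the unit list.

    Assumed canonical form: document order by byte_start, sorted keys,
    no insignificant whitespace, UTF-8 bytes. No timestamps or other
    run-dependent values may appear inside `units`.
    """
    ordered = sorted(units, key=lambda u: u["source_span"]["byte_start"])
    canonical = json.dumps(
        ordered,
        sort_keys=True,          # key order must not depend on dict construction
        ensure_ascii=False,      # keep Vietnamese text as-is, then encode as UTF-8
        separators=(",", ":"),   # fixed, compact separators
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this form the digest does not depend on the order in which records were collected or on the key order of each record, only on their content.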

4. Allowed outputs (QG5; task §6)

primary_output_folder (KB):
  knowledge/dev/laws/dieu44-trien-khai/v0.5-constitution-first-dryrun/
allowed_files (exactly these five artifact classes, produced ONLY by a FUTURE
  separately-authorized run — NOT produced in this design phase):
  - …-manifest-2026-..md|.json        # the candidate+excluded+noncontent manifest
  - …-review-evaluation-2026-..md     # coverage / no-overlap / vocab / hierarchy eval
  - …-coverage-proof-2026-..md        # span reconciliation over the snapshot region
  - …-determinism-digest-2026-..md    # manifest_digest_sha256 + re-run equality proof
  - …-dryrun-report-2026-..md         # PASS / FAIL / BLOCKED + one-line operator view
large_output_handling:
  if the manifest is too large for KB, the future run writes it to ephemeral
  throwaway scratch ($WD/manifest, mode 0700, never git-added, shredded at teardown)
  and uploads ONLY a redacted summary + digests + the report to the KB folder above
forbidden_output_targets (db_write_policy = NONE; uncertainty => no DB write):
  - any production row: tac_logical_unit, cutter_governance.* (incl.
    source_document_registry, source_document_version_registry, manifest_envelope,
    cut_change_set, cut_change_set_affected_row, verify_result,
    canonical_address_alias, dot_pair_signature, decision_backlog_*)
  - any DR/isolated governance table (no approved isolated dry-run IU table exists)
  - source_document / source_document_version mutation
  - Directus / Qdrant-vector / NoSQL / git
db_backed_dryrun_table: OUT OF SCOPE. Would require a SEPARATE design + GPT approval
  + explicit rollback plan BEFORE proposal. This package forbids DB writes outright.
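
A future run could guard its writes with a check like the one below. This is only a sketch: the matching rule (class token surrounded by hyphens in the file name, file placed directly in the primary output folder) is an assumption derived from the file patterns above, whose elided parts are intentionally left unconstrained, and `is_allowed_output` is a hypothetical name.

```python
from pathlib import PurePosixPath

ALLOWED_FOLDER = PurePosixPath(
    "knowledge/dev/laws/dieu44-trien-khai/v0.5-constitution-first-dryrun"
)

# The five artifact classes and the extensions listed for each.
ALLOWED_CLASSES = {
    "manifest": {".md", ".json"},
    "review-evaluation": {".md"},
    "coverage-proof": {".md"},
    "determinism-digest": {".md"},
    "dryrun-report": {".md"},
}


def is_allowed_output(path: str) -> bool:
    """True only for a file of an allowed class directly in the KB folder."""
    p = PurePosixPath(path)
    # Reject traversal and anything outside the primary output folder.
    if ".." in p.parts or p.parent != ALLOWED_FOLDER:
        return False
    return any(
        f"-{cls}-" in p.name and p.suffix in exts
        for cls, exts in ALLOWED_CLASSES.items()
    )
```

An allow-list is used on purpose: under db_write_policy = NONE, any target that is not positively recognized is refused.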

5. Reconstruction property (verifiable; feeds doc 4 V-13)

reconstruction_contract:
  concatenating every candidate unit's normalized_text in canonical document order
  MUST reproduce the enacted-only subset of the snapshot region exactly; the excluded
  records (📋 Điều 44, 📝 Điều 34, ⛔ obsolete ×2) + non-content records account for
  the remainder, so:
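    enacted_text  ⊕  excluded_text  ⊕  noncontent_text  ==  full snapshot region
  where ⊕ merges the three span sets in document order (the groups interleave; this
  is not end-to-end concatenation of the three groups), with no gap and no overlap.
  This makes "no silent drop" mechanically checkable.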
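
The reconstruction property reduces to a tiling check over byte spans. The sketch below assumes each record is reduced to a dict with `byte_start` and an exclusive `byte_end`; the helper names `check_coverage` and `reconstruct` are hypothetical.

```python
def check_coverage(region: bytes, records: list[dict]) -> list[str]:
    """Report every gap or overlap among the spans of all record kinds.

    `records` mixes candidate, excluded and non-content records; an empty
    result means the spans tile the region exactly.
    """
    problems: list[str] = []
    cursor = 0  # first byte not yet covered
    for rec in sorted(records, key=lambda r: r["byte_start"]):
        start, end = rec["byte_start"], rec["byte_end"]
        if start > cursor:
            problems.append(f"gap: bytes {cursor}..{start} belong to no record")
        elif start < cursor:
            problems.append(f"overlap: record at {start} starts before {cursor}")
        cursor = max(cursor, end)
    if cursor != len(region):
        problems.append(f"coverage ends at {cursor}, region length is {len(region)}")
    return problems


def reconstruct(region: bytes, records: list[dict]) -> bytes:
    """Concatenate the record substrings in document order."""
    ordered = sorted(records, key=lambda r: r["byte_start"])
    return b"".join(region[r["byte_start"]:r["byte_end"]] for r in ordered)
```

When `check_coverage` returns no problems, `reconstruct(region, records) == region` holds, which is the equality stated above; a silently dropped unit shows up as a reported gap.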

6. Statement

  • QG5 satisfied: manifest is artifact-only; leaf/segmentation floor = DIEU; candidates are effective_status=enacted only; excluded nodes itemised with reason; canonical address is ICX-CONST/<path> with status excluded; source_span + heading + title + normalized_text + provenance specified; reconstruction property defined.
  • Allowed-output folder + 5 file classes defined; ALL DB writes forbidden.
  • Nothing executed; no manifest produced — this is the emission schema only.
  • doc 3 of 5; STOP after 5 docs → route GPT/User. Self-advance PROHIBITED.

Companion docs: operational-framing (1), matcher-and-status-design (2), command-and-verification-plan (4), entrypoint-design-report (5).
