KB-37A8

dot-iu-cutter v0.1 — P0-2 manifest_envelope + manifest_unit_block Migration Design

Revision 1
Tags: dot-iu-cutter, migration-design, p0-2, manifest, child-rows-jsonb-hybrid, no-ddl, rev5d

Date: 2026-05-15
Status: P0 MIGRATION DESIGN, Item 3 of 6 (joint per Đ44 Step 1)
Scope: DESIGN ONLY. No DDL, no SQL, no CREATE/ALTER TABLE, no column DDL, no migration execution, no PG mutation.
Master: migration-design/dot-iu-cutter-v0.1-p0-migration-design-master-2026-05-15.md


1. Purpose

P0-2 establishes the persistent PG SSOT for the cutter manifest, which is the central artifact connecting MARK → REVIEW → CUT → VERIFY → REPORT. The manifest carries the segmentation decision in versioned, diffable, reviewable form (Đ38 manifest-as-code). Per Decision 5 (child rows + JSONB hybrid), the manifest is materialized as:

  • manifest_envelope (one row per manifest version)
  • manifest_unit_block (one row per IU per manifest version; composite identity)

2. Source Design References

  • D2 Manifest and Operator Contract — §4.1 (manifest-as-code), §4.2 (full field set), §4.3 (boundary rules), §4.4 (vocabulary), §4.5 (body source policy), §4.6 (review contract), §4.8 (risk gating), §4.9 (persistence), §4.11 (diffability), §6 (schema gaps).
  • D1 Operational Design — §4.4 (MARK), §4.5 (REVIEW), §4.6 (CUT).
  • D6 Assembly Axes — §4.2 (axis-1 metadata), §4.3 (axis-2 metadata).
  • User Decision Confirmation §4.5 — Decision 5 (Child rows + JSONB hybrid).
  • Đ44 Step 1 outcome — manifest_envelope + manifest_unit_block ratified_with_notes (composite identity accepted).
  • Đ24 Step 1 outcome — section_type/unit_kind/body_source_policy/collision_status/risk_class ratified.
  • P0 Schema Planning §5.2 P0-2 detail.

3. Logical Object / Table Intent

Primary tables:

  • manifest_envelope — header row per manifest version
  • manifest_unit_block — per-IU row, FK to envelope, composite identity (envelope_id + unit_local_id)

  • Target DB: directus.
  • Target Schema: TAC schema (existing) OR new schema class for manifest family (open decision §9 item 1).
  • Target Layer: Kho (SSOT for manifest decisions).
  • Authority pattern: PG SSOT; KB markdown export = mirror, NEVER authority.

4. Proposed Fields at Conceptual Level

4.1 manifest_envelope

| Field name | Type-class | Nullable | Notes |
|---|---|---|---|
| manifest_id | uuid OR text (deterministic) | NO | primary identifier |
| manifest_version | text (semver) | NO | semver per Đ38 versioning |
| source_path | text | NO | KB path or TAC source identifier |
| source_revision | text | NO | exact revision marker at MARK time |
| source_kind | enum-ref to Đ24 | NO | values: law / design / requirements / code / report / runbook / ... |
| tool_revision | text | NO | cutter revision emitting this manifest |
| emitted_by | text actor | NO | system/agent/user identifier |
| emitted_at | timestamp UTC | NO | creation time |
| c1a_rule_refs | JSONB | YES | list of applicable SR/OD-PILOT/NL/CI references at manifest level |
| body_source_policy_default | enum-ref to Đ24 (inline/container/referenced/generated) | YES | manifest-level default; override per unit |
| collision_status | enum-ref to Đ24 (none/prior_cut_present/supersedes) | NO | per D1 §4.3 |
| risk_class | enum-ref to Đ24/Đ32 (low/standard/high) | NO | risk gating per D2 §4.8 |
| review_required | boolean | NO | sticky once set per D2 §4.8 |
| state | enum-ref | NO | values: draft / under_review / pass / fail / needs_human / cut_in_progress / cut_complete / rolled_back |
| superseded_by_manifest_id | FK to manifest_envelope | YES | for re-MARK chains |
| prior_manifest_id | FK to manifest_envelope | YES | for versioning diffs |
| report_summary | JSONB | YES | populated at REPORT stage; structured summary |
| decision_backlog_root_entry_id | FK to decision_backlog_entry (P0-5) | YES | governance trail anchor |

4.2 manifest_unit_block

Composite identity: (manifest_id, unit_local_id).

| Field name | Type-class | Nullable | Notes |
|---|---|---|---|
| manifest_id | FK to manifest_envelope | NO | composite identity part 1 |
| unit_local_id | text | NO | within-manifest unit identifier; composite identity part 2 |
| block_order | integer | NO | render_order within manifest |
| canonical_address | text | YES | references tac_logical_unit.canonical_address (P0-1) when unit is materialized; can be pre-populated at MARK |
| tac_logical_unit_id | FK to tac_logical_unit | YES | null until CUT; set on CUT |
| source_span_start | integer OR JSONB (byte/line) | NO | open decision §9 item 5 (unit form) |
| source_span_end | integer OR JSONB | NO | same as above |
| title | text | NO | C1A "clear title" requirement |
| section_type | enum-ref to Đ24 | NO | from controlled vocab |
| unit_kind | enum-ref to Đ24 | NO | from controlled vocab |
| parent_unit_local_id | text | YES | composite-aware parent reference (within-manifest) |
| parent_tac_logical_unit_id | FK to tac_logical_unit | YES | cross-manifest parent if exists |
| hierarchy_depth | integer | NO | depth in parent chain |
| body_source_policy | enum-ref to Đ24 | NO | inline / container / referenced / generated |
| c1a_rule_refs_unit | JSONB | YES | per-unit C1A rule references |
| three_question_test_result | JSONB | NO | structured: {clear_title: bool+note, independently_editable: bool+note, not_too_hard_to_edit: bool+note} |
| cut_reason | text | YES | free text + structured tag set per D2 §4.2 |
| cut_reason_tags | JSONB (array of tag strings) | YES | structured tag set |
| confidence | numeric OR enum-ref (banded) | YES | open decision §9 item 6 |
| length_flag | enum-ref | NO | values: within_band / over / under |
| edge_readiness_notes | JSONB | YES | axis-2 hooks per D2 §4.2, D6 §4.5 |
| candidate_edges | JSONB | YES | pre-marked edges to existing units; references universal_edges first per Đ39 |
| review_flags | JSONB (array) | YES | explicit reviewer-attention items |
| semantic_role | enum-ref to Đ24 (P3-vocab placeholder) | YES | conceptual function within domain |
| birth_gate_class | enum-ref to Đ0-G | YES | birth gate readiness flag |
| authority | enum-ref to Đ24 group 10 | YES | inherited from source or explicitly set |
| block_state | enum-ref | NO | values: marked / reviewed_pass / reviewed_fail / reviewed_needs_human / cut / superseded |
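
The three_question_test_result payload is the only NOT NULL JSONB field on the block, so an application-layer shape check is a natural companion to it. The following is a minimal Python sketch of such a check, not DDL and not a ratified contract. It assumes each "bool+note" entry is stored as an object with a boolean `pass` key and an optional string `note` key; those two key names and the function name are hypothetical.

```python
REQUIRED_QUESTIONS = ("clear_title", "independently_editable", "not_too_hard_to_edit")


def validate_three_question_result(payload):
    """Return a list of problems; an empty list means the payload is well-formed.

    Assumed shape (hypothetical): {question: {"pass": bool, "note": str}}.
    """
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    problems = []
    for key in REQUIRED_QUESTIONS:
        entry = payload.get(key)
        if not isinstance(entry, dict):
            problems.append(f"{key}: missing or not an object")
            continue
        if not isinstance(entry.get("pass"), bool):
            problems.append(f"{key}: 'pass' must be a boolean")
        if not isinstance(entry.get("note", ""), str):
            problems.append(f"{key}: 'note' must be a string")
    extra = sorted(set(payload) - set(REQUIRED_QUESTIONS))
    if extra:
        problems.append(f"unexpected keys: {extra}")
    return problems
```

Returning a list of problems rather than raising lets a MARK-stage caller attach every finding to review_flags in one pass.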

5. Field Ownership / Vocabulary Dependency

| Field family | Vocabulary owner |
|---|---|
| source_kind, section_type, unit_kind, body_source_policy, collision_status, risk_class | Đ24 Step 1 ratified |
| authority | Đ24 Step 1 ratified (cross-law with Đ0-G) |
| state (envelope), block_state (block) | cutter-local v0.1; recommend Đ24 confirm path |
| length_flag enum | cutter-local v0.1 |
| cut_reason_tags | cutter-local + Đ24 ratification for high-frequency tags |
| semantic_role | Đ24 Step 3 (P3-deferred); placeholder values acceptable per P0 master §9 |
| birth_gate_class | Đ0-G |
| c1a_rule_refs* | references C1A authority (Đ38-trien-khai); references only, no new vocab |
| candidate_edges payload | Đ39 universal_edges-first; payload shape is JSONB-flexible per Đ44 A.6 #3 |

6. Child Rows + JSONB Hybrid Pattern

Per Decision 5 (ratified at Council Ratification Outcome §5.5):

hybrid_pattern:
  child_rows (first-class columns on manifest_unit_block):
    - manifest_id, unit_local_id (composite identity)
    - block_order, canonical_address, tac_logical_unit_id
    - source_span_start, source_span_end
    - title, section_type, unit_kind, parent_unit_local_id, parent_tac_logical_unit_id, hierarchy_depth, body_source_policy
    - length_flag, confidence, semantic_role, birth_gate_class, authority, block_state
    rationale: queryable; index-friendly; diff-clean; aligns with axis-1/axis-2 metadata contract (D6)
  jsonb_columns (flex / vocab-churn):
    - c1a_rule_refs_unit
    - three_question_test_result
    - cut_reason_tags
    - edge_readiness_notes
    - candidate_edges
    - review_flags
    rationale: shape may evolve; PROV/PROV-style payload; D2 §4.2 explicitly lists these as structured-flexible

JSONB acceptance basis: Đ44 outcome A.6 #3 — JSONB IS acceptable for G5/G6 group fields in v0.1.
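
To make the split concrete, the sketch below routes the keys of one in-memory block record into first-class column values and serialized JSONB payloads, using the two field lists from the hybrid_pattern block above. It is an illustrative Python sketch only; the function name and the third "unclassified" bucket are assumptions. The bucket exists because fields such as cut_reason appear in §4.2 but in neither list of §6, and the sketch does not decide their placement.

```python
import json

# Field lists copied from the hybrid_pattern block in this section.
FIRST_CLASS = {
    "manifest_id", "unit_local_id", "block_order", "canonical_address",
    "tac_logical_unit_id", "source_span_start", "source_span_end", "title",
    "section_type", "unit_kind", "parent_unit_local_id",
    "parent_tac_logical_unit_id", "hierarchy_depth", "body_source_policy",
    "length_flag", "confidence", "semantic_role", "birth_gate_class",
    "authority", "block_state",
}
JSONB_FIELDS = {
    "c1a_rule_refs_unit", "three_question_test_result", "cut_reason_tags",
    "edge_readiness_notes", "candidate_edges", "review_flags",
}


def split_block(record):
    """Split a block record into (columns, jsonb_payloads, unclassified)."""
    columns, payloads, unclassified = {}, {}, {}
    for key, value in record.items():
        if key in FIRST_CLASS:
            columns[key] = value
        elif key in JSONB_FIELDS:
            # sort_keys gives a stable serialization, which keeps diffs clean.
            payloads[key] = json.dumps(value, sort_keys=True)
        else:
            unclassified[key] = value
    return columns, payloads, unclassified
```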

7. Composite Identity Pattern (Đ44 outcome A.2 note 1)

composite_identity:
  pattern: (manifest_id, unit_local_id)
  uniqueness: within a manifest_envelope, unit_local_id is unique
  cross_manifest_uniqueness: NO (different manifest versions may have same unit_local_id values for unchanged units)
  rationale:
    - allows diff between manifest versions to track unit-by-unit changes
    - keeps unit_local_id stable across MARK iterations of the same source
    - canonical_address (P0-1) is the cross-manifest stable identity
relationship_to_tac_logical_unit:
  pre_cut: manifest_unit_block exists; tac_logical_unit_id is NULL
  post_cut: tac_logical_unit_id populated; the block now references the materialized unit
  rollback: tac_logical_unit_id may be unset / supersession recorded
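
The two properties above (uniqueness only within a manifest, and a unit_local_id that stays stable across versions so that versions can be diffed) can be sketched in Python as follows. This is a minimal illustration over in-memory dicts, not a proposal for how the diff view of §9 item 9 is built; the function names are hypothetical.

```python
def index_blocks(blocks):
    """Index blocks by (manifest_id, unit_local_id); reject duplicates."""
    index = {}
    for block in blocks:
        key = (block["manifest_id"], block["unit_local_id"])
        if key in index:
            raise ValueError(f"duplicate composite identity: {key}")
        index[key] = block
    return index


def _content(block):
    # Ignore manifest_id so an unchanged unit compares equal across versions.
    return {k: v for k, v in block.items() if k != "manifest_id"}


def diff_manifests(old_blocks, new_blocks):
    """Diff two manifest versions unit by unit, keyed on unit_local_id."""
    old = {b["unit_local_id"]: b for b in old_blocks}
    new = {b["unit_local_id"]: b for b in new_blocks}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            uid for uid in old.keys() & new.keys()
            if _content(old[uid]) != _content(new[uid])
        ),
    }
```

The same unit_local_id may appear under two manifest_id values without conflict, which is why index_blocks keys on the pair while diff_manifests keys on unit_local_id alone.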

8. Lifecycle

[Envelope lifecycle]
draft → under_review → (pass | fail | needs_human)
   ↓
pass → cut_in_progress → cut_complete (terminal unless rolled back)
   ↓                           ↓
fail → terminal              rolled_back (via P0-3 rollback_key)
needs_human → escalated → (resolved to pass/fail/superseded)

[Block lifecycle]
marked → reviewed_pass | reviewed_fail | reviewed_needs_human
   ↓
reviewed_pass → cut (when envelope cut_complete)
   ↓
cut → (terminal | superseded by future MARK)

State transitions are logged via Đ38 versioning (envelope manifest_version bumps on substantive changes).
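
One reading of the two lifecycle diagrams can be written down as transition tables, sketched here in Python. Two assumptions are made and should be checked against the ratified lifecycle: the intermediate "escalated" step and the "superseded" envelope outcome are omitted because they are not among the envelope state values in §4.1, so needs_human resolves directly to pass or fail; and rolled_back is reachable only from cut_complete.

```python
# Allowed next states per current state (assumed reading of the diagrams).
ENVELOPE_TRANSITIONS = {
    "draft": {"under_review"},
    "under_review": {"pass", "fail", "needs_human"},
    "needs_human": {"pass", "fail"},
    "pass": {"cut_in_progress"},
    "cut_in_progress": {"cut_complete"},
    "cut_complete": {"rolled_back"},
    "fail": set(),
    "rolled_back": set(),
}

BLOCK_TRANSITIONS = {
    "marked": {"reviewed_pass", "reviewed_fail", "reviewed_needs_human"},
    "reviewed_pass": {"cut"},
    "cut": {"superseded"},
    "reviewed_fail": set(),
    "reviewed_needs_human": set(),
    "superseded": set(),
}


def can_transition(table, current, target):
    """Return True if moving from current to target is allowed by the table."""
    return target in table.get(current, set())
```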

9. Open Decisions

  1. Schema placement — TAC schema (existing) OR new schema class for manifest family (e.g. manifest). Recommendation: new schema class to avoid TAC pollution; Đ44 + Đ33/Đ43 confirm.
  2. Composite identity enforcement — PG composite primary key OR app-level. Recommendation: PG composite PK for (manifest_id, unit_local_id). (Decision affects FK design of dependents.) Constraint DDL is FUTURE.
  3. source_revision shape — git commit SHA / KB doc revision number / timestamp / hybrid. Recommendation: text accepting any of these; format-version aware.
  4. source_span unit — byte offsets / line numbers / AST nodes / canonical text span. Cross-link with P0-4 §9 axis-1 drift count unit (must be consistent). Recommendation: byte offsets primary, with optional line-number companion field for human-friendly review. Final decision deferred until P0-4 axis-1 drift unit is decided in parallel.
  5. source_span_start/end as integers OR JSONB — integers if byte-offset only; JSONB if hybrid (byte + line). Recommendation: JSONB to future-proof; query-perf cost mitigated by indexes on byte offsets if needed.
  6. confidence shape — numeric (0–1) OR banded enum (low/medium/high/very_high). Recommendation: numeric for richer signals; banded enum derivable. Đ44 A.6 #5 first-class column policy.
  7. block_state and state enums — cutter-local OR Đ24-ratified. Recommendation: Đ24 confirmation path; cutter-local v0.1 with vocabulary-gap routing to Đ24 if extensions emerge.
  8. cut_reason_tags JSONB shape — free tag array OR structured {tag_id, tag_label}. Recommendation: free array v0.1; structured at Đ24 ratification of high-frequency tags.
  9. Manifest diff materialization — view OR materialized view OR computed at query. Recommendation: view v0.1; materialized view FUTURE optimization. Diff is consumed by REVIEW + audit (D2 §4.11).
  10. report_summary JSONB shape — fully open per Đ44 A.6 #3. Final shape pending REPORT design (touched in P0-4 verify_result).
  11. decision_backlog_root_entry_id — should manifest_envelope have a single root backlog entry, OR multiple per-event backlog entries? Recommendation: single root entry per manifest with FK chain via dependency edges to specific events.
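
For items 4 and 5, the recommended hybrid (byte offsets primary, line numbers as a human-friendly companion) could take a shape like the one sketched below in Python. The key names, the 1-based line numbering, and the exclusive end offset are all assumptions made for illustration; none of them is decided by this design.

```python
def span_with_lines(source, start, end):
    """Build a hybrid span for source[start:end] (end exclusive, assumed).

    Byte offsets are primary; 1-based line numbers are derived from them.
    """
    if not 0 <= start <= end <= len(source):
        raise ValueError("span out of range")
    # Line of a byte = number of newlines strictly before it, plus one.
    line_start = source.count(b"\n", 0, start) + 1
    last_byte = max(start, end - 1)
    line_end = source.count(b"\n", 0, last_byte) + 1
    return {
        "byte_start": start,
        "byte_end": end,
        "line_start": line_start,
        "line_end": line_end,
    }
```

Because the line numbers are derived from the byte offsets and the source bytes, they can be recomputed at VERIFY time instead of being trusted, which keeps a single unit authoritative for drift checks.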

10. Dependencies

upstream_dependencies:
  governance:
    - Council Ratification Outcome G-1, G-2, G-3, G-5 ratified_with_notes
    - Đ44 Step 1 manifest_envelope + manifest_unit_block families ratified_with_notes
    - Đ24 Step 1 vocabulary (section_type/unit_kind/body_source_policy/collision_status/risk_class/authority) ratified
  schema:
    - P0-5 decision_backlog_entry exists (governance trail anchor FK)
    - P0-1 canonical_address available on tac_logical_unit (block FK + identity reference)
  no_data_dependency_on_p0_3_p0_4_p0_6_in_design_phase: true
downstream_dependents:
  - P0-6 review_decision (FK to manifest_envelope + manifest_unit_block; review verdict per block)
  - P0-3 cut_change_set (FK to manifest_envelope; carries change-set generated from PASSed manifest)
  - P0-4 verify_result (FK to manifest_envelope + cut_change_set)
  - F2 Health/Correction (D3): operates on cut units; manifest is the decision audit trail
operational_dependencies:
  - MARK stage implementation that writes manifest_envelope + manifest_unit_block rows (FUTURE)
  - REVIEW stage that mutates state per outcome (FUTURE)
  - CUT stage that materializes tac_logical_unit_id (FUTURE)
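
The schema-level relations listed above imply a partial order among the P0 items. As a quick consistency check, the Python sketch below feeds them to the standard-library graphlib.TopologicalSorter. Only the edges stated in this section are encoded; the resulting order is illustrative and does not schedule any migration.

```python
from graphlib import TopologicalSorter

# Each item maps to the items it depends on (edges taken from this section).
DEPENDS_ON = {
    "P0-2": {"P0-5", "P0-1"},  # backlog anchor FK, canonical_address
    "P0-6": {"P0-2"},          # review_decision references envelope and block
    "P0-3": {"P0-2"},          # cut_change_set references envelope
    "P0-4": {"P0-2", "P0-3"},  # verify_result references envelope and change-set
}


def migration_order():
    """Return one valid order in which dependencies come first."""
    return list(TopologicalSorter(DEPENDS_ON).static_order())
```

TopologicalSorter raises graphlib.CycleError if the declared dependencies ever become circular, so the same table doubles as a cycle check.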

11. Risks

| Risk | Severity | Mitigation in this design |
|---|---|---|
| Composite identity FK propagation complexity | Standard | composite-FK pattern is PG-supported; downstream items (P0-3, P0-4, P0-6) carry (manifest_id, unit_local_id) where needed |
| source_span unit decision drift between MARK and VERIFY | Standard | cross-link with P0-4 §9 item 1; decide jointly |
| JSONB schema drift across cutter versions | Standard | manifest_version semver + tool_revision tracks schema epoch; v0.1 application-layer validation |
| Vocabulary gap on section_type / unit_kind | Standard | vocabulary_gap routing to G-2 backlog (Đ24 group 11 ratified) |
| Manifest rollback orphaning blocks | Standard | rollback via P0-3 change-set; blocks remain queryable; tac_logical_unit_id unset on rollback |
| Diff cost on large manifests | Low (v0.1) | diff view; materialized view FUTURE optimization |
| review_required = true becoming sticky and over-escalating | Standard | per D2 §4.8 stickiness is intentional; D4 capability intake tunes |
| Backfill of existing manifests (if any) | Low | v0.1 cutter does not have prior manifests in PG; bootstrap is clean |
| report_summary JSONB size unbounded | Standard | size cap policy FUTURE; v0.1 application-level discipline |
| Cross-manifest unit_local_id collision misinterpreted | Standard | composite identity is per-manifest; documentation discipline |

12. Đ32 Risk Review Notes

proposed_risk_class: Standard
review_inputs_for_dieu32:
  - logical design content (this document)
  - hybrid pattern justification (§6)
  - composite identity pattern (§7)
  - source_span unit decision dependency on P0-4
  - cross-law vocabulary dependencies (Đ24 Step 1 ratified)
  - migration execution preconditions:
    - P0-5 decision_backlog_entry migrated first (this design treats P0-5 as upstream)
    - P0-1 canonical_address migrated and backfilled to a usable state
    - schema class placement decided (open §9 item 1)
    - composite-PK strategy decided (open §9 item 2)
    - source_span unit decided (open §9 item 4 — joint with P0-4)
    - backup of directus before migration
review_outputs_expected:
  - Đ32 approval / approval_with_notes
  - Đ44 final confirmation on composite identity at operational ratification
  - Đ24 final confirmation on `state` / `block_state` if elevated from cutter-local
  - migration execution preconditions confirmed
review_authority: Đ32 council + Đ44 family registry custodian + Đ24 vocabulary owner co-sign
review_phase: NOT_STARTED

Special Đ32 attention:

  • Vocabulary leakage: section_type / unit_kind enums must be FK to Đ24 lookup (or PG enum with strict check); silent text values are a vocabulary-discipline violation per criterion 39.
  • source_span correctness: wrong span unit → round-trip drift → P0-4 VERIFY fails systemically. Critical Đ32 attention.
  • review_required stickiness: once HIGH risk_class is set, review_required must stay true even on re-MARK; Đ32 to confirm at audit.
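
The stickiness rule in the last bullet reduces to a logical OR between the previous flag and the current risk class. A minimal Python sketch, assuming the re-MARK step has access to the prior envelope's review_required value (the function and parameter names are hypothetical):

```python
def next_review_required(prior_review_required, risk_class):
    """Sticky flag: once true it stays true; a high risk_class sets it."""
    return prior_review_required or risk_class == "high"
```

For example, a re-MARK that lowers risk_class from high to standard still yields review_required = true, because the prior value carries forward.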

13. Explicit Confirmation

no_ddl_written: true
no_sql_written: true
no_create_table_or_alter_table_in_this_document: true
no_column_ddl_in_this_document: true
no_index_ddl: true
no_constraint_ddl_in_this_document: true
no_composite_pk_ddl_in_this_document: true
no_migration_executed: true
no_pg_mutation: true
no_qdrant_mutation: true
no_data_writes: true
no_implementation_planning: true
no_existing_file_modified: true
output_form: logical_design_only
knowledge/dev/laws/dieu44-trien-khai/migration-design/dot-iu-cutter-v0.1-p0-2-manifest-envelope-unit-block-migration-design-2026-05-15.md