dot-iu-cutter v0.1 — P0-2 manifest_envelope + manifest_unit_block Migration Design
Date: 2026-05-15
Status: P0 MIGRATION DESIGN, Item 3 of 6 (joint per Đ44 Step 1)
Scope: DESIGN ONLY. No DDL, no SQL, no CREATE/ALTER TABLE, no column DDL, no migration execution, no PG mutation.
Master: migration-design/dot-iu-cutter-v0.1-p0-migration-design-master-2026-05-15.md
1. Purpose
P0-2 establishes the persistent PG SSOT for the cutter manifest, which is the central artifact connecting MARK → REVIEW → CUT → VERIFY → REPORT. The manifest carries the segmentation decision in versioned, diffable, reviewable form (Đ38 manifest-as-code). Per Decision 5 (child rows + JSONB hybrid), the manifest is materialized as:
- `manifest_envelope` (one row per manifest version)
- `manifest_unit_block` (one row per IU per manifest version; composite identity)
2. Source Design References
- D2 Manifest and Operator Contract — §4.1 (manifest-as-code), §4.2 (full field set), §4.3 (boundary rules), §4.4 (vocabulary), §4.5 (body source policy), §4.6 (review contract), §4.8 (risk gating), §4.9 (persistence), §4.11 (diffability), §6 (schema gaps).
- D1 Operational Design — §4.4 (MARK), §4.5 (REVIEW), §4.6 (CUT).
- D6 Assembly Axes — §4.2 (axis-1 metadata), §4.3 (axis-2 metadata).
- User Decision Confirmation §4.5 — Decision 5 (Child rows + JSONB hybrid).
- Đ44 Step 1 outcome — manifest_envelope + manifest_unit_block ratified_with_notes (composite identity accepted).
- Đ24 Step 1 outcome — section_type/unit_kind/body_source_policy/collision_status/risk_class ratified.
- P0 Schema Planning §5.2 P0-2 detail.
3. Logical Object / Table Intent
Primary tables:
- `manifest_envelope`: header row per manifest version
- `manifest_unit_block`: per-IU row, FK to envelope, composite identity (manifest_id + unit_local_id)
- Target DB: directus.
- Target schema: TAC schema (existing) OR a new schema class for the manifest family (open decision §9 item 1).
- Target layer: Kho (SSOT for manifest decisions).
- Authority pattern: PG SSOT; the KB markdown export is a mirror, NEVER authority.
4. Proposed Fields at Conceptual Level
4.1 manifest_envelope
| Field name | Type-class | Nullable | Notes |
|---|---|---|---|
| `manifest_id` | uuid OR text (deterministic) | NO | primary identifier |
| `manifest_version` | text (semver) | NO | semver per Đ38 versioning |
| `source_path` | text | NO | KB path or TAC source identifier |
| `source_revision` | text | NO | exact revision marker at MARK time |
| `source_kind` | enum-ref to Đ24 | NO | values: law / design / requirements / code / report / runbook / ... |
| `tool_revision` | text | NO | cutter revision emitting this manifest |
| `emitted_by` | text actor | NO | system/agent/user identifier |
| `emitted_at` | timestamp UTC | NO | creation time |
| `c1a_rule_refs` | JSONB | YES | list of applicable SR/OD-PILOT/NL/CI references at manifest level |
| `body_source_policy_default` | enum-ref to Đ24 (inline/container/referenced/generated) | YES | manifest-level default; override per unit |
| `collision_status` | enum-ref to Đ24 (none/prior_cut_present/supersedes) | NO | per D1 §4.3 |
| `risk_class` | enum-ref to Đ24/Đ32 (low/standard/high) | NO | risk gating per D2 §4.8 |
| `review_required` | boolean | NO | sticky once set per D2 §4.8 |
| `state` | enum-ref | NO | values: draft / under_review / pass / fail / needs_human / cut_in_progress / cut_complete / rolled_back |
| `superseded_by_manifest_id` | FK to manifest_envelope | YES | for re-MARK chains |
| `prior_manifest_id` | FK to manifest_envelope | YES | for versioning diffs |
| `report_summary` | JSONB | YES | populated at REPORT stage; structured summary |
| `decision_backlog_root_entry_id` | FK to decision_backlog_entry (P0-5) | YES | governance trail anchor |
4.2 manifest_unit_block
Composite identity: (manifest_id, unit_local_id).
| Field name | Type-class | Nullable | Notes |
|---|---|---|---|
| `manifest_id` | FK to manifest_envelope | NO | composite identity part 1 |
| `unit_local_id` | text | NO | within-manifest unit identifier; composite identity part 2 |
| `block_order` | integer | NO | render_order within manifest |
| `canonical_address` | text | YES | references tac_logical_unit.canonical_address (P0-1) when unit is materialized; can be pre-populated at MARK |
| `tac_logical_unit_id` | FK to tac_logical_unit | YES | null until CUT; set on CUT |
| `source_span_start` | integer OR JSONB (byte/line) | NO | open decision §9 item 5 (unit form) |
| `source_span_end` | integer OR JSONB | NO | same as above |
| `title` | text | NO | C1A "clear title" requirement |
| `section_type` | enum-ref to Đ24 | NO | from controlled vocab |
| `unit_kind` | enum-ref to Đ24 | NO | from controlled vocab |
| `parent_unit_local_id` | text | YES | composite-aware parent reference (within-manifest) |
| `parent_tac_logical_unit_id` | FK to tac_logical_unit | YES | cross-manifest parent if exists |
| `hierarchy_depth` | integer | NO | depth in parent chain |
| `body_source_policy` | enum-ref to Đ24 | NO | inline / container / referenced / generated |
| `c1a_rule_refs_unit` | JSONB | YES | per-unit C1A rule references |
| `three_question_test_result` | JSONB | NO | structured: {clear_title: bool+note, independently_editable: bool+note, not_too_hard_to_edit: bool+note} |
| `cut_reason` | text | YES | free text + structured tag set per D2 §4.2 |
| `cut_reason_tags` | JSONB (array of tag strings) | YES | structured tag set |
| `confidence` | numeric OR enum-ref (banded) | YES | open decision §9 item 6 |
| `length_flag` | enum-ref | NO | values: within_band / over / under |
| `edge_readiness_notes` | JSONB | YES | axis-2 hooks per D2 §4.2, D6 §4.5 |
| `candidate_edges` | JSONB | YES | pre-marked edges to existing units; references universal_edges first per Đ39 |
| `review_flags` | JSONB (array) | YES | explicit reviewer-attention items |
| `semantic_role` | enum-ref to Đ24 (P3-vocab placeholder) | YES | conceptual function within domain |
| `birth_gate_class` | enum-ref to Đ0-G | YES | birth gate readiness flag |
| `authority` | enum-ref to Đ24 group 10 | YES | inherited from source or explicitly set |
| `block_state` | enum-ref | NO | values: marked / reviewed_pass / reviewed_fail / reviewed_needs_human / cut / superseded |
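The `three_question_test_result` payload is specified only by its three keys. As an illustration (not part of this design, and not DDL), a minimal Python sketch of an application-layer shape check might look like the following. The inner `passed`/`note` sub-keys are an assumption standing in for "bool+note", and `validate_three_question_result` is a hypothetical helper name.

```python
from typing import Any

# The three keys come from the field notes; the {"passed", "note"} inner
# shape is an assumption standing in for "bool+note".
REQUIRED_QUESTIONS = ("clear_title", "independently_editable", "not_too_hard_to_edit")


def validate_three_question_result(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload is well-formed."""
    problems: list[str] = []
    for question in REQUIRED_QUESTIONS:
        entry = payload.get(question)
        if not isinstance(entry, dict):
            problems.append(f"{question}: missing or not an object")
            continue
        if not isinstance(entry.get("passed"), bool):
            problems.append(f"{question}: 'passed' must be a boolean")
        # The note is treated as optional here; when present it must be text.
        if not isinstance(entry.get("note", ""), str):
            problems.append(f"{question}: 'note' must be a string")
    unknown = sorted(set(payload) - set(REQUIRED_QUESTIONS))
    problems.extend(f"{key}: unexpected key" for key in unknown)
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every defect in a block at once.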
5. Field Ownership / Vocabulary Dependency
| Field family | Vocabulary owner |
|---|---|
| `source_kind`, `section_type`, `unit_kind`, `body_source_policy`, `collision_status`, `risk_class` | Đ24 Step 1 ratified |
| `authority` | Đ24 Step 1 ratified (cross-law with Đ0-G) |
| `state` (envelope), `block_state` (block) | cutter-local v0.1; recommend Đ24 confirm path |
| `length_flag` enum | cutter-local v0.1 |
| `cut_reason_tags` | cutter-local + Đ24 ratification for high-frequency tags |
| `semantic_role` | Đ24 Step 3 (P3-deferred); placeholder values acceptable per P0 master §9 |
| `birth_gate_class` | Đ0-G |
| `c1a_rule_refs*` references | C1A authority (Đ38-trien-khai); references only, no new vocab |
| `candidate_edges` payload | Đ39 universal_edges-first; payload shape is JSONB-flexible per Đ44 A.6 #3 |
6. Child Rows + JSONB Hybrid Pattern
Per Decision 5 (ratified at Council Ratification Outcome §5.5):
hybrid_pattern:
  child_rows (first-class columns on manifest_unit_block):
    - manifest_id, unit_local_id (composite identity)
    - block_order, canonical_address, tac_logical_unit_id
    - source_span_start, source_span_end
    - title, section_type, unit_kind, parent_unit_local_id, parent_tac_logical_unit_id, hierarchy_depth, body_source_policy
    - length_flag, confidence, semantic_role, birth_gate_class, authority, block_state
    rationale: queryable; index-friendly; diff-clean; aligns with axis-1/axis-2 metadata contract (D6)
  jsonb_columns (flex / vocab-churn):
    - c1a_rule_refs_unit
    - three_question_test_result
    - cut_reason_tags
    - edge_readiness_notes
    - candidate_edges
    - review_flags
    rationale: shape may evolve; PROV/PROV-style payload; D2 §4.2 explicitly lists these as structured-flexible
JSONB acceptance basis: Đ44 outcome A.6 #3 — JSONB IS acceptable for G5/G6 group fields in v0.1.
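To make the split concrete, the following Python sketch partitions one in-memory block record into first-class column values and JSONB payloads. It is illustrative only and sits outside the scope of this design; `split_block` is a hypothetical helper, and serializing the payloads with sorted keys is an assumption chosen to keep textual diffs stable.

```python
import json
from typing import Any

# Field names are taken from the jsonb_columns listing in hybrid_pattern.
JSONB_FIELDS = frozenset({
    "c1a_rule_refs_unit",
    "three_question_test_result",
    "cut_reason_tags",
    "edge_readiness_notes",
    "candidate_edges",
    "review_flags",
})


def split_block(block: dict[str, Any]) -> tuple[dict[str, Any], dict[str, str]]:
    """Separate first-class column values from JSONB payloads (serialized as text)."""
    columns = {k: v for k, v in block.items() if k not in JSONB_FIELDS}
    # sort_keys gives a stable serialization, so equal payloads compare equal as text.
    payloads = {
        k: json.dumps(v, sort_keys=True)
        for k, v in block.items()
        if k in JSONB_FIELDS
    }
    return columns, payloads
```

For example, a record carrying `title` and `review_flags` yields `title` in the column dictionary and `review_flags` as a JSON string in the payload dictionary.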
7. Composite Identity Pattern (Đ44 outcome A.2 note 1)
composite_identity:
  pattern: (manifest_id, unit_local_id)
  uniqueness: within a manifest_envelope, unit_local_id is unique
  cross_manifest_uniqueness: NO (different manifest versions may have same unit_local_id values for unchanged units)
  rationale:
    - allows diff between manifest versions to track unit-by-unit changes
    - keeps unit_local_id stable across MARK iterations of the same source
    - canonical_address (P0-1) is the cross-manifest stable identity
relationship_to_tac_logical_unit:
  pre_cut: manifest_unit_block exists; tac_logical_unit_id is NULL
  post_cut: tac_logical_unit_id populated; the block now references the materialized unit
  rollback: tac_logical_unit_id may be unset / supersession recorded
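The two properties above (uniqueness within one manifest, reuse of `unit_local_id` across versions to support diffing) can be sketched in Python as follows. This is an illustration under assumptions, not the cutter's implementation: blocks are modeled as plain dictionaries, and `index_blocks` and `diff_manifests` are hypothetical names.

```python
from typing import Any

Block = dict[str, Any]


def index_blocks(blocks: list[Block]) -> dict[str, Block]:
    """Index one manifest's blocks by unit_local_id, rejecting duplicates."""
    indexed: dict[str, Block] = {}
    for block in blocks:
        key = block["unit_local_id"]
        if key in indexed:
            raise ValueError(f"duplicate unit_local_id within manifest: {key}")
        indexed[key] = block
    return indexed


def _content(block: Block) -> Block:
    # manifest_id necessarily differs between versions, so it is ignored.
    return {k: v for k, v in block.items() if k != "manifest_id"}


def diff_manifests(old: list[Block], new: list[Block]) -> dict[str, list[str]]:
    """Classify unit_local_id values as added, removed, or changed between versions."""
    old_ix, new_ix = index_blocks(old), index_blocks(new)
    shared = old_ix.keys() & new_ix.keys()
    return {
        "added": sorted(new_ix.keys() - old_ix.keys()),
        "removed": sorted(old_ix.keys() - new_ix.keys()),
        "changed": sorted(k for k in shared if _content(old_ix[k]) != _content(new_ix[k])),
    }
```

Because `unit_local_id` is only unique per manifest, the comparison is always between two specific versions; matching units across unrelated manifests would rely on `canonical_address` instead.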
8. Lifecycle
[Envelope lifecycle]
draft → under_review → (pass | fail | needs_human)
↓
pass → cut_in_progress → cut_complete (terminal unless rolled back)
↓ ↓
fail → terminal rolled_back (via P0-3 rollback_key)
needs_human → escalated → (resolved to pass/fail/superseded)
[Block lifecycle]
marked → reviewed_pass | reviewed_fail | reviewed_needs_human
↓
reviewed_pass → cut (when envelope cut_complete)
↓
cut → (terminal | superseded by future MARK)
State transitions are logged via Đ38 versioning (envelope manifest_version bumps on substantive changes).
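As a reading aid, the envelope lifecycle can be expressed as a transition table. The Python sketch below is one interpretation of the diagram and is not normative. The diagram mentions `escalated` and `superseded`, which are not in the envelope `state` enum, so the sketch assumes `needs_human` resolves directly to `pass` or `fail`; `can_transition` is a hypothetical helper.

```python
# Transition table read off the envelope lifecycle diagram. "escalated" and
# "superseded" appear in the diagram but not in the state enum, so this sketch
# (an assumption) models needs_human resolving directly to pass or fail.
ENVELOPE_TRANSITIONS: dict[str, frozenset[str]] = {
    "draft": frozenset({"under_review"}),
    "under_review": frozenset({"pass", "fail", "needs_human"}),
    "needs_human": frozenset({"pass", "fail"}),
    "pass": frozenset({"cut_in_progress"}),
    "cut_in_progress": frozenset({"cut_complete"}),
    "cut_complete": frozenset({"rolled_back"}),
    "fail": frozenset(),         # terminal
    "rolled_back": frozenset(),  # terminal
}


def can_transition(current: str, target: str) -> bool:
    """True if this reading of the diagram permits moving from current to target."""
    return target in ENVELOPE_TRANSITIONS.get(current, frozenset())
```

Unknown states have no outgoing transitions, so a typo in a state name fails closed rather than silently passing.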
9. Open Decisions
1. Schema placement: TAC schema (existing) OR new schema class for manifest family (e.g. `manifest`). Recommendation: new schema class to avoid TAC pollution; Đ44 + Đ33/Đ43 confirm.
2. Composite identity enforcement: PG composite primary key OR app-level. Recommendation: PG composite PK for `(manifest_id, unit_local_id)`. (Decision affects FK design of dependents.) Constraint DDL is FUTURE.
3. `source_revision` shape: git commit SHA / KB doc revision number / timestamp / hybrid. Recommendation: text accepting any of these; format-version aware.
4. `source_span` unit: byte offsets / line numbers / AST nodes / canonical text span. Cross-link with P0-4 §9 axis-1 drift count unit (must be consistent). Recommendation: byte offsets primary, with optional line-number companion field for human-friendly review. Final decision deferred until P0-4 axis-1 drift unit is decided in parallel.
5. `source_span_start`/`end` as integers OR JSONB: integers if byte-offset only; JSONB if hybrid (byte + line). Recommendation: JSONB to future-proof; query-perf cost mitigated by indexes on byte offsets if needed.
6. `confidence` shape: numeric (0–1) OR banded enum (low/medium/high/very_high). Recommendation: numeric for richer signals; banded enum derivable. Đ44 A.6 #5 first-class column policy.
7. `block_state` and `state` enums: cutter-local OR Đ24-ratified. Recommendation: Đ24 confirmation path; cutter-local v0.1 with vocabulary-gap routing to Đ24 if extensions emerge.
8. `cut_reason_tags` JSONB shape: free tag array OR structured `{tag_id, tag_label}`. Recommendation: free array v0.1; structured at Đ24 ratification of high-frequency tags.
9. Manifest diff materialization: view OR materialized view OR computed at query. Recommendation: view v0.1; materialized view FUTURE optimization. Diff is consumed by REVIEW + audit (D2 §4.11).
10. `report_summary` JSONB shape: fully open per Đ44 A.6 #3. Final shape pending REPORT design (touched in P0-4 verify_result).
11. `decision_backlog_root_entry_id`: should manifest_envelope have a single root backlog entry, OR multiple per-event backlog entries? Recommendation: single root entry per manifest with FK chain via dependency edges to specific events.
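The recommendation for `confidence` rests on the banded enum being derivable from a numeric score. A minimal Python sketch of that derivation follows. The band names come from the open decision; the cut points are placeholder assumptions and not ratified values, and `confidence_band` is a hypothetical helper.

```python
# Band names come from the open decision on confidence; the cut points are
# placeholder assumptions, not ratified values.
BAND_FLOORS = (
    (0.9, "very_high"),
    (0.7, "high"),
    (0.4, "medium"),
    (0.0, "low"),
)


def confidence_band(score: float) -> str:
    """Map a numeric confidence in [0, 1] to a band using the assumed floors."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {score!r}")
    for floor, band in BAND_FLOORS:
        if score >= floor:
            return band
    raise AssertionError("unreachable: the last floor is 0.0")
```

Storing the numeric value and deriving the band on read keeps the stored data independent of whichever cut points are eventually ratified.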
10. Dependencies
upstream_dependencies:
  governance:
    - Council Ratification Outcome G-1, G-2, G-3, G-5 ratified_with_notes
    - Đ44 Step 1 manifest_envelope + manifest_unit_block families ratified_with_notes
    - Đ24 Step 1 vocabulary (section_type/unit_kind/body_source_policy/collision_status/risk_class/authority) ratified
  schema:
    - P0-5 decision_backlog_entry exists (governance trail anchor FK)
    - P0-1 canonical_address available on tac_logical_unit (block FK + identity reference)
  no_data_dependency_on_p0_3_p0_4_p0_6_in_design_phase: true
downstream_dependents:
  - P0-6 review_decision (FK to manifest_envelope + manifest_unit_block; review verdict per block)
  - P0-3 cut_change_set (FK to manifest_envelope; carries change-set generated from PASSed manifest)
  - P0-4 verify_result (FK to manifest_envelope + cut_change_set)
  - F2 Health/Correction (D3): operates on cut units; manifest is the decision audit trail
operational_dependencies:
  - MARK stage implementation that writes manifest_envelope + manifest_unit_block rows (FUTURE)
  - REVIEW stage that mutates state per outcome (FUTURE)
  - CUT stage that materializes tac_logical_unit_id (FUTURE)
11. Risks
| Risk | Severity | Mitigation in this design |
|---|---|---|
| Composite identity FK propagation complexity | Standard | composite-FK pattern is PG-supported; downstream items (P0-3, P0-4, P0-6) carry (manifest_id, unit_local_id) where needed |
| `source_span` unit decision drift between MARK and VERIFY | Standard | cross-link with P0-4 §9 item 1; decide jointly |
| JSONB schema drift across cutter versions | Standard | manifest_version semver + tool_revision tracks schema epoch; v0.1 application-layer validation |
| Vocabulary gap on `section_type` / `unit_kind` | Standard | vocabulary_gap routing to G-2 backlog (Đ24 group 11 ratified) |
| Manifest rollback orphaning blocks | Standard | rollback via P0-3 change-set; blocks remain queryable; tac_logical_unit_id unset on rollback |
| Diff cost on large manifests | Low (v0.1) | diff view; materialized view FUTURE optimization |
| `review_required = true` becoming sticky and over-escalating | Standard | per D2 §4.8 stickiness is intentional; D4 capability intake tunes |
| Backfill of existing manifests (if any) | Low | v0.1 cutter does not have prior manifests in PG; bootstrap is clean |
| `report_summary` JSONB size unbounded | Standard | size cap policy FUTURE; v0.1 application-level discipline |
| Cross-manifest `unit_local_id` collision misinterpreted | Standard | composite identity is per-manifest; documentation discipline |
12. Đ32 Risk Review Notes
proposed_risk_class: Standard
review_inputs_for_dieu32:
  - logical design content (this document)
  - hybrid pattern justification (§6)
  - composite identity pattern (§7)
  - source_span unit decision dependency on P0-4
  - cross-law vocabulary dependencies (Đ24 Step 1 ratified)
  - migration execution preconditions:
    - P0-5 decision_backlog_entry migrated first (this design treats P0-5 as upstream)
    - P0-1 canonical_address migrated and backfilled to a usable state
    - schema class placement decided (open §9 item 1)
    - composite-PK strategy decided (open §9 item 2)
    - source_span unit decided (open §9 item 4, joint with P0-4)
    - backup of directus before migration
review_outputs_expected:
  - Đ32 approval / approval_with_notes
  - Đ44 final confirmation on composite identity at operational ratification
  - Đ24 final confirmation on `state` / `block_state` if elevated from cutter-local
  - migration execution preconditions confirmed
review_authority: Đ32 council + Đ44 family registry custodian + Đ24 vocabulary owner co-sign
review_phase: NOT_STARTED
Special Đ32 attention:
- Vocabulary leakage: `section_type`/`unit_kind` enums must be FK to Đ24 lookup (or PG enum with strict check); silent text values are a vocabulary-discipline violation per criterion 39.
- `source_span` correctness: wrong span unit → round-trip drift → P0-4 VERIFY fails systemically. Critical Đ32 attention.
- `review_required` stickiness: once HIGH risk_class is set, review_required must stay true even on re-MARK; Đ32 confirm at audit.
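The stickiness rule for `review_required` is small enough to state as a one-line function. The Python sketch below is illustrative only; `next_review_required` is a hypothetical name, and treating `high` as the only risk class that sets the flag is an assumption based on the attention note above.

```python
def next_review_required(previous: bool, risk_class: str) -> bool:
    """Sticky flag: once true it stays true; a high risk_class sets it.

    risk_class values (low / standard / high) follow the envelope field notes.
    """
    return previous or risk_class == "high"
```

On a re-MARK, `previous` would be the value carried over from the prior manifest version, so a later drop to `standard` does not clear the flag.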
13. Explicit Confirmation
no_ddl_written: true
no_sql_written: true
no_create_table_or_alter_table_in_this_document: true
no_column_ddl_in_this_document: true
no_index_ddl: true
no_constraint_ddl_in_this_document: true
no_composite_pk_ddl_in_this_document: true
no_migration_executed: true
no_pg_mutation: true
no_qdrant_mutation: true
no_data_writes: true
no_implementation_planning: true
no_existing_file_modified: true
output_form: logical_design_only