KB-6547

dot-iu-cutter v0.2 — BR-6 Split/Merge TD Analysis (2026-05-16)



document_path: knowledge/dev/laws/dieu44-trien-khai/v0.2-planning/dot-iu-cutter-v0.2-br-6-split-merge-td-analysis-2026-05-16.md
revision: r1
date: 2026-05-16
author: Agent (Claude Code CLI, Opus 4.7 1M)
phase: v0.2 — BR-6 absorption (PLANNING/ANALYSIS ONLY)
mutation_performed: false
ddl_written: false
manifest_design_started: false
production_migration_allowed: false
status: PENDING_GPT_REVIEW (Agent does not self-close BR-6)

§1 — Source TD Summary

Source: knowledge/dev/laws/dieu44-trien-khai/backlog/td-p1-split-merge-metadata-propagation-gap-2026-05-15.md (Status OPEN, Priority P1, Owner GPT G-2 Backlog Custodian + User.)

v0.1 dot-iu-cutter was designed for first cut only (source document → pieces / miếng). It has no automated pipeline for the two post-cut topology operations:

  1. SPLIT — one already-cut piece → 2+ smaller pieces.
  2. MERGE — 2+ pieces → 1 larger piece.

The TD's core assertion: split/merge is not a text operation — it is a metadata propagation operation. Splitting or merging the text without propagating every dependent metadata facet corrupts addressing, identity, edges, and audit history.

1.1 What the TD says must propagate

On SPLIT:

  • canonical_address: original → 2+ new addresses
  • birth_registry: origin piece marked superseded; new pieces born (khai sinh)
  • universal_edges: re-distributed (which edge belongs to which child)
  • section_type / unit_kind: may change for sub-pieces
  • render_order: recomputed
  • publication_member: updated
  • lifecycle_log: "split event" recorded
  • identity_profile: hierarchy changes

On MERGE:

  • 2+ canonical_address → 1 new address
  • internal edges between merged pieces → deleted
  • external edges → merged, de-duplicated
  • metadata → synthesized rewrite (section_type / unit_kind of the merged piece)
  • birth_registry: old superseded, new born
  • lifecycle_log: "merge event" recorded
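
The MERGE edge rules above (internal edges deleted; external edges merged and de-duplicated) can be sketched in Python as follows. This is only an illustration, not the P1 pipeline: the `Edge` tuple, the `merge_edges` function, and the piece identifiers are hypothetical, and the real universal_edges schema may carry more fields.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Hypothetical minimal edge: source piece, target piece, edge kind."""
    src: str
    dst: str
    kind: str


def merge_edges(edges: list[Edge], merged: set[str], new_id: str) -> list[Edge]:
    """Rewrite edges for a MERGE of the pieces in `merged` into `new_id`."""
    seen: set[Edge] = set()
    result: list[Edge] = []
    for e in edges:
        # Internal edge (both ends inside the merged set): deleted.
        if e.src in merged and e.dst in merged:
            continue
        # External edge: re-point any merged endpoint at the new piece.
        rewritten = Edge(
            new_id if e.src in merged else e.src,
            new_id if e.dst in merged else e.dst,
            e.kind,
        )
        # De-duplicate edges that collapse onto each other after re-pointing.
        if rewritten not in seen:
            seen.add(rewritten)
            result.append(rewritten)
    return result


edges = [
    Edge("A", "B", "cites"),    # internal to the merge
    Edge("A", "X", "cites"),    # external
    Edge("B", "X", "cites"),    # becomes a duplicate of the previous edge
    Edge("Y", "B", "refines"),  # external, inbound
    Edge("X", "Y", "cites"),    # unrelated, kept unchanged
]
merged_edges = merge_edges(edges, {"A", "B"}, "AB")
```

Here `merged_edges` keeps three edges: `AB → X`, `Y → AB`, and the untouched `X → Y`.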

1.2 Detection vs handling (the actual gap)

Detection signals are already designed (Q21/Q22): the co-citation, co-edit, co-retrieval, and edge_density_overlap signals auto-flag coupling, alongside a periodic Segmentation Health Report. What does not exist is the pipeline that runs after a piece is flagged, i.e., the actual split/merge metadata-propagation execution path. BR-6 is therefore a handling gap, not a detection gap.

1.3 TD decision of record

Recorded as a P1 capability: design the SPLIT/MERGE pipeline once the v0.1 first cut is stable. When P0-5 is live, convert this TD into a formal decision_backlog_entry in PG. Linked to Q16 (merge), Q21/Q22 (detection), D3 §4.2 (health signal catalog), and Nguyên tắc 8 (Principle 8): "Cắt không phải quyết định cuối cùng" (a cut is not a final decision).


§2 — What "split/merge metadata propagation" precisely means

It is the requirement that any topology change to an already-cut unit be expressed as a governed, atomic, auditable transition that rewrites the full dependent-metadata closure of the affected units, not merely their text. Concretely it has four properties:

  • Closure: the set of facets that must move is closed — every facet listed in §1.1 must be addressed, or the operation is incomplete by definition.
  • Atomicity: a split/merge is one transition; partial propagation is a corruption state, not an intermediate state.
  • Auditability: predecessor identity is never destroyed — it is superseded with a birth/lifecycle/alias trail forward to successors (Nguyên tắc 8).
  • Governance: a split/merge is a proposal that is reviewed before it is enacted — it is a CUT-class operation, not a free edit.
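
The closure and atomicity properties can be illustrated with a minimal Python sketch. It is not the pipeline design, which is P1 and out of scope here. The facet keys mirror the SPLIT list in §1.1; `check_closure`, `apply_atomically`, `IncompleteProposal`, and the dictionary representation of state are hypothetical names chosen for illustration.

```python
# Facets that a SPLIT must address, taken from the SPLIT list in §1.1.
SPLIT_FACETS = frozenset({
    "canonical_address", "birth_registry", "universal_edges",
    "section_type_unit_kind", "render_order", "publication_member",
    "lifecycle_log", "identity_profile",
})


class IncompleteProposal(ValueError):
    """Raised when a proposal leaves a required facet unaddressed."""


def check_closure(proposal: dict, required: frozenset = SPLIT_FACETS) -> None:
    """Closure: every required facet must appear in the proposal."""
    missing = sorted(required.difference(proposal))
    if missing:
        raise IncompleteProposal(f"facets not addressed: {missing}")


def apply_atomically(state: dict, proposal: dict,
                     required: frozenset = SPLIT_FACETS) -> dict:
    """Atomicity: validate first, then build the new state on a copy.

    The caller's `state` is never modified, so a rejected proposal
    cannot leave a partially propagated result behind.
    """
    check_closure(proposal, required)
    staged = dict(state)
    staged.update(proposal)
    return staged


state = {facet: "parent" for facet in SPLIT_FACETS}

# A proposal that only rewrites two facets is rejected as a whole.
partial = {"canonical_address": "children", "render_order": "children"}
try:
    apply_atomically(state, partial)
    rejected = False
except IncompleteProposal:
    rejected = True

# A proposal that addresses every facet yields a complete new state.
full = {facet: "children" for facet in SPLIT_FACETS}
new_state = apply_atomically(state, full)
```

Validating before staging is one simple way to make partial propagation unrepresentable; in a database-backed pipeline the same effect would presumably come from a single transaction.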

P0-2's manifest_envelope / manifest_unit_block are the proposal/grouping surface where a split/merge is expressed and reviewed before any execution. They are therefore the first schema that BR-6 constrains — which is exactly why the closeout GPT review and the scope backlog (§5) require BR-6 to be absorbed before P0-2 DDL.


§3 — Affected Metadata Fields — Classification

Buckets: P0-2 = the manifest schema must structurally carry it now; P1 = execution/propagation logic deferred to the P1 split/merge pipeline; INVARIANT = must be written as a binding rule carried into P0-2 and never violated; GOV = needs an explicit governance decision before P0-2 DDL is frozen.

| # | Field / facet | Classification | Reasoning |
|---|---|---|---|
| 1 | canonical_address | INVARIANT + GOV | SSOT, UNIQUE/NOT NULL/immutable. Split/merge MUST coin new addresses and emit alias rows; predecessor canonical is never mutated or reused (alias-design §6). INVARIANT: manifest must never rewrite a live canonical_address in place. GOV: the address-coining rule for sub-pieces (v1 grammar -P{n} extension vs new sequence) needs a ratified decision. |
| 2 | authority (Phase α col) | GOV + INVARIANT | Children of a split / the result of a merge need an authority value. INVARIANT: a successor cannot acquire higher authority than its predecessor without an explicit re-enactment path. GOV: the inheritance rule (does a split of an enacted unit yield enacted or draft children?) is undecided and must be ratified before P0-2 freezes manifest columns. |
| 3 | canonical_address_format_version (Phase α col) | INVARIANT | Successors inherit the predecessor's format version (canonical-address-v1). No per-operation decision; record as invariant: manifest carries it as a snapshot, never downgrades it. |
| 4 | source_span | P0-2 | A manifest_unit_block is a span. SPLIT partitions a parent span into non-overlapping, gap-free child spans; MERGE unions child spans. The block table MUST carry source_span now or P0-2 is rebuilt. |
| 5 | parent/child hierarchy (identity_profile) | P0-2 (structure) + P1 (re-parenting execution) | manifest_envelope must structurally model the grouping/parentage of its blocks now. The actual hierarchy rewrite engine (re-parenting identity_profile) is P1. |
| 6 | render_order | P0-2 (column) + P1 (recompute logic) | A block ordering field must exist in manifest_unit_block now. The recomputation algorithm on split/merge is P1. |
| 7 | manifest envelope / block | P0-2 | This is P0-2 (v0.2-D-1 / v0.2-D-2). It is the subject, not a dependency. |
| 8 | aliases | INVARIANT + GOV | Split/merge MUST emit previous_canonical / redirect alias rows at execution. INVARIANT: no topology change without an alias trail. GOV: whether the manifest stores an explicit alias linkage or alias rows are emitted purely event-backed at P1. Alias writes are P1 (alias table is live but empty; writes unauthorized). |
| 9 | review decisions (review_decision, P0-6) | P1 / P0-6 + INVARIANT | review_decision is a separate deferred item (v0.2-D-3, P0-6, blocked). INVARIANT: a split/merge proposal is review-gated: the manifest must expose an escalation_ref / review hook now, but the review table itself is deferred. |
| 10 | cut_change_set / verify_result links | INVARIANT + P1 | Split/merge is a CUT-class op: it MUST be expressed as a cut_change_set and proven via verify_result (both v0.1-live, empty). INVARIANT: no out-of-band metadata mutation outside a change-set. The manifest may carry a soft cut_change_set reference column now; actual CUT/VERIFY execution is P1 and currently unauthorized. |
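
The source_span requirement in row 4 (SPLIT yields non-overlapping, gap-free child spans; MERGE unions child spans) can be checked mechanically. The sketch below is a hedged illustration only. It assumes a span is a half-open pair of integer offsets `(start, end)`, which the TD does not specify, and it assumes merged pieces are adjacent so that their union is a single span. `is_valid_split` and `merge_spans` are hypothetical names.

```python
# Assumption: a span is a half-open interval [start, end) of integer offsets.
Span = tuple[int, int]


def is_valid_split(parent: Span, children: list[Span]) -> bool:
    """True if `children` partition `parent` with no gap and no overlap."""
    if len(children) < 2:
        return False  # a split produces 2+ pieces
    ordered = sorted(children)
    if any(start >= end for start, end in ordered):
        return False  # empty or inverted child span
    if ordered[0][0] != parent[0] or ordered[-1][1] != parent[1]:
        return False  # children do not cover the parent's boundaries
    # Each child must begin exactly where the previous one ends.
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


def merge_spans(children: list[Span]) -> Span:
    """Union adjacent child spans into one span (adjacency is assumed)."""
    if len(children) < 2:
        raise ValueError("a merge needs 2+ spans")
    ordered = sorted(children)
    for a, b in zip(ordered, ordered[1:]):
        if a[1] != b[0]:
            raise ValueError(f"spans {a} and {b} are not adjacent")
    return (ordered[0][0], ordered[-1][1])


ok = is_valid_split((0, 100), [(0, 40), (40, 100)])
has_gap = is_valid_split((0, 100), [(0, 40), (50, 100)])
overlaps = is_valid_split((0, 100), [(0, 60), (40, 100)])
merged_span = merge_spans([(40, 100), (0, 40)])
```

With these inputs `ok` is true, `has_gap` and `overlaps` are false, and `merged_span` is `(0, 100)`.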

3.1 Summary counts

  • P0-2 (must carry structurally now): source_span; manifest envelope/block; (+ render_order column, hierarchy structure) → 4 facets touch P0-2 schema columns.
  • P1 (defer execution): re-parenting engine, render_order recompute, alias writes, CUT/VERIFY execution, full split/merge pipeline.
  • INVARIANT (carry as binding rules): canonical_address immutability; mandatory alias trail; authority non-escalation; format_version inheritance; review-gating; change-set-only mutation. → 6 invariants (see §3 rows 1, 2, 3, 8, 9, 10).
  • GOV (needs decision before P0-2 freeze): address-coining rule for sub-pieces; authority inheritance rule; manifest↔alias linkage policy. → 3 governance decisions.
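
Several of the invariants above are simple enough to express as predicate checks on a predecessor/successor pair. The following Python sketch is an assumption-laden illustration, not a ratified rule set: the `draft < enacted` ordering, the record keys (`alias_of` in particular), the `re_enacted` flag, and the placeholder address strings are all hypothetical. GOV-level questions such as the address-coining rule are deliberately left out.

```python
# Hypothetical ordering; the actual authority values and ranks are not ratified.
AUTHORITY_RANK = {"draft": 0, "enacted": 1}


def invariant_violations(pred: dict, succ: dict,
                         re_enacted: bool = False) -> list[str]:
    """Return the invariants that a successor record would violate."""
    violations = []
    # canonical_address immutability: a predecessor address is never reused.
    if succ["canonical_address"] == pred["canonical_address"]:
        violations.append("canonical_address reused")
    # Alias trail: the successor must point back at its predecessor.
    if succ.get("alias_of") != pred["canonical_address"]:
        violations.append("missing alias trail")
    # Authority non-escalation, unless an explicit re-enactment path was taken.
    escalated = (AUTHORITY_RANK[succ["authority"]]
                 > AUTHORITY_RANK[pred["authority"]])
    if escalated and not re_enacted:
        violations.append("authority escalated")
    # format_version inheritance: successors carry the predecessor's version.
    if succ["format_version"] != pred["format_version"]:
        violations.append("format_version not inherited")
    return violations


pred = {"canonical_address": "addr:parent", "authority": "draft",
        "format_version": "canonical-address-v1"}
good_child = {"canonical_address": "addr:child-1", "alias_of": "addr:parent",
              "authority": "draft", "format_version": "canonical-address-v1"}
bad_child = {"canonical_address": "addr:parent", "alias_of": None,
             "authority": "enacted", "format_version": "canonical-address-v1"}

good_result = invariant_violations(pred, good_child)
bad_result = invariant_violations(pred, bad_child)
```

Returning the full list of violations, rather than failing on the first, is a design choice meant to suit a review-gated proposal where a reviewer sees every problem at once.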

§4 — Open Items Surfaced (carried to absorption report)

  • GOV-1: canonical_address coining rule for split children / merge result.
  • GOV-2: authority inheritance rule across split/merge.
  • GOV-3: manifest↔alias linkage (explicit column vs purely event-backed).

These do not block authoring P0-2 structure, but must be resolved before P0-2 DDL is frozen. They are escalated in dot-iu-cutter-v0.2-br-6-absorption-report-2026-05-16.md §Open-Questions.

§5 — Hard Boundaries

no_DDL_written: TRUE
no_SQL_executed: TRUE
no_mutation: TRUE
no_migration: TRUE
no_production_touch: TRUE
no_manifest_design_started: TRUE
BR_6_self_closed: FALSE   (left PENDING GPT review)
output_form: br_6_td_analysis_only

End of BR-6 split/merge TD analysis.
