KB-3DD6

dot-iu-cutter v0.1 — Assembly Axes and Metadata Contract

Revision 1
Tags: dot-iu-cutter, design, assembly-axes, metadata-contract, rev5d

dot-iu-cutter v0.1 — Assembly Axes & Metadata Contract (D6)

Date: 2026-05-15
Status: DESIGN DRAFT
Baseline: rev5d §5, §7.J
Scope: DESIGN ONLY.


1. Purpose

Define the metadata contract every IU must satisfy so the system supports both assembly axes as first-class:

  • Axis-1: Document Reconstruction — render units back to the original document with 0 drift.
  • Axis-2: Semantic Domain Assembly — assemble units across documents by professional domain.

A segmentation decision that supports only one axis is incomplete (rev5d §5.3, P13).

2. Scope

  • Axis-1 metadata fields
  • Axis-2 metadata fields
  • Verification methods per axis
  • Edge readiness hooks (Đ39)
  • UOSL compatibility hooks (Đ44)
  • KG feedback hooks (criterion 19)

Out of scope: thread membership lifecycle (D9); profile mapping in detail (D7); legal review (D10).

3. Dependencies

  • rev5d §5, §7.J, §13.3
  • D1, D2 (manifest carries these fields)
  • D9 (thread consumes axis-2 metadata)
  • C1A (boundaries determined by C1A)
  • Đ24 (vocabulary), Đ39 (universal_edges), Đ44 (UOSL), Đ0-G (birth gate)

4. Key Decisions

4.1 Both Axes Are First-Class (P13; criteria 23, 24)

Every IU must carry metadata sufficient for:

  • Axis-1: reconstruct the source document with 0 drift.
  • Axis-2: participate in cross-document professional-domain assembly.

A unit missing axis-2 metadata is incomplete even if axis-1 round-trip passes.
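The completeness rule above can be sketched as a simple presence check. This is a minimal illustration, not the cutter's actual validation: the field names come from §4.2 and §4.3, while the function names `missing_fields` and `is_complete`, and the choice to represent a unit as a plain dict, are assumptions made for the example.

```python
# Field names are taken from the Axis-1 and Axis-2 tables in this document.
AXIS1_FIELDS = (
    "source_path", "source_revision", "source_span_start", "source_span_end",
    "render_order", "parent_unit_id", "canonical_address",
    "publication_membership", "body_source_policy",
)
# thread_hint is documented as optional, so it is not required here.
AXIS2_FIELDS = (
    "section_type", "unit_kind", "classification_labels", "semantic_role",
    "candidate_edges", "edge_readiness_notes", "universal_edges_compat_flag",
    "vector_projection_readiness", "lifecycle_stage_hint",
)


def missing_fields(unit: dict) -> dict[str, list[str]]:
    """Return the field names absent from the unit, grouped by axis.

    Presence of the key is what counts (assumption): a root unit may carry
    parent_unit_id=None and still be considered to have the field.
    """
    return {
        "axis1": [f for f in AXIS1_FIELDS if f not in unit],
        "axis2": [f for f in AXIS2_FIELDS if f not in unit],
    }


def is_complete(unit: dict) -> bool:
    """A unit is complete only when neither axis has missing fields."""
    gaps = missing_fields(unit)
    return not gaps["axis1"] and not gaps["axis2"]
```

A unit that carries every axis-1 field but no axis-2 field is reported as incomplete, which mirrors the rule that a passing axis-1 round-trip is not sufficient on its own.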

4.2 Axis-1 Metadata (Q31; criterion 23)

Required fields per IU (logical; placement per current TAC schema + gaps):

| Field | Purpose |
| --- | --- |
| source_path | Origin source identifier |
| source_revision | Exact revision at cut time |
| source_span_start / source_span_end | Byte/line span in source |
| render_order | Stable order for reconstruction |
| parent_unit_id | Canonical parent (single canonical parent rule) |
| canonical_address | Stable human-readable address (e.g. "Đ44 §5.3.1") |
| publication_membership | Which publication(s) include this unit |
| body_source_policy | inline / container / referenced / generated |
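As a rough illustration of the logical shape of these fields, the sketch below models them as a typed record. The Python types (integer spans, a tuple of publication ids, `None` for a root unit's parent) and the class names `Axis1Metadata` and `BodySourcePolicy` are assumptions; the document fixes only the field names and the four body_source_policy values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BodySourcePolicy(Enum):
    # The four values listed in the table above.
    INLINE = "inline"
    CONTAINER = "container"
    REFERENCED = "referenced"
    GENERATED = "generated"


@dataclass(frozen=True)
class Axis1Metadata:
    source_path: str
    source_revision: str
    source_span_start: int            # byte or line offset (assumed integer)
    source_span_end: int
    render_order: int
    parent_unit_id: Optional[str]     # None for a root unit (assumption)
    canonical_address: str
    publication_membership: tuple[str, ...]
    body_source_policy: BodySourcePolicy

    def __post_init__(self) -> None:
        # A span that ends before it starts cannot be rendered back.
        if self.source_span_start > self.source_span_end:
            raise ValueError("source_span_start must not exceed source_span_end")
```

Freezing the record reflects the intent that axis-1 values are fixed at cut time; whether the real storage enforces immutability is not stated in this design.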

4.3 Axis-2 Metadata (Q32; criteria 24, 30)

Required fields per IU for semantic domain assembly:

| Field | Purpose |
| --- | --- |
| section_type | Đ24 vocabulary |
| unit_kind | Đ24 vocabulary |
| classification_labels | Đ24-controlled labels |
| semantic_role | Conceptual function within domain |
| candidate_edges | Pre-marked edges to existing units (hook for Đ39 universal_edges) |
| edge_readiness_notes | Why edges are or aren't ready |
| universal_edges_compat_flag | Indicates the unit is shaped for Đ39 reuse |
| vector_projection_readiness | Hook for Qdrant indexing |
| thread_hint | Optional system-discovered or user-directed thread reference |
| lifecycle_stage_hint | law / design / code / report / runbook etc. |

4.4 Verification per Axis (Q33; criterion 5)

Axis-1 verification: mandatory round-trip.

Render units by publication membership, by render_order
  → Compare to source revision content (byte-equivalent or canonical-form equivalent)
  → 0 drift = PASS; any drift = FAIL → rollback (D1 §4.8)
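The round-trip flow above can be sketched as follows. This is a minimal illustration that assumes byte-equivalence only (the canonical-form comparison is not modeled), that each unit's body is available as bytes, and that rollback is handled by the caller. The names `Unit`, `render`, and `round_trip_passes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    render_order: int
    publication_membership: frozenset[str]
    body: bytes


def render(units: list[Unit], publication: str) -> bytes:
    """Concatenate the bodies of the publication's units in render_order."""
    members = [u for u in units if publication in u.publication_membership]
    members.sort(key=lambda u: u.render_order)
    return b"".join(u.body for u in members)


def round_trip_passes(units: list[Unit], publication: str, source: bytes) -> bool:
    """0 drift = PASS: the rendered bytes must equal the source revision."""
    return render(units, publication) == source


units = [
    Unit(2, frozenset({"pub"}), b"world"),
    Unit(1, frozenset({"pub"}), b"hello "),
    Unit(3, frozenset({"other"}), b"!"),   # not a member of "pub"
]
ok = round_trip_passes(units, "pub", b"hello world")
```

Here `ok` is `True` because the two member units, sorted by render_order, reproduce the source exactly; any extra or missing byte would make the check fail.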

Axis-2 verification: semantic assembly test cases.

Define N assembly queries (e.g., "find all design IUs for thread X")
  → Run query → expected canonical units returned
  → Coverage threshold per policy
  → Below threshold → axis-2 FAIL (advisory in v0.1; mandatory hook)

In v0.1, axis-2 verification is a hook whose failure is advisory; structural rejection is reserved until the policy matures. The hook itself is mandatory from day one.
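One way to picture the axis-2 hook is a small test-case runner. The sketch below assumes coverage means the fraction of expected canonical units that a query returns; the document does not define the metric, so that definition, the threshold value, and the names `AssemblyCase`, `coverage`, and `verify_axis2` are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AssemblyCase:
    name: str
    query: Callable[[], set[str]]   # returns the unit ids the query produced
    expected: frozenset[str]        # canonical unit ids that should be returned


def coverage(case: AssemblyCase) -> float:
    """Fraction of expected units actually returned (assumed metric)."""
    if not case.expected:
        return 1.0
    found = case.query() & case.expected
    return len(found) / len(case.expected)


def verify_axis2(cases: list[AssemblyCase], threshold: float) -> dict:
    """Run every case; report failures without blocking (advisory in v0.1)."""
    failing = [c.name for c in cases if coverage(c) < threshold]
    return {
        "passed": not failing,
        "failing_cases": failing,
        "blocking": False,   # v0.1: a failure is reported, never a reject
    }


cases = [
    AssemblyCase(
        name="design-units-for-thread-x",
        query=lambda: {"u1", "u2"},
        expected=frozenset({"u1", "u2", "u3", "u4"}),
    )
]
report = verify_axis2(cases, threshold=0.75)
```

Keeping `blocking` as an explicit field in the report leaves room to promote the hook from advisory to mandatory later without changing its call shape.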

4.5 KG Feedback Hooks (criterion 19)

candidate_edges and edge_readiness_notes are explicit hooks for KG enrichment. Even before universal_edges is fully wired:

  • candidate_edges carries proposed edge targets discovered at MARK time.
  • edge_readiness_notes documents why the edge is a candidate (e.g., similarity, citation, semantic_role alignment).
  • These feed the Semantic Intake Flow (D9 §4.3) once threading is operational.
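A minimal sketch of how the two hooks might be populated at MARK time is shown below. It assumes candidate_edges is a list of target unit ids and edge_readiness_notes is a JSON-friendly mapping from target id to reason; the actual shapes are listed as open schema gaps in §6, and `propose_edge` is a hypothetical helper.

```python
import json


def propose_edge(unit: dict, target_unit_id: str, reason: str) -> None:
    """Record a proposed edge target and why it is a candidate."""
    edges = unit.setdefault("candidate_edges", [])
    notes = unit.setdefault("edge_readiness_notes", {})
    if target_unit_id not in edges:
        edges.append(target_unit_id)   # keep each target once
    notes[target_unit_id] = reason     # the latest reason wins


unit: dict = {}
propose_edge(unit, "u-7", "citation")
propose_edge(unit, "u-9", "semantic_role alignment")

# Both hooks stay JSON-serializable, so they fit a JSONB column or a profile envelope.
payload = json.dumps(unit, sort_keys=True)
```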

4.6 UOSL Compatibility Hooks (Q39 — supporting; criterion 25)

Each unit's metadata is mapped (conceptually, in v0.1) to the UOSL G1–G12 field groups:

| Axis-1/Axis-2 field | UOSL group hint |
| --- | --- |
| canonical_address | G1 (identity) |
| section_type / unit_kind | G2 (classification) |
| parent_unit_id / hierarchy | G3 (relations) |
| publication_membership | G4 (publication) |
| risk_class / authority | G5 (governance state) |
| candidate_edges | Relation Layer / universal_edges |
| semantic_role / labels | G6 (semantics) |

Mappings are documented in D7; gaps recorded.
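The conceptual mapping can be expressed as a lookup table, with anything unmapped surfaced as a gap. This is only a sketch: it reads "labels" as classification_labels and omits "hierarchy" because it is not a named field, both of which are assumptions, and `group_by_uosl` is a hypothetical helper rather than part of D7.

```python
# Field -> UOSL group hint, transcribed from the table in this section.
UOSL_GROUP_HINTS = {
    "canonical_address": "G1",
    "section_type": "G2",
    "unit_kind": "G2",
    "parent_unit_id": "G3",
    "publication_membership": "G4",
    "risk_class": "G5",
    "authority": "G5",
    "semantic_role": "G6",
    "classification_labels": "G6",   # assumed reading of "labels"
    "candidate_edges": "Relation Layer / universal_edges",
}


def group_by_uosl(unit: dict) -> tuple[dict[str, list[str]], list[str]]:
    """Split a unit's field names into UOSL groups and unmapped gaps."""
    grouped: dict[str, list[str]] = {}
    unmapped: list[str] = []
    for name in unit:
        hint = UOSL_GROUP_HINTS.get(name)
        if hint is None:
            unmapped.append(name)    # a gap to record, not to guess
        else:
            grouped.setdefault(hint, []).append(name)
    return grouped, unmapped
```

Returning the unmapped names separately keeps gap recording explicit instead of silently dropping fields that have no group hint yet.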

4.7 No Mechanical Splitting; C1A Authority (P1)

Axis-2 metadata richness does NOT justify making units smaller for the sake of granularity. The C1A 3-question test remains authoritative.

4.8 Vocabulary Discipline (Đ24)

section_type, unit_kind, classification_labels, semantic_role, and lifecycle_stage_hint must all come from the Đ24 vocabulary. Gaps go to the backlog (D5); they are not filled by silent invention.
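A vocabulary check in this spirit might look like the sketch below. It assumes the Đ24 vocabulary is available as a mapping from field name to allowed values; the lifecycle values are the ones listed in §4.3, while the label and role values are placeholders and `vocabulary_gaps` is a hypothetical helper.

```python
VOCAB_FIELDS = (
    "section_type", "unit_kind", "classification_labels",
    "semantic_role", "lifecycle_stage_hint",
)


def vocabulary_gaps(unit: dict, vocabulary: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (field, value) pairs that are not in the controlled vocabulary.

    Gaps are returned for the backlog; nothing is added to the vocabulary here.
    """
    gaps: list[tuple[str, str]] = []
    for name in VOCAB_FIELDS:
        value = unit.get(name)
        if value is None:
            continue
        # classification_labels is multi-valued; the other fields are scalars.
        values = value if isinstance(value, (list, tuple)) else [value]
        allowed = vocabulary.get(name, set())
        gaps.extend((name, v) for v in values if v not in allowed)
    return gaps


vocabulary = {
    "lifecycle_stage_hint": {"law", "design", "code", "report", "runbook"},
    "classification_labels": {"label-a"},   # placeholder value
}
unit = {
    "lifecycle_stage_hint": "design",
    "classification_labels": ["label-a", "label-b"],
    "semantic_role": "role-x",              # placeholder value
}
backlog_items = vocabulary_gaps(unit, vocabulary)
```

In this example `label-b` and `role-x` are reported as gaps, since neither appears in the supplied vocabulary, while `design` passes.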

4.9 PG-Driven (P14)

All axis-1 and axis-2 fields are PG-persisted on tac_logical_unit (or in the JSON profile until a canonical column exists). Markdown is a mirror.

4.10 Diff Discipline

Field changes across manifest versions are diffable (D2 §4.11). Axis-1 changes (e.g., render_order) are usually structural; axis-2 changes (e.g., new label) are usually enrichments. The diff classifies the change for governance routing.
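The classification step can be sketched as a field-level diff that labels each change by axis. Because the text says axis-1 changes are "usually" structural and axis-2 changes "usually" enrichments, the labels below are a default heuristic rather than a rule, and `classify_diff` is a hypothetical helper.

```python
AXIS1_FIELDS = {
    "source_path", "source_revision", "source_span_start", "source_span_end",
    "render_order", "parent_unit_id", "canonical_address",
    "publication_membership", "body_source_policy",
}
AXIS2_FIELDS = {
    "section_type", "unit_kind", "classification_labels", "semantic_role",
    "candidate_edges", "edge_readiness_notes", "universal_edges_compat_flag",
    "vector_projection_readiness", "thread_hint", "lifecycle_stage_hint",
}


def classify_diff(old: dict, new: dict) -> dict[str, str]:
    """Label each changed field for governance routing (default heuristic)."""
    changes: dict[str, str] = {}
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) == new.get(name):
            continue
        if name in AXIS1_FIELDS:
            changes[name] = "structural"
        elif name in AXIS2_FIELDS:
            changes[name] = "enrichment"
        else:
            changes[name] = "unclassified"   # outside both axes; needs review
    return changes


old = {"render_order": 1, "classification_labels": ["a"]}
new = {"render_order": 2, "classification_labels": ["a", "b"]}
routing = classify_diff(old, new)
```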

5. PG Storage per Object (Design Intent — No DDL)

All axis-1 and axis-2 fields live on the existing tac_logical_unit and its JSON profile envelope, plus new fields that close the schema gaps in §6 for anything not currently present.

| Object | Target DB | Layer | Notes |
| --- | --- | --- | --- |
| tac_logical_unit (extended) | directus (TAC schema) | Kho | Existing; needs new fields per §6 |
| unit_classification_label | directus | Não | Đ24 label rows |
| candidate_edge | directus | Não | Prefer universal_edges with status='candidate'; else flag gap |
| semantic_role_dictionary | directus | Não | Đ24 vocabulary |
| lifecycle_stage_hint_dictionary | directus | Não | Đ24 vocabulary |

6. Schema Gaps

  1. canonical_address — first-class column on tac_logical_unit (current presence unclear).
  2. semantic_role — field; vocabulary placement per Đ24.
  3. classification_labels — multi-valued; current shape unclear (JSON?).
  4. candidate_edges — prefer universal_edges(status='candidate'); if not possible, flag.
  5. edge_readiness_notes — JSONB on unit or separate table.
  6. universal_edges_compat_flag — explicit indicator.
  7. vector_projection_readiness — hook for Qdrant; integration gap.
  8. thread_hint — pre-thread membership hint; new field.
  9. lifecycle_stage_hint — Đ24 vocabulary needed.
  10. render_order — guaranteed stable across cut cycles; current behavior unclear.
  11. Axis-2 verification policy — test query catalog and coverage thresholds.

7. Law References

| Surface | Law |
| --- | --- |
| Boundary rules | C1A |
| Vocabulary | Đ24 |
| Universal edges | Đ39 |
| UOSL compat | Đ44 |
| Birth gate | Đ0-G |
| PG placement | Đ33 / Đ43 |

8. Open Questions

  1. Should axis-2 verification block CUT in v0.1, or remain advisory? Recommendation: advisory; promote later via D4 intake.
  2. How are candidate_edges numerically scored at MARK time? Defer to a candidate-edge specification subdoc.
  3. Should classification_labels cardinality be capped? Defer to Đ24.

9. Coverage

Questions covered (primary): Q31, Q32, Q33. Questions covered (secondary): Q4, Q34.

Acceptance criteria covered:

  • 19 (KG feedback hooks)
  • 23 (axis-1 reconstruction)
  • 24 (axis-2 semantic assembly)
  • 25 (UOSL/Đ44 mapping — supporting D7)
  • 30 (metadata for cross-doc assembly)

Schema gaps: 11 named (see §6).

Law dependencies: C1A, Đ24, Đ33/Đ43, Đ39, Đ44, Đ0-G.

Open questions: 3 (see §8).

Law conflicts encountered: none.

knowledge/dev/laws/dieu44-trien-khai/design/dot-iu-cutter-v0.1-assembly-axes-metadata-contract-2026-05-15.md