KB-270B

dot-iu-cutter v0.5 WS-3 Cross-source Topic Assembly Logical Proof Design

Revision 1

Date: 2026-05-18
Phase: v0.5 WS-3, cross-source topic assembly logical proof
Nature: design_only_logical_proof (no schema, no table/view/function, no code, no dry-run, no production write, no Directus mutation, no edge creation, no APR approval)
Authority consumed (read-only, NOT redesigned; QG1): WS-1 package (6 files) + WS-2 package (4 files) + GPT WS-2 review + P11D + P44-4A relation-edge-conformance

Plain-language framing (QG7). "Topic assembly" means: given one subject (here, how a collection is structured), pull every relevant unit of knowledge from the different kinds of source it lives in (the internal architecture constitution, an internal law, a process, the live database, the code, a report, a lesson), put the units in a sensible order, label the role each one plays (rule vs proof vs running reality), and emit one coherent thread, without copying the underlying systems into the knowledge base and without inventing new graph edges. This file proves that the WS-1 + WS-2 logical contracts are sufficient to describe that operation on paper. It proves nothing physical.


0. Scope guard (read first)

ws3_proves: cross_source_topic_thread / topic_thread   # for ONE sample topic
ws3_does_NOT_prove: compliance_matrix (full)           # explicitly deferred — see gap/readiness file §C, QG9
sample_topic:
  topic_code: collection_structure        # snake_case, immutable (P11D §3.1)
  topic_name: "cấu trúc collection"       # Vietnamese for "collection structure"
  topic_namespace: tac.structure          # logical placeholder namespace (P11D §3.1 style)
  topic_status: active                    # logical sample only
forbidden_respected: true
self_advance: PROHIBITED
all_addresses_in_this_file: LOGICAL PLACEHOLDER (QG6) — not extracted from production IU rows

1. The assembly_profile being proved

We instantiate one WS-1 assembly_profile of profile_type = cross_source_topic_thread (WS-1 brief G4 sketch). Field names are taken verbatim from the WS-1 G1 logical contract; values below are the logical proof instance. This is a logical contract instance, not a row, not a table.

assembly_profile:                       # LOGICAL INSTANCE — WS-1 G1 contract, not persisted
  profile_id: "ap.cross_source_topic_thread.collection_structure.LOGICAL"   # placeholder identity
  profile_type: cross_source_topic_thread          # resolve via registry (no enum hardcode)

  input_selector:
    by: topic_code
    topic_code: collection_structure
    # logical meaning: seed set = IUs carrying content_profile.topic_labels.topic_code == collection_structure
    # (Optional Enrichment per Đ44 §9.1 + P44-3 INV-P4 — topic label does NOT block birth gate)

  ordering_rule:
    kind: internal_architecture          # WS-1 G4 ordering_rule enum value
    authority_aware: true                # G4: ordering is authority-aware, NOT flat
    tier_order:                          # logical render tiering for THIS thread
      - normative_authority              # rules first (constitution → law → process)
      - implementation_authority         # then running reality (sql_entity / code_artifact)
      - evidence_authority               # then proof / retrospective (report → lesson)
    within_tier: source_numbering_ordinal_then_canonical_address  # legible source order

  inclusion_rule:
    - include IU if topic_code == collection_structure (core)
    - include IU reached via relation_filter during ENRICH (extended set)
    - for normative families: honor source_family.status_policy (enacted_only / draft_flagged)

  exclusion_rule:
    - exclude IU with lifecycle in {retired}
    - exclude IU superseded by an active successor (supersedes edge → keep successor only)
    - exclude raw↔raw adjacency that has no IU anchor (out of Fabric scope — see §4)

  relation_filter:                       # edge whitelist for ENRICH — see §3 + §4 for mechanism
    iu_to_iu_universal_edges:
      core:       [references, depends_on, compatible_with, contradicts, implements, supersedes]   # P11D §5.3.1 whitelist
      candidate:  [governed_by, derived_from]   # P44-4A §3.2 CANDIDATE — pending v1.0 ratification (dependency, see §6)
      apr_gated:  [evidenced_by]                # NOT in active filter — APR-gated, NOT approved (QG5/QG10, see §5)
    contains_excluded: true              # composition (Cap-5), not a Cap-4 relation (P11D §5.3.1)
    uses_optional_deferred: true         # P11D §5.3.1 ⚠ — defer to Pilot benchmark (P11D-ζ)

  source_family_filter:                  # GPT-mandated chain families for WS-3
    - internal_incomex_constitution
    - internal_incomex_law
    - internal_process
    - sql_entity
    - code_artifact
    - report
    - lesson
    # (architecture_note / external_government_law in the 9-seed list but out of THIS sample chain)

  provenance_policy:
    level: full_trace                    # P11D §7 + Q-T5
    require: [creator, method, timestamp, source_context]   # mandatory post-Đ44
    record_override_provenance: true     # WS-2 D4 authority_override.provenance must be carried (see §7)

  render_template: "pointer-only (Directus/Nuxt) — NOT authority (WS-1 G5)"   # logical pointer, not designed here
  output_format: "logical TopicThread shape (P11D §6.2) — not an API/UI contract"
  lifecycle: proposed                    # logical state of this profile instance
  created_by: "WS-3 logical proof (design-only)"
  created_at: "2026-05-18 (logical)"

QG2 note: every block in this file is YAML / pseudo-query. There is no executable SQL anywhere.


2. Engine = generic executor over P11D §6.1 pipeline (OD-FA8)

Per WS-1 OD-FA8 (rec. (a) generic executor, defer_WS3 for code — no code now), the cross_source_topic_thread profile is not a bespoke pipeline. It is the single P11D §6.1 pipeline parameterised by the profile above, run over the union of:

  1. universal_edges rows where both endpoints are governed Objects/IUs (OQC ≥ 3/4) — §4 mechanism A;
  2. iu_entity_binding rows where one side is a raw_entity — §4 mechanism B;
  3. assembly-local relations computed at assembly time, output-only, not persisted, no APR — §4 mechanism C.

P11D §5.3.1 edge whitelist is respected for the ENRICH step.


3. Logical pipeline run — GATHER → ENRICH → GROUP → QUALITY → PROVENANCE

Stage names and order are taken verbatim from P11D §6.1. Each stage is expressed as a pseudo-query (logical, not executable — QG2).

Step 1 — GATHER (P11D Q-T1: Topic → core unit set)

pseudo-Q-T1:
  SELECT iu  WHERE iu.content_profile.topic_labels[*].topic_code == 'collection_structure'
  → CORE_SET

Logical result (placeholders — see sample-chain file for the full chain):

CORE_SET (logical placeholder):
  - N1  source_family=internal_incomex_constitution   addr=ICX-CONST/NT-7-COLLECTION-ARCH
  - N2  source_family=internal_incomex_law             addr=ICX-LAW/DIEU-12-COLLECTION-STRUCTURE
  - N3  source_family=internal_process                 addr=ICX-PROC/QT-COLLECTION-PROVISION-S3
  - N6  source_family=report                           addr=RPT-COLLSTRUCT-2026/SEC-FINDINGS
  - N7  source_family=lesson                            addr=LSN-COLLSTRUCT-2026/L-1
  # N4 (sql_entity) and N5 (code_artifact) are raw_entity → NOT IU → enter via ENRICH binding, not GATHER

Plain terms: GATHER only collects things that are themselves governed units of knowledge (IUs). The live database schema and the code file are not IUs — they enter the thread later through a binding, not here.
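
As an illustration only (this is not the engine, and it touches no schema or production data), the GATHER selection can be sketched in Python. The `IU` class, its field names, and the non-matching unit `X9` are hypothetical; the two addresses are the logical placeholders from CORE_SET above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IU:
    """Hypothetical in-memory stand-in for a governed unit of knowledge."""
    node_id: str
    source_family: str
    address: str
    topic_codes: tuple = ()


def gather(units, topic_code):
    """Return the core set: every IU whose topic labels include topic_code."""
    return [u for u in units if topic_code in u.topic_codes]


units = [
    IU("N1", "internal_incomex_constitution",
       "ICX-CONST/NT-7-COLLECTION-ARCH", ("collection_structure",)),
    IU("N2", "internal_incomex_law",
       "ICX-LAW/DIEU-12-COLLECTION-STRUCTURE", ("collection_structure",)),
    # Hypothetical unit on another topic; it must not be gathered.
    IU("X9", "report", "RPT-OTHER/SEC-1", ("unrelated_topic",)),
]

core_set = gather(units, "collection_structure")
print([u.node_id for u in core_set])
```

Raw entities such as N4 and N5 never appear in `units` because they are not IUs; they would only be attached later, by reference.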

Step 2 — ENRICH (P11D Q-T3: core unit set → related units and bound raw-entity refs)

pseudo-Q-T3:
  FOR each iu in CORE_SET:
    FOLLOW universal_edges WHERE edge_type IN relation_filter.iu_to_iu_universal_edges.core
                                            ∪ candidate(governed_by, derived_from)   # candidate-gated, §6
    ANNOTATE each reached unit with relation_kind
    EXCLUDE contains (composition); INCLUDE contradicts WITH opposing annotation (P11D §5.3.1, GPT Q-D2)
  FOR each iu in CORE_SET:
    FOLLOW iu_entity_binding  → attach raw_entity refs (N4 sql_entity, N5 code_artifact) by reference only
  COMPUTE assembly-local relations (output-only adjacency, NOT persisted, NO APR)
  → EXTENDED_SET = CORE_SET ∪ related-IUs ∪ bound-raw-entity-refs, each annotated with {relation_kind, mechanism}

Edges traversed in this sample (mechanism column is mandatory — QG4; full per-relation table is in the sample-chain file §4):

| From | To | relation_kind | Mechanism | Status |
|---|---|---|---|---|
| N2 law | N1 constitution | governed_by | universal_edges (IU↔IU) | CANDIDATE edge, dependency-gated (§6) |
| N3 process | N2 law | implements | universal_edges (IU↔IU, core) | active core edge |
| N3 process | N4 sql_entity (raw) | binding_kind=implements | iu_entity_binding | active (raw side, §4-B) |
| N3 process | N5 code_artifact (raw) | binding_kind=implements | iu_entity_binding | active (raw side, §4-B) |
| N4 sql_entity | N5 code_artifact | (adjacency in thread) | assembly-local (output-only, not persisted) | no APR (§4-C) |
| N6 report | N2/N3 (requirement) | evidence-of | APR-gated evidenced_by OR alternative | NOT approved (§5) |
| N7 lesson | N6 report | derived_from (lesson_from) | universal_edges (IU↔IU) | CANDIDATE edge, dependency-gated (§6) |
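
The whitelist traversal described above can be sketched as follows. This is a minimal illustration, not the executor: the tuple layouts standing in for `universal_edges` and `iu_entity_binding` rows, the `allow_candidates` switch, and the `contains` row to a node `N8` are all assumptions made for the example.

```python
CORE_EDGES = {"references", "depends_on", "compatible_with",
              "contradicts", "implements", "supersedes"}
CANDIDATE_EDGES = {"governed_by", "derived_from"}

# (from, to, edge_type): placeholder rows, not production data.
universal_edges = [
    ("N2", "N1", "governed_by"),
    ("N3", "N2", "implements"),
    ("N7", "N6", "derived_from"),
    ("N3", "N8", "contains"),  # hypothetical composition row; must be skipped
]
# (iu, raw_entity, binding_kind): placeholder rows, not production data.
bindings = [("N3", "N4", "implements"), ("N3", "N5", "implements")]


def enrich(core_ids, edges, bindings, allow_candidates=True):
    """Follow whitelisted edges and bindings; annotate each hit with its mechanism."""
    allowed = CORE_EDGES | (CANDIDATE_EDGES if allow_candidates else set())
    annotated = []
    for src, dst, edge_type in edges:
        if src in core_ids and edge_type in allowed:
            annotated.append((src, dst, edge_type, "universal_edges"))
    for iu, raw, kind in bindings:
        if iu in core_ids:
            # Raw entities are attached by reference only, never copied.
            annotated.append((iu, raw, kind, "iu_entity_binding"))
    return annotated


core_ids = {"N1", "N2", "N3", "N6", "N7"}
for row in enrich(core_ids, universal_edges, bindings):
    print(row)
```

Calling `enrich(..., allow_candidates=False)` drops the two candidate relations, which mirrors the situation where governed_by and derived_from are never activated.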

Step 3 — GROUP (P11D Q-T2: group by publication, then by authority tier)

pseudo-Q-T2:
  GROUP EXTENDED_SET BY publication (D28 / D32 / D35 logical)   # P11D Q-T2 native grouping
  THEN re-tier BY authority_semantics per ordering_rule.tier_order:
       normative_authority → implementation_authority → evidence_authority
  AGGREGATE per group: count, lifecycle mix, source_family mix

Plain terms: first group the way P11D already groups (by publication), then arrange those groups so the rules come first, the running reality next, and the proof last — because for a cross-source thread the reader wants "what must be true → what is actually built → what proves it".
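
The re-tiering step can be sketched as a sort. This is illustrative only: it skips the publication grouping, and the `ordinal` values and the `raw-ref/...` addresses for N4 and N5 are hypothetical placeholders invented so that the sort has something to compare.

```python
TIER_ORDER = ("normative_authority", "implementation_authority", "evidence_authority")


def order_thread(nodes):
    """Sort by authority tier, then source ordinal, then canonical address."""
    rank = {tier: i for i, tier in enumerate(TIER_ORDER)}
    return sorted(nodes, key=lambda n: (rank[n["tier"]], n["ordinal"], n["address"]))


nodes = [
    {"id": "N7", "tier": "evidence_authority", "ordinal": 2,
     "address": "LSN-COLLSTRUCT-2026/L-1"},
    {"id": "N4", "tier": "implementation_authority", "ordinal": 1,
     "address": "raw-ref/N4"},
    {"id": "N2", "tier": "normative_authority", "ordinal": 2,
     "address": "ICX-LAW/DIEU-12-COLLECTION-STRUCTURE"},
    {"id": "N6", "tier": "evidence_authority", "ordinal": 1,
     "address": "RPT-COLLSTRUCT-2026/SEC-FINDINGS"},
    {"id": "N1", "tier": "normative_authority", "ordinal": 1,
     "address": "ICX-CONST/NT-7-COLLECTION-ARCH"},
    {"id": "N3", "tier": "normative_authority", "ordinal": 3,
     "address": "ICX-PROC/QT-COLLECTION-PROVISION-S3"},
    {"id": "N5", "tier": "implementation_authority", "ordinal": 2,
     "address": "raw-ref/N5"},
]

print([n["id"] for n in order_thread(nodes)])
```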

Step 4 — QUALITY (P11D Q-T4: flag missing metadata / low confidence)

pseudo-Q-T4:
  FLAG iu WHERE topic_confidence < threshold (P11D-γ deferred — threshold NOT decided here)
  FLAG iu WHERE missing mandatory provenance (post-Đ44)
  FLAG node WHERE authority_semantics ambiguous AND no authority_override present (→ mixed-authority candidate)
  SPLIT needs_review SUBSET (does not drop nodes; marks them)

Note: a node flagged "authority ambiguous" is exactly the mixed-authority case handled by WS-2 D4 authority_override (proved in the sample-chain file via N2, QG8).
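
A minimal sketch of the flag-and-split behaviour, assuming nodes are plain dictionaries. The key names (`topic_confidence`, `authority_ambiguous`, and so on) and the sample nodes are hypothetical, and the threshold is passed in as a parameter because its value is deliberately not decided here.

```python
REQUIRED_PROVENANCE = ("creator", "method", "timestamp", "source_context")


def quality_flags(node, threshold):
    """Return the list of quality flags for one node (empty list = clean)."""
    flags = []
    confidence = node.get("topic_confidence")
    if confidence is not None and confidence < threshold:
        flags.append("low_topic_confidence")
    provenance = node.get("provenance", {})
    if any(not provenance.get(key) for key in REQUIRED_PROVENANCE):
        flags.append("missing_provenance")
    if node.get("authority_ambiguous") and not node.get("authority_override"):
        flags.append("mixed_authority_candidate")
    return flags


def split_needs_review(nodes, threshold):
    """Mark every node with its flags; nodes are never dropped."""
    marked = [dict(n, flags=quality_flags(n, threshold)) for n in nodes]
    needs_review = [n for n in marked if n["flags"]]
    return marked, needs_review


full_prov = {"creator": "c", "method": "m", "timestamp": "t", "source_context": "s"}
sample = [
    {"id": "A", "topic_confidence": 0.9, "provenance": full_prov},
    {"id": "B", "topic_confidence": 0.2, "provenance": {"creator": "c"}},
    {"id": "C", "topic_confidence": 0.9, "provenance": full_prov,
     "authority_ambiguous": True},
]
marked, needs_review = split_needs_review(sample, threshold=0.5)
print([(n["id"], n["flags"]) for n in marked])
```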

Step 5 — PROVENANCE (P11D Q-T5: attach trace)

pseudo-Q-T5:
  FOR each node: ATTACH provenance {creator, method, timestamp, source_context}
  FOR each authority_override: ATTACH override.provenance {set_by, set_at, reason, source}   # WS-2 D4
  FOR each raw_entity ref: ATTACH binding.provenance + entity_reference_registry.authority_note
  FOR each candidate/APR-gated relation: ATTACH gating note (candidate-not-activated / APR-not-approved)
  → Output: TopicThread object (logical shape, P11D §6.2)

4. Where each mechanism is used (QG4 — explicit)

Per the WS-1 binding-authority clarification note §2 decision table and the edge-APR-minimization note:

mechanism_A_universal_edges:
  when: BOTH endpoints are governed Objects/IUs (OQC >= 3/4, P44-4A §2.1)
  used_here_for:
    - N2 law  --governed_by-->  N1 constitution   # 'constrains' resolved to reuse governed_by (edge-APR note bucket 1)
    - N3 process --implements--> N2 law            # core edge, IU<->IU
    - N7 lesson --derived_from--> N6 report         # 'lesson_from' resolved to reuse derived_from + source_family=lesson
  note: governed_by & derived_from are P44-4A §3.2 CANDIDATE edges (pending v1.0) — see §6 dependency

mechanism_B_iu_entity_binding:
  when: ONE side is a raw_entity (not an IU; does NOT satisfy OQC; not in relation graph)
  used_here_for:
    - N3 process --binding_kind=implements--> N4 sql_entity (PG schema / Directus collection)
    - N3 process --binding_kind=implements--> N5 code_artifact (git file / code module)
  rule: bind by reference via entity_reference_registry (entity_ref_id, entity_kind, source_system,
        natural_key, authority_note) + provenance; read live from SSOT at assembly time; do NOT copy
        SQL/business/code data into IU text (Option D hybrid, OD-FA2)

mechanism_C_assembly_local:
  when: a relation that exists ONLY inside the assembly_profile output (relation_filter/annotation),
        computed at assembly time, NOT persisted as a universal_edges row
  used_here_for:
    - N4 sql_entity <-> N5 code_artifact adjacency in the rendered thread
      (raw<->raw is out of universal_edges scope; the "same-topic" adjacency is output-only grouping)
    - topic-grouping cohesion of the whole thread (the "thread" itself is an assembly-local construct)
  governance: NO APR, NO edge_type vocab change (edge-APR-minimization note)

Edge-minimization order respected: reuse > iu_entity_binding > assembly-local > new edge.
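
The endpoint test behind mechanisms A, B and C can be sketched as a small function. It is a simplification under stated assumptions: it looks only at whether each endpoint is an IU, ignores OQC scoring and the reuse-first check, and does not cover the case where an IU↔IU relation is deliberately kept output-only (as in §5). The function name and boolean parameters are hypothetical.

```python
def choose_mechanism(src_is_iu: bool, dst_is_iu: bool) -> str:
    """Pick the default mechanism from the kinds of the two endpoints."""
    if src_is_iu and dst_is_iu:
        return "universal_edges"      # mechanism A: both endpoints governed
    if src_is_iu or dst_is_iu:
        return "iu_entity_binding"    # mechanism B: exactly one raw side
    return "assembly_local"           # mechanism C: raw<->raw, output-only


print(choose_mechanism(True, True))    # N3 process -> N2 law
print(choose_mechanism(True, False))   # N3 process -> N4 sql_entity
print(choose_mechanism(False, False))  # N4 sql_entity <-> N5 code_artifact
```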


5. evidenced_by handling (QG5 + QG10 — APR-gated, NOT approved)

The report node (N6) carries an evidentiary relation to the normative/requirement nodes ("this report attests the collection structure requirement was satisfied"). The natural edge for this is evidenced_by.

evidenced_by_in_this_proof:
  status: APR-GATED
  approved: false                      # NOT approved by this file; APR reserved for GPT/User
  created: false                       # no edge type created
  appears_as: "logical example only"   # per GPT WS-2 review directive
  classification: P44-4A §3.3 extension, OD-FA4 bucket 3 = true new durable edge,
                  minimized set = { evidenced_by } (exactly 1, edge-APR-minimization note)
  governance_required_before_use: medium-level APR (Đ32) + amend edge_type vocab framework (NT4)
  reverse_type_if_ever_approved: evidences

alternative_paths_if_evidenced_by_NOT_available:   # QG10 requires stating these
  A_iu_entity_binding:
    when: the proof target is a raw evidence entity (verification artifact path / report file as raw)
    use: iu_entity_binding with binding_kind=evidences (+ entity_reference_registry)
    apr_needed: false
  B_assembly_local_relation:
    when: output-only "this unit is evidenced-by that report" annotation inside the thread
    use: assembly-local relation in the cross_source_topic_thread output (NOT persisted)
    apr_needed: false
  C_reuse_references_weakest:
    when: only a read-only mention is needed (NO attestation semantics)
    use: reuse core edge `references` — explicitly weaker; loses "proves performed" meaning
    apr_needed: false
  chosen_for_this_logical_proof: B (assembly-local) — keeps WS-3 runnable WITHOUT the new edge

Plain terms: we are not allowed to create or approve evidenced_by. So in this paper proof we show the evidence link as an output-only annotation (alternative B). If a durable, queryable "X is proven by Y" edge is ever truly needed, that is a separate APR decision for GPT/User — flagged, not taken here.


6. governed_by / derived_from candidate dependency (non-self-resolved)

The edge-minimization result reuses governed_by (for constrains, node N2→N1) and derived_from (for lesson_from, node N7→N6). Both are P44-4A §3.2 CANDIDATE edge types pending User v1.0 ratification.

dependency:
  governed_by: P44-4A §3.2 candidate (reverse `governs`), NOT yet activated
  derived_from: P44-4A §3.2 candidate (one-directional, no reverse type), NOT yet activated
  GPT_ruling: dependency_not_blocker_for_WS1_authority
  but: if candidates are NEVER activated, the edge-minimization resolutions
       (constrains→governed_by, lesson_from→derived_from) MUST be revisited
       BEFORE WS-3 execution (risk R-WS2-2)
  ws3_position: FLAGGED, not self-resolved. WS-3 is a logical proof; it does not
                require the candidate edges to be physically active to be valid on paper.
                If they stay inactive, the two relations fall back to assembly-local
                (output-only) representation — proof still holds, queryability reduced.

This is logged in the gap/readiness file as a non-blocker for the logical proof but a must_resolve_before_production item.
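
The fallback behaviour of §5 and §6 can be sketched together: a candidate or APR-gated relation is represented as an output-only, assembly-local annotation until it is activated or approved. The function name, the `activated` / `approved` parameters, and the returned note strings are assumptions for illustration; nothing here activates or approves an edge type.

```python
CORE = frozenset({"references", "depends_on", "compatible_with",
                  "contradicts", "implements", "supersedes"})
CANDIDATE = frozenset({"governed_by", "derived_from"})
APR_GATED = frozenset({"evidenced_by"})


def resolve_representation(edge_type, activated=frozenset(), approved=frozenset()):
    """Return (mechanism, gating_note) for a relation between two IUs."""
    if edge_type in CORE:
        return ("universal_edges", "active core edge")
    if edge_type in CANDIDATE:
        if edge_type in activated:
            return ("universal_edges", "candidate activated")
        return ("assembly_local", "candidate-not-activated")
    if edge_type in APR_GATED:
        if edge_type in approved:
            return ("universal_edges", "APR approved")
        return ("assembly_local", "APR-not-approved")
    raise ValueError(f"edge type not in whitelist: {edge_type}")


for edge in ("implements", "governed_by", "evidenced_by"):
    print(edge, resolve_representation(edge))
```

With the default empty `activated` and `approved` sets, the thread can still be assembled; only the durability and queryability of the gated relations are reduced.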


7. Authority labelling per node (WS-1 §4 binding rule)

WS-1 binding-authority §4: assembly output MUST label EACH included unit as exactly one of normative_authority | evidence_authority | implementation_authority for that assembly context, default from source_family registry, replaced by unit/span authority_override (WS-2 D4) when present.

default_authority_from_source_family (WS-2 D2 seed):
  internal_incomex_constitution -> normative_authority
  internal_incomex_law          -> normative_authority
  internal_process              -> normative_authority
  sql_entity                    -> implementation_authority
  code_artifact                 -> implementation_authority
  report                        -> evidence_authority
  lesson                        -> evidence_authority

override_applied_in_sample (QG8): YES — node N2 (internal_incomex_law) carries a
  span-level authority_override (law body = normative_authority, explanatory
  appendix span = evidence_authority). Full mechanics in the sample-chain file §5.
  Span-level mechanism is a LOGICAL example; detailed span mechanics remain
  deferred to a real mixed-authority pilot (WS-2 D4).
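
The default-then-override rule can be sketched as a lookup. The mapping is the seed table above; the function name and the way an override is passed in are hypothetical, and the span-level mechanics are reduced to a single optional argument.

```python
AUTHORITIES = {"normative_authority", "evidence_authority", "implementation_authority"}

DEFAULT_AUTHORITY = {
    "internal_incomex_constitution": "normative_authority",
    "internal_incomex_law": "normative_authority",
    "internal_process": "normative_authority",
    "sql_entity": "implementation_authority",
    "code_artifact": "implementation_authority",
    "report": "evidence_authority",
    "lesson": "evidence_authority",
}


def label_authority(source_family, override=None):
    """Return exactly one authority label: the override if present, else the default."""
    label = override if override is not None else DEFAULT_AUTHORITY[source_family]
    if label not in AUTHORITIES:
        raise ValueError(f"unknown authority label: {label}")
    return label


# N2 law body keeps the family default; its explanatory appendix span is overridden.
print(label_authority("internal_incomex_law"))
print(label_authority("internal_incomex_law", override="evidence_authority"))
```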

8. What this proof establishes / does not establish

established_on_paper:
  - WS-1 assembly_profile (cross_source_topic_thread) field set is SUFFICIENT to express
    a cross-source thread for one topic.
  - WS-2 source_family registry + authority_semantics default + D4 override is SUFFICIENT
    to label every node's role, including a mixed-authority node.
  - The 3 mechanisms (universal_edges / iu_entity_binding / assembly-local) cover every
    relation in the chain WITHOUT creating an edge type and WITHOUT approving evidenced_by.
  - P11D §6.1 pipeline (GATHER→ENRICH→GROUP→QUALITY→PROVENANCE) runs end-to-end logically
    over the union, with P11D §5.3.1 whitelist respected.
  - Canonical address scheme <DOCPREFIX>/<L1>-<L2>-...-<Lk> is expressible for every IU node.

NOT established (out of WS-3 scope — see gap/readiness file):
  - compliance_matrix full logical proof  (DEFERRED, QG9)
  - any physical schema / table / view / index / code
  - evidenced_by as an approved/queryable edge
  - candidate edge (governed_by/derived_from) physical activation

9. Forbidden respected / next

design_only: true
schema_migration: none
table_view_function_index: none
production_write: none
code_change: none
directus_mutation: none
edge_type_created: none
evidenced_by_approved: false
dry_run: none
git_commit: none
self_advance: PROHIBITED
next: STOP after 4-file upload → route to GPT/User review