KB-1FA6 rev 7

P3D Pack 1 Phase 5B — Hybrid Nesting + Species + Pilot Migration Design (rev2)


Date: 2026-05-11
Rev2: 2026-05-11 (GPT partial-accept + corrections applied)
Author: Opus 4.7 (drafter + critic)
Status: DESIGN, GPT ACCEPTED WITH CORRECTIONS (rev2). No migration, no seed, no backfill, no Agent dispatch.
Strategy: D3 HYBRID (GPT LOCKED)
Pilot: DIEU-35 (GPT LOCKED)
Phase 5C: SPLIT into 5C1 (species/backfill) + 5C2 (DIEU-35 pilot migration) per GPT directive
Mode: DESIGN + EXECUTION-PROMPT-DRAFT ONLY
GPT review: reviews/gpt-review-p3d-pack1-phase5b-design-partial-accept-prompt-draft-not-approved-2026-05-11.md
GPT directive: directives/gpt-directive-opus-p3d-pack1-phase5b-rev2-split-prompts-2026-05-11.md
Rev1 prompt: NOT APPROVED (it merged everything into a single dispatch, used hardcoded numeric gates, and let illustrative SQL become hardcode). Rev2 splits it into 5C1 + 5C2 DRAFT prompts.


GPT-LOCKED DECISIONS (rev2)

These are no longer open questions. They are design constraints for all downstream artifacts.

nesting_strategy                     = D3_HYBRID
hierarchy_carrier_primary            = identity_profile_json (D3a)
hierarchy_carrier_secondary          = universal_edges_deferred_to_post_pilot (D3b)
d3c_parent_as_pointer                = REJECTED (indistinguishable from D1 at DB level)
parent_or_container_ref_for_pilot    = NULL
composition_for_pilot                = atom
unit_version_counts_as_containment   = false (convention: subordinates ≠ contained entities per Điều 0-B)
pilot_publication                    = DIEU-35
collection_governance_role_change    = defer_keep_observed
uv_species_mapping                   = no_for_now
universal_edges_enrichment           = defer_to_post_pilot
iu_publication_member_table          = defer_to_post_pilot
rollback_capture                     = KB_report_artifact_plus_VPS_log_no_new_control_table
phase_5c_structure                   = SPLIT_5C1_species_backfill_THEN_5C2_pilot_migration

STILL UNRESOLVED (placeholders for GPT/User)

species_exact_code                   = <PENDING GPT/USER — Opus proposal: 'information_unit_atom'>
species_entity_code                  = <PENDING GPT/USER — Opus proposal: 'SPE-IUA'>
species_display_name                 = <PENDING GPT/USER — Vietnamese>
species_prefix                       = <PENDING GPT/USER>
species_depth_in_taxonomy            = <PENDING GPT/USER — Opus proposal: depth=1 under infrastructure root>
species_parent_id                    = <PENDING GPT/USER>
species_kg_metadata                  = <PENDING GPT/USER>
publication_authority_ref_value      = <PENDING GPT/USER — birth-gate-required key; blocks 5C2 only, not 5C1>

A. Executive summary

Phase 5A dry-run rev6 closed with full evidence: 86 TAC logical units across 3 publications (DIEU-28=27, DIEU-32=23, DIEU-35=36), nesting depth 2, contiguous render_order, 0 address collisions, confirmed hash-recipe divergence, and satisfiable birth gates. There are 40 live species, but none of them semantically matches law_unit.

GPT has locked D3 HYBRID: IU rows stay atomic, the TAC parent-child hierarchy is carried in identity_profile JSON, parent_or_container_ref is NULL, composition=atom, and governance_role stays observed.

Rev2 key change: Phase 5C execution is SPLIT into two separate DRAFT prompts per GPT directive:

  • 5C1: QT-005 species/mapping prep + QT-001 backfill (no TAC migration)
  • 5C2: DIEU-35 pilot migration (requires 5C1 completed and accepted)

Rev1 prompt was NOT APPROVED because it: (a) merged all steps into one dispatch; (b) used hardcoded numeric gates from Phase 5A evidence (36, 12, 35); (c) contained SQL that could become execution hardcode; (d) left species identity and publication_authority_ref unresolved.


B. Accepted evidence from Phase 5A

(Unchanged from rev1 — all numbers from reports/p3d-pack1-phase5-tac-to-iu-migration-dryrun-report.md rev6 PASS.)

B.1 Source/target inventory

TAC logical units    : 86    (Phase 5A evidence — NOT an executable gate)
TAC unit versions    : 86
TAC publications     : 3
Current IU rows      : 12    (Phase 5A evidence — NOT an executable gate)
Current UV rows      : 19
Address collisions   : 0

B.2 Per-publication metrics (Phase 5A evidence — NOT executable gates)

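| doc_code | version | members | section_types | depth | render_order |
|----------|---------|---------|---------------|-------|--------------|
| DIEU-28  | v2.0    | 27      | 7             | 2     | [0..26]      |
| DIEU-32  | v1.1    | 23      | 8             | 2     | [0..22]      |
| DIEU-35  | v5.2    | 36      | 12            | 2     | [0..35]      |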

All numbers above are Phase 5A evidence for reference. Executable prompts MUST re-derive counts live.

B.3–B.6

(Unchanged from rev1: nesting facts, hash/provenance, species landscape, birth gate readiness. Refer to rev1 §B.3–B.6.)


C. Why D3 hybrid — GPT LOCKED

(Reasoning unchanged from rev1 §C. Five GPT reasons + Opus exit-ramp rationale all accepted.)


D. D3 hybrid data model — GPT LOCKED

D.1 Sub-options (decision CLOSED)

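| Sub-option                            | Status                           |
|---------------------------------------|----------------------------------|
| D3a (identity_profile JSON)           | GPT LOCKED as primary            |
| D3b (universal_edges)                 | GPT LOCKED as deferred post-pilot |
| D3c (parent_or_container_ref pointer) | GPT REJECTED                     |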

D.2 Identity_profile JSON keys (D3a hierarchy carrier)

Migration populates these keys. Format (column names are illustrative — Agent resolves live via registry):

-- PATTERN ONLY — Agent resolves all column names via semantic field registry before execution

information_unit.identity_profile = {
  // Birth-gate-required keys (existing):
  "title"                          : <from source version title>,
  "owner_lookup_ref"               : <from source owner field>,
  "primary_section_type_ref"       : <vocab.section_type.{live section_type value}>,
  "publication_authority_ref"      : <PENDING GPT/USER — hard blocker for 5C2>,
  "publication_type_ref"           : <vocab.publication_type.{live publication_type value}>,
  // D3a hierarchy carrier keys (Phase 5B addition):
  "tac_parent_anchor"              : <source parent_id as text, or null for roots>,
  "tac_parent_canonical_address"   : <parent's canonical_address via lookup, or null>,
  "publication_anchor"             : <source doc_code>,
  "publication_render_order"       : <integer from source render_order>,
  "tac_sort_order"                 : <integer from source sort_order>
}
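The key layout above can be sketched in Python as a plain dictionary builder plus a birth-gate check. This is only an illustration under stated assumptions: the source field names (`title`, `owner`, `section_type`, `publication_type`, `parent_id`, `doc_code`, `render_order`, `sort_order`) and the sample values are hypothetical, and the real names are whatever the Agent resolves through the semantic field registry. The `publication_authority_ref` value is still pending, so the sketch passes a placeholder.

```python
from typing import Any, Optional

# The five birth-gate-required keys named in the pattern above.
BIRTH_GATE_KEYS = (
    "title",
    "owner_lookup_ref",
    "primary_section_type_ref",
    "publication_authority_ref",
    "publication_type_ref",
)


def build_identity_profile(
    source: dict[str, Any],
    parent_address: Optional[str],
    publication_authority_ref: str,
) -> dict[str, Any]:
    """Assemble one identity_profile dict from a source row (hypothetical field names)."""
    parent_id = source.get("parent_id")
    return {
        # Birth-gate-required keys
        "title": source["title"],
        "owner_lookup_ref": source["owner"],
        "primary_section_type_ref": f"vocab.section_type.{source['section_type']}",
        "publication_authority_ref": publication_authority_ref,
        "publication_type_ref": f"vocab.publication_type.{source['publication_type']}",
        # D3a hierarchy carrier keys
        "tac_parent_anchor": None if parent_id is None else str(parent_id),
        "tac_parent_canonical_address": parent_address,
        "publication_anchor": source["doc_code"],
        "publication_render_order": int(source["render_order"]),
        "tac_sort_order": int(source["sort_order"]),
    }


def missing_birth_gate_keys(profile: dict[str, Any]) -> list[str]:
    """Return the birth-gate keys that are absent or empty in the profile."""
    return [key for key in BIRTH_GATE_KEYS if profile.get(key) in (None, "")]


# Made-up sample row for a root unit (no parent).
sample_row = {
    "title": "Sample title",
    "owner": "sample-owner",
    "section_type": "article",
    "publication_type": "law",
    "parent_id": None,
    "doc_code": "DIEU-35",
    "render_order": 0,
    "sort_order": 0,
}
sample_profile = build_identity_profile(sample_row, None, "<PENDING>")
```

For a root row both parent keys stay `None`, which serialises to JSON `null` as the pattern requires.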

D.3 parent_or_container_ref = NULL (GPT LOCKED)

All migrated IU rows have NULL parent_or_container_ref under D3. Hierarchy is in JSON only.

D.4 Render fidelity model

Nuxt Laws Page reconstructs tree from identity_profile JSON keys via recursive CTE on tac_parent_canonical_address. Verifiable in Phase 5C2 render-fidelity test.
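The reconstruction can be illustrated with an in-memory Python analogue of the recursive walk. This is not the page's actual query: it assumes the profiles have already been loaded into a dictionary keyed by canonical address, and the addresses in the sample are made up.

```python
from collections import defaultdict
from typing import Any, Iterator, Optional


def build_children_index(profiles: dict[str, dict[str, Any]]) -> dict[Optional[str], list[str]]:
    """Map each parent address (None for roots) to its children, sorted by render order."""
    children: dict[Optional[str], list[str]] = defaultdict(list)
    for address, profile in profiles.items():
        children[profile["tac_parent_canonical_address"]].append(address)
    for siblings in children.values():
        siblings.sort(key=lambda a: profiles[a]["publication_render_order"])
    return dict(children)


def walk_tree(
    children: dict[Optional[str], list[str]],
    parent: Optional[str] = None,
    depth: int = 1,
) -> Iterator[tuple[int, str]]:
    """Yield (depth, address) depth-first, starting from the roots."""
    for address in children.get(parent, []):
        yield depth, address
        yield from walk_tree(children, address, depth + 1)


# Made-up addresses: two roots, the first with two children.
sample_profiles = {
    "A.2": {"tac_parent_canonical_address": "A", "publication_render_order": 2},
    "B": {"tac_parent_canonical_address": None, "publication_render_order": 3},
    "A": {"tac_parent_canonical_address": None, "publication_render_order": 0},
    "A.1": {"tac_parent_canonical_address": "A", "publication_render_order": 1},
}
sample_tree = list(walk_tree(build_children_index(sample_profiles)))
```

Sorting siblings by `publication_render_order` means the depth-first walk should reproduce the flat render order whenever the stored order was itself assigned depth-first.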


E. Species/composition — GPT LOCKED (atom) + species identity PENDING

E.1 Composition = atom (LOCKED)

GPT accepts: UV/subordinates do not count as contained entities per Điều 0-B convention. All migrated IU rows have composition=atom under D3.

E.2 Species approach = new species (LOCKED direction, identity PENDING)

GPT accepts that a new species is likely needed. Opus recommends E.2.A (a single species covering all IU unit_kinds). The exact species code, display name, prefix, and taxonomy placement remain UNRESOLVED placeholders. Species creation must follow QT-005 with live schema introspection, with no hardcoded INSERT columns.

E.3 governance_role = keep observed (LOCKED)

No QT-005 promotion step during pilot. Promotion deferred to Phase 5E or later.


F. QT-005 species creation — 5C1 scope

QT-005 procedural steps (from Phase 4B addendum §F) execute in Phase 5C1. Species creation uses live schema introspection of entity_species table to discover required columns, constraints, and taxonomy tree. No hardcoded INSERT column list — Agent introspects information_schema.columns WHERE table_name='entity_species' and builds INSERT dynamically.

Species identity values come from GPT/User resolution of §STILL-UNRESOLVED placeholders.

Reversibility: DELETE by captured id.
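A minimal sketch of the "introspect, then build the INSERT" idea follows. It assumes the Agent has already fetched `column_name`, `is_nullable`, and `column_default` from `information_schema.columns`, that the table exposes an `id` column to capture, and that the database driver accepts psycopg-style named placeholders. The column list and values in the sample are hypothetical and the species code shown is the still-pending Opus proposal.

```python
from typing import Any


def build_insert(
    table: str,
    columns: list[dict[str, Any]],
    values: dict[str, Any],
) -> tuple[str, dict[str, Any]]:
    """Build a parameterised INSERT from introspected column metadata.

    Raises ValueError when a supplied key is not a real column, or when a
    NOT NULL column without a default has no supplied value.
    """
    known = [c["column_name"] for c in columns]
    unknown = sorted(set(values) - set(known))
    if unknown:
        raise ValueError(f"values for unknown columns: {unknown}")
    required = [
        c["column_name"]
        for c in columns
        if c["is_nullable"] == "NO" and c["column_default"] is None
    ]
    missing = [name for name in required if name not in values]
    if missing:
        raise ValueError(f"required columns without a value: {missing}")
    # Identifiers come from the catalog, not from user input; a real run would
    # still quote them with the driver's identifier helper.
    names = [name for name in known if name in values]
    column_list = ", ".join(names)
    placeholders = ", ".join("%(" + name + ")s" for name in names)
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders}) RETURNING id"
    return sql, {name: values[name] for name in names}


# Hypothetical introspection result; the live table may look different.
sample_columns = [
    {"column_name": "id", "is_nullable": "NO", "column_default": "nextval('seq')"},
    {"column_name": "species_code", "is_nullable": "NO", "column_default": None},
    {"column_name": "display_name", "is_nullable": "NO", "column_default": None},
    {"column_name": "depth", "is_nullable": "YES", "column_default": None},
]
sample_sql, sample_params = build_insert(
    "entity_species",
    sample_columns,
    {"species_code": "information_unit_atom", "display_name": "<PENDING>"},
)
```

The `RETURNING id` clause is what makes the "DELETE by captured id" reversal possible: the Agent records the returned id and deletes exactly that row if the step has to be undone.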


G. QT-001 backfill — 5C1 scope

Backfill targets existing IU birth_registry rows where species_code IS NULL. The target count is LIVE-DERIVED (not hardcoded "12"). The backfill follows the QT-001 5-step procedure with exact-key capture, and rollback uses the captured keys only.

Sequence within 5C1:

  1. Species seed (§F)
  2. Verify mapping exists
  3. Count backfill targets live
  4. Execute backfill with RETURNING capture
  5. Verify 0 NULL remaining
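Steps 3 to 5 of the sequence above, together with the captured-key rollback, can be sketched against an in-memory list standing in for birth_registry. The row shape (`id`, `species_code`) is an assumption made for illustration, and a real run would rely on the database's RETURNING clause rather than a Python list.

```python
from typing import Any


def backfill_species(registry: list[dict[str, Any]], species_code: str) -> list[int]:
    """Fill NULL species_code values and return the ids that were touched."""
    targets = [row for row in registry if row["species_code"] is None]
    target_count = len(targets)  # step 3: counted live, never hardcoded
    captured: list[int] = []
    for row in targets:  # step 4: update and capture the exact keys
        row["species_code"] = species_code
        captured.append(row["id"])
    remaining = sum(1 for row in registry if row["species_code"] is None)
    if len(captured) != target_count or remaining != 0:  # step 5
        raise RuntimeError("backfill verification failed")
    return captured


def rollback_backfill(registry: list[dict[str, Any]], captured: list[int]) -> int:
    """Reset species_code only on rows whose id was captured; return rows reset."""
    keys = set(captured)
    reset = 0
    for row in registry:
        if row["id"] in keys:
            row["species_code"] = None
            reset += 1
    return reset


# Made-up registry: two rows need backfill, one already has a species.
sample_registry = [
    {"id": 1, "species_code": None},
    {"id": 2, "species_code": "existing_species"},
    {"id": 3, "species_code": None},
]
sample_captured = backfill_species(sample_registry, "new_species")
```

Rolling back by captured id leaves row 2 untouched even if it happened to carry the same species code, which is the point of prohibiting pattern-matching rollback.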

H. Pilot DIEU-35 — 5C2 scope

(Reasoning unchanged from rev1 §H.1–H.4.)

DIEU-35 is the pilot because the Phase 5A evidence shows it to be the most diverse publication (the most members and the most section types). Member count, section_type count, and render_order metrics are LIVE-RE-DERIVED in 5C2, not hardcoded from Phase 5A.


I. Pilot migration algorithm — 5C2 scope (rev2: no hardcoded gates)

I.1 Preflight (read-only)

All preflight checks use LIVE queries. No hardcoded expected values.

PF-1  Species mapping exists: live count from species_collection_map for target_collection_primary → must be ≥ 1 with is_primary=true.
PF-2  Backfill complete: live count of NULL species in birth_registry for target_collection_primary → must be 0.
PF-3  Vocab coverage: for each DISTINCT section_type in source pilot set (live), verify vocab key exists.
PF-4  Publication_type vocab: verify pilot publication_type has vocab entry.
PF-5  fn_iu_create signature: live introspect pg_catalog.
PF-6  Address collision: live join source pilot set ↔ target IU canonical_address → must be 0.
PF-7  Identity_profile dry-construct: build one sample JSON, verify 5 birth-gate keys present.
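Two of the checks above, PF-3 (vocab coverage) and PF-6 (address collision), reduce to set operations once the live query results are in hand. The sketch below assumes those results have been loaded into Python collections; the field names and sample values are hypothetical.

```python
from typing import Any, Iterable


def missing_vocab_keys(source_rows: Iterable[dict[str, Any]], vocab_keys: Iterable[str]) -> list[str]:
    """PF-3: vocab keys needed by the pilot set's distinct section types but not present."""
    needed = {f"vocab.section_type.{row['section_type']}" for row in source_rows}
    return sorted(needed - set(vocab_keys))


def address_collisions(source_rows: Iterable[dict[str, Any]], existing: Iterable[str]) -> list[str]:
    """PF-6: source canonical addresses that already exist on the target side."""
    return sorted({row["canonical_address"] for row in source_rows} & set(existing))


def run_preflight(
    source_rows: list[dict[str, Any]],
    vocab_keys: Iterable[str],
    existing_addresses: Iterable[str],
) -> dict[str, bool]:
    """Return a pass/fail flag per check; every input is a live query result."""
    return {
        "PF-3": not missing_vocab_keys(source_rows, vocab_keys),
        "PF-6": not address_collisions(source_rows, existing_addresses),
    }


# Made-up inputs: one section type lacks a vocab entry, one address collides.
sample_source = [
    {"section_type": "article", "canonical_address": "X/1"},
    {"section_type": "appendix", "canonical_address": "X/2"},
]
sample_vocab = {"vocab.section_type.article"}
sample_existing = {"X/2", "Y/1"}
sample_result = run_preflight(sample_source, sample_vocab, sample_existing)
```

Returning the offending keys rather than a bare count keeps the preflight report actionable without introducing any expected number into the prompt.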

I.2 Per-row migration

-- PATTERN ONLY — Agent resolves all column names via semantic field registry

source_set := live SELECT from source tables WHERE publication doc_code = pilot_doc_code
              ORDER BY render_order ASC;  -- source render_order; stored as publication_render_order

source_count := count of source_set;  -- live, NOT hardcoded

BEGIN;
FOR each row in source_set LOOP
  -- Build identity_profile JSON (§D.2 pattern)
  -- Call fn_iu_create with p_parent_ref := NULL (D3 LOCKED)
  -- Capture returned IU id + UV id (shape verified on first row)
  -- Patch identity_profile with D3a hierarchy keys
  -- Patch unit_version.content_profile with TAC provenance (source_hashes.tac_v1)
  -- Append ids to rollback capture arrays
END LOOP;

-- Post-loop assertions (ALL live-derived):
ASSERT inserted_iu_count = source_count;
ASSERT inserted_render_orders are contiguous with no duplicates (live min/max/distinct);
ASSERT 0 NULL species in birth_registry for inserted IU ids;
ASSERT fn_iu_verify_invariants returns OK for each inserted IU;
ASSERT content_hash = fn_content_hash(body) for each inserted UV;

-- Persist rollback capture to VPS log + KB report artifact
COMMIT;
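The first two post-loop assertions (count equality and contiguous render orders) can be sketched as plain Python checks over live-derived values. The function names are hypothetical. The sketch does not force the sequence to start at 0, since the pattern above only asks for contiguity and uniqueness.

```python
def render_orders_contiguous(orders: list[int]) -> bool:
    """True when the orders have no duplicates and no gaps between min and max."""
    if not orders:
        return False
    distinct = set(orders)
    no_duplicates = len(distinct) == len(orders)
    no_gaps = max(distinct) - min(distinct) + 1 == len(distinct)
    return no_duplicates and no_gaps


def post_loop_failures(source_count: int, inserted_orders: list[int]) -> list[str]:
    """Return the names of failed assertions; an empty list means safe to COMMIT."""
    failures = []
    if len(inserted_orders) != source_count:
        failures.append("inserted_iu_count != source_count")
    if not render_orders_contiguous(inserted_orders):
        failures.append("render_orders not contiguous")
    return failures
```

A caller would roll back the transaction whenever `post_loop_failures` returns a non-empty list, and proceed to persisting the rollback capture only when it is empty.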

I.3 Render fidelity (post-COMMIT, read-only)

Reconstruct tree from migrated IU identity_profile → compare with TAC source tree. Body text comparison. Render order comparison. Expected drift: 0.
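One way to express the comparison is a per-address diff over parent, render order, and body text. The sketch assumes both sides have been reduced to a dictionary keyed by canonical address with `(parent_address, render_order, body)` tuples as values. That shape is an illustration, not the design's mandated format.

```python
from typing import Optional

Unit = tuple[Optional[str], int, str]  # (parent_address, render_order, body)


def render_drift(source: dict[str, Unit], migrated: dict[str, Unit]) -> list[tuple[str, str]]:
    """Return (address, field) pairs that differ between source and migrated sets."""
    fields = ("parent", "render_order", "body")
    drift: list[tuple[str, str]] = []
    for address in sorted(set(source) | set(migrated)):
        if address not in source or address not in migrated:
            drift.append((address, "missing"))
            continue
        for name, before, after in zip(fields, source[address], migrated[address]):
            if before != after:
                drift.append((address, name))
    return drift
```

An empty list corresponds to the expected drift of 0. Any entry names the exact address and field to investigate.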


J. Safety gates

(Unchanged from rev1 but with rev2 correction: all G-AS assertions use live counts, not hardcoded.)


K. Rollback model — GPT LOCKED (KB + VPS log)

K.1 Rollback capture persistence (LOCKED)

Agent writes captured key lists to:

  • KB artifact: reports/p3d-pack1-phase5c{1|2}-rollback-keys-<date>.md
  • VPS log: /opt/incomex/logs/p3d-pack1-phase5c{1|2}-rollback-keys-<date>.log

No new DB control table in this pack.

K.2–K.4 Exact-key rollback patterns

(Unchanged from rev1 §K.1–K.4: per-step, per-pack rollback via captured keys. Pattern-matching rollback PROHIBITED.)


L. Post-implementation design requirement

(Unchanged from rev1 §L.)


M. Open questions / decisions status (rev2)

CLOSED (GPT LOCKED)

| #   | Question                     | Decision                                   | Rev2 action |
|-----|------------------------------|--------------------------------------------|-------------|
| M.1 | D3 sub-implementation        | D3a primary, D3b deferred, D3c rejected    | LOCKED      |
| M.2 | Composition atom vs molecule | atom; UV ≠ contained entity                | LOCKED      |
| M.5 | 12 existing IU backfill      | Yes, same species, via 5C1 QT-001          | LOCKED      |
| M.6 | governance_role promotion    | Defer, keep observed                       | LOCKED      |
| M.7 | render_order carrier         | identity_profile.publication_render_order  | LOCKED      |
| M.8 | universal_edges readiness    | Deferred post-pilot                        | LOCKED      |

STILL OPEN (require GPT/User resolution before dispatch)

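| #         | Question                                    | Blocks            | Opus proposal                                   |
|-----------|---------------------------------------------|-------------------|-------------------------------------------------|
| M.3       | Species exact code/name/prefix/parent       | 5C1 dispatch      | information_unit_atom / SPE-IUA                 |
| M.4       | Single species E.2.A vs per-unit_kind E.2.B | 5C1 dispatch      | E.2.A (single)                                  |
| M.9 (new) | publication_authority_ref value/source      | 5C2 dispatch only | Fixed scope constant or live lookup from source |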

N. Phase boundaries (rev2: split 5C into 5C1 + 5C2)

| Sub-phase | Scope | Depends on | Allowed actions |
|-----------|-------|------------|-----------------|
| 5B  | THIS design (rev2) + 2 DRAFT prompts + report | Phase 5A PASS | Design only |
| 5C1 | Species seed (QT-005) + QT-001 backfill | 5B accepted + species identity locked by GPT | Agent dispatch; metadata INSERT/UPDATE only |
| 5C2 | DIEU-35 pilot migration (D3a hybrid) | 5C1 completed + publication_authority_ref locked | Agent dispatch; IU/UV migration |
| 5D  | Full batch (DIEU-28 + DIEU-32) | 5C2 accepted | Reuse 5C2 algorithm |
| 5E  | Post-implementation design + optional D3b edges + optional IU publication_member table | 5D accepted | Documentation + follow-on |

Phase 5B Design rev2 | D3a LOCKED | atom LOCKED | Split 5C1+5C2 | No hardcoded gates | Species identity PENDING | 2026-05-11
