P3D Pack 1 Phase 5B — Hybrid Nesting + Species + Pilot Migration Design (rev2)
Date: 2026-05-11 | Rev2: 2026-05-11 (GPT partial-accept + corrections applied)
Author: Opus 4.7 (drafter + critic)
Status: DESIGN, GPT ACCEPTED WITH CORRECTIONS (rev2). No migration, no seed, no backfill, no Agent dispatch.
Strategy: D3 HYBRID (GPT LOCKED)
Pilot: DIEU-35 (GPT LOCKED)
Phase 5C: SPLIT into 5C1 (species/backfill) + 5C2 (DIEU-35 pilot migration) per GPT directive
Mode: DESIGN + EXECUTION-PROMPT-DRAFT ONLY
GPT review: reviews/gpt-review-p3d-pack1-phase5b-design-partial-accept-prompt-draft-not-approved-2026-05-11.md
GPT directive: directives/gpt-directive-opus-p3d-pack1-phase5b-rev2-split-prompts-2026-05-11.md
Rev1 prompt: NOT APPROVED (merged into single dispatch, hardcoded numeric gates, SQL illustrative became hardcode). Rev2 splits into 5C1 + 5C2 DRAFT prompts.
GPT-LOCKED DECISIONS (rev2)
These are no longer open questions. They are design constraints for all downstream artifacts.
nesting_strategy = D3_HYBRID
hierarchy_carrier_primary = identity_profile_json (D3a)
hierarchy_carrier_secondary = universal_edges_deferred_to_post_pilot (D3b)
d3c_parent_as_pointer = REJECTED (indistinguishable from D1 at DB level)
parent_or_container_ref_for_pilot = NULL
composition_for_pilot = atom
unit_version_counts_as_containment = false (convention: subordinates ≠ contained entities per Điều 0-B)
pilot_publication = DIEU-35
collection_governance_role_change = defer_keep_observed
uv_species_mapping = no_for_now
universal_edges_enrichment = defer_to_post_pilot
iu_publication_member_table = defer_to_post_pilot
rollback_capture = KB_report_artifact_plus_VPS_log_no_new_control_table
phase_5c_structure = SPLIT_5C1_species_backfill_THEN_5C2_pilot_migration
STILL UNRESOLVED (placeholders for GPT/User)
species_exact_code = <PENDING GPT/USER — Opus proposal: 'information_unit_atom'>
species_entity_code = <PENDING GPT/USER — Opus proposal: 'SPE-IUA'>
species_display_name = <PENDING GPT/USER — Vietnamese>
species_prefix = <PENDING GPT/USER>
species_depth_in_taxonomy = <PENDING GPT/USER — Opus proposal: depth=1 under infrastructure root>
species_parent_id = <PENDING GPT/USER>
species_kg_metadata = <PENDING GPT/USER>
publication_authority_ref_value = <PENDING GPT/USER — birth-gate-required key; blocks 5C2 only, not 5C1>
A. Executive summary
Phase 5A dry-run rev6 closed with full evidence: 86 TAC logical units across 3 publications (DIEU-28=27, DIEU-32=23, DIEU-35=36), nesting depth of 2, contiguous render_order, 0 address collisions, confirmed hash recipe divergence, satisfiable birth gates, and 40 live species, none of which is semantically a law_unit.
GPT has locked D3 HYBRID: IU rows stay atomic, the TAC parent-child hierarchy is carried in identity_profile JSON, parent_or_container_ref is NULL, composition=atom, and governance_role stays observed.
Rev2 key change: Phase 5C execution is SPLIT into two separate DRAFT prompts per GPT directive:
- 5C1: QT-005 species/mapping prep + QT-001 backfill (no TAC migration)
- 5C2: DIEU-35 pilot migration (requires 5C1 completed and accepted)
Rev1 prompt was NOT APPROVED because it: (a) merged all steps into one dispatch; (b) used hardcoded numeric gates from Phase 5A evidence (36, 12, 35); (c) contained SQL that could become execution hardcode; (d) left species identity and publication_authority_ref unresolved.
B. Accepted evidence from Phase 5A
(Unchanged from rev1 — all numbers from reports/p3d-pack1-phase5-tac-to-iu-migration-dryrun-report.md rev6 PASS.)
B.1 Source/target inventory
TAC logical units : 86 (Phase 5A evidence — NOT an executable gate)
TAC unit versions : 86
TAC publications : 3
Current IU rows : 12 (Phase 5A evidence — NOT an executable gate)
Current UV rows : 19
Address collisions : 0
B.2 Per-publication metrics (Phase 5A evidence — NOT executable gates)
| doc_code | version | members | section_types | depth | render_order |
|---|---|---|---|---|---|
| DIEU-28 | v2.0 | 27 | 7 | 2 | [0..26] |
| DIEU-32 | v1.1 | 23 | 8 | 2 | [0..22] |
| DIEU-35 | v5.2 | 36 | 12 | 2 | [0..35] |
All numbers above are Phase 5A evidence for reference. Executable prompts MUST re-derive counts live.
B.3–B.6
(Unchanged from rev1: nesting facts, hash/provenance, species landscape, birth gate readiness. Refer to rev1 §B.3–B.6.)
C. Why D3 hybrid — GPT LOCKED
(Reasoning unchanged from rev1 §C. Five GPT reasons + Opus exit-ramp rationale all accepted.)
D. D3 hybrid data model — GPT LOCKED
D.1 Sub-options (decision CLOSED)
| Sub-option | Status |
|---|---|
| D3a (identity_profile JSON) | GPT LOCKED as primary |
| D3b (universal_edges) | GPT LOCKED as deferred post-pilot |
| D3c (parent_or_container_ref pointer) | GPT REJECTED |
D.2 Identity_profile JSON keys (D3a hierarchy carrier)
Migration populates these keys. Format (column names are illustrative — Agent resolves live via registry):
-- PATTERN ONLY — Agent resolves all column names via semantic field registry before execution
information_unit.identity_profile = {
// Birth-gate-required keys (existing):
"title" : <from source version title>,
"owner_lookup_ref" : <from source owner field>,
"primary_section_type_ref" : <vocab.section_type.{live section_type value}>,
"publication_authority_ref" : <PENDING GPT/USER — hard blocker for 5C2>,
"publication_type_ref" : <vocab.publication_type.{live publication_type value}>,
// D3a hierarchy carrier keys (Phase 5B addition):
"tac_parent_anchor" : <source parent_id as text, or null for roots>,
"tac_parent_canonical_address" : <parent's canonical_address via lookup, or null>,
"publication_anchor" : <source doc_code>,
"publication_render_order" : <integer from source render_order>,
"tac_sort_order" : <integer from source sort_order>
}
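The key layout above can be sketched as a small builder. This is a minimal illustration only: the source-row field names (`title`, `owner`, `section_type`, `publication_type`, `parent_id`, `doc_code`, `render_order`, `sort_order`) are hypothetical stand-ins for whatever the semantic field registry resolves live, and `authority_ref` is passed in because its value is still pending.

```python
from typing import Any, Dict, Optional

# The five birth-gate-required keys named in D.2.
BIRTH_GATE_KEYS = (
    "title",
    "owner_lookup_ref",
    "primary_section_type_ref",
    "publication_authority_ref",
    "publication_type_ref",
)


def build_identity_profile(
    source_row: Dict[str, Any],
    parent_address: Optional[str],
    authority_ref: str,
) -> Dict[str, Any]:
    """Build one identity_profile dict from a source row (hypothetical field names)."""
    if not authority_ref:
        # publication_authority_ref is a hard blocker for 5C2 while unresolved.
        raise ValueError("publication_authority_ref is unresolved")
    parent_id = source_row.get("parent_id")
    return {
        # Birth-gate-required keys
        "title": source_row["title"],
        "owner_lookup_ref": source_row["owner"],
        "primary_section_type_ref": f"vocab.section_type.{source_row['section_type']}",
        "publication_authority_ref": authority_ref,
        "publication_type_ref": f"vocab.publication_type.{source_row['publication_type']}",
        # D3a hierarchy carrier keys
        "tac_parent_anchor": None if parent_id is None else str(parent_id),
        "tac_parent_canonical_address": parent_address,
        "publication_anchor": source_row["doc_code"],
        "publication_render_order": int(source_row["render_order"]),
        "tac_sort_order": int(source_row["sort_order"]),
    }
```

The same builder could serve the PF-7 dry-construct check by asserting that every entry of `BIRTH_GATE_KEYS` is present in the result.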
D.3 parent_or_container_ref = NULL (GPT LOCKED)
All migrated IU rows have NULL parent_or_container_ref under D3. Hierarchy is in JSON only.
D.4 Render fidelity model
The Nuxt Laws Page reconstructs the tree from the identity_profile JSON keys via a recursive CTE on tac_parent_canonical_address. This is verifiable in the Phase 5C2 render-fidelity test.
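As an illustration of the reconstruction logic, the sketch below walks the same parent pointers in memory rather than with a recursive CTE. It assumes a hypothetical input shape: a dict mapping each unit's canonical address to its identity_profile. Depth is counted from 0 at the roots.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


def build_tree(profiles_by_address: Dict[str, Dict[str, Any]]) -> List[Tuple[int, str]]:
    """Return (depth, canonical_address) pairs, depth-first, siblings in render order."""
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for address, profile in profiles_by_address.items():
        children[profile["tac_parent_canonical_address"]].append(address)
    for siblings in children.values():
        siblings.sort(key=lambda a: profiles_by_address[a]["publication_render_order"])

    ordered: List[Tuple[int, str]] = []

    def walk(parent: Optional[str], depth: int) -> None:
        # Roots are the units whose parent address is None.
        for address in children.get(parent, []):
            ordered.append((depth, address))
            walk(address, depth + 1)

    walk(None, 0)
    return ordered
```

Units whose parent address does not resolve to any known unit are silently left out of the result, so a length check against the input is a cheap way to detect orphans.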
E. Species/composition — GPT LOCKED (atom) + species identity PENDING
E.1 Composition = atom (LOCKED)
GPT accepts: UV/subordinates do not count as contained entities per Điều 0-B convention. All migrated IU rows have composition=atom under D3.
E.2 Species approach = new species (LOCKED direction, identity PENDING)
GPT accepts that a new species is likely needed. Opus recommends E.2.A (a single species covering all IU unit_kinds). The exact species code, display name, prefix, and taxonomy placement remain UNRESOLVED placeholders. Species creation must follow QT-005 with live schema introspection, with no hardcoded INSERT columns.
E.3 governance_role = keep observed (LOCKED)
No QT-005 promotion step during pilot. Promotion deferred to Phase 5E or later.
F. QT-005 species creation — 5C1 scope
QT-005 procedural steps (from Phase 4B addendum §F) execute in Phase 5C1. Species creation uses live schema introspection of entity_species table to discover required columns, constraints, and taxonomy tree. No hardcoded INSERT column list — Agent introspects information_schema.columns WHERE table_name='entity_species' and builds INSERT dynamically.
Species identity values come from GPT/User resolution of §STILL-UNRESOLVED placeholders.
Reversibility: DELETE by captured id.
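A minimal sketch of the dynamic INSERT construction, assuming the Agent has already fetched rows from information_schema.columns as dicts with `column_name`, `is_nullable`, `column_default` and `is_identity`. The `%s` placeholder style (as used by psycopg) and the `id` key column are assumptions, not confirmed details of the live schema.

```python
from typing import Any, Dict, List, Tuple


def build_species_insert(
    columns: List[Dict[str, Any]],
    values: Dict[str, Any],
    key_column: str = "id",  # assumed name of the captured key column
) -> Tuple[str, List[Any]]:
    """Build a parameterized INSERT from introspected column metadata."""
    known = {c["column_name"] for c in columns}
    unknown = sorted(set(values) - known)
    if unknown:
        raise ValueError(f"columns not present in live schema: {unknown}")
    if key_column not in known:
        raise ValueError(f"key column not present in live schema: {key_column}")
    # Required = NOT NULL, no default, and not an identity column.
    required = [
        c["column_name"]
        for c in columns
        if c["is_nullable"] == "NO"
        and c["column_default"] is None
        and c.get("is_identity", "NO") != "YES"
    ]
    missing = sorted(set(required) - set(values))
    if missing:
        raise ValueError(f"required columns without a value: {missing}")
    # Keep the live column order so the statement is deterministic.
    names = [c["column_name"] for c in columns if c["column_name"] in values]
    column_list = ", ".join(f'"{name}"' for name in names)
    placeholders = ", ".join(["%s"] * len(names))
    sql = (
        f"INSERT INTO entity_species ({column_list}) "
        f'VALUES ({placeholders}) RETURNING "{key_column}"'
    )
    return sql, [values[name] for name in names]
```

The returned key is what the reversibility step would later delete by, so the caller should persist it before any further work.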
G. QT-001 backfill — 5C1 scope
Backfill targets existing IU birth_registry rows where species_code IS NULL. Target count is LIVE-DERIVED (not hardcoded "12"). QT-001 5-step procedure with exact-key capture. Rollback via captured keys only.
Sequence within 5C1:
- Species seed (§F)
- Verify mapping exists
- Count backfill targets live
- Execute backfill with RETURNING capture
- Verify 0 NULL remaining
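The last three steps of the sequence can be sketched against an in-memory SQLite table. This is a stand-in only: the table layout (`id`, `collection`, `species_code`) is hypothetical, and the exact keys are captured with a SELECT before an UPDATE by key, in place of the RETURNING clause the live procedure would use.

```python
import sqlite3
from typing import List


def backfill_species(conn: sqlite3.Connection, collection: str, species_code: str) -> List[int]:
    """Backfill NULL species for one collection and return the exact keys touched."""
    cur = conn.cursor()
    # Step 3: derive the target set live; no expected count is hardcoded.
    targets = [
        row[0]
        for row in cur.execute(
            "SELECT id FROM birth_registry "
            "WHERE collection = ? AND species_code IS NULL ORDER BY id",
            (collection,),
        )
    ]
    # Step 4: update by exact key only, never by pattern.
    for row_id in targets:
        cur.execute(
            "UPDATE birth_registry SET species_code = ? "
            "WHERE id = ? AND species_code IS NULL",
            (species_code, row_id),
        )
    # Step 5: verify that no NULL species remain for the collection.
    remaining = cur.execute(
        "SELECT COUNT(*) FROM birth_registry "
        "WHERE collection = ? AND species_code IS NULL",
        (collection,),
    ).fetchone()[0]
    if remaining != 0:
        conn.rollback()
        raise RuntimeError(f"{remaining} rows still have NULL species")
    conn.commit()
    return targets
```

The returned list is the rollback capture: reverting means setting species_code back to NULL for exactly those ids.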
H. Pilot DIEU-35 — 5C2 scope
(Reasoning unchanged from rev1 §H.1–H.4.)
DIEU-35 is the pilot because Phase 5A showed it to be the most diverse publication (most members and most section types). Member count, section_type count, and render_order metrics are LIVE-RE-DERIVED in 5C2, not hardcoded from Phase 5A.
I. Pilot migration algorithm — 5C2 scope (rev2: no hardcoded gates)
I.1 Preflight (read-only)
All preflight checks use LIVE queries. No hardcoded expected values.
PF-1 Species mapping exists: live count from species_collection_map for target_collection_primary → must be ≥ 1 with is_primary=true.
PF-2 Backfill complete: live count of NULL species in birth_registry for target_collection_primary → must be 0.
PF-3 Vocab coverage: for each DISTINCT section_type in source pilot set (live), verify vocab key exists.
PF-4 Publication_type vocab: verify pilot publication_type has vocab entry.
PF-5 fn_iu_create signature: live introspect pg_catalog.
PF-6 Address collision: live join source pilot set ↔ target IU canonical_address → must be 0.
PF-7 Identity_profile dry-construct: build one sample JSON, verify 5 birth-gate keys present.
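Two of these checks, PF-3 and PF-6, reduce to set operations once the live query results are in hand. The sketch below assumes hypothetical inputs: source rows as dicts with `section_type` and `canonical_address`, plus the live vocab keys and existing target addresses as plain collections.

```python
from typing import Any, Dict, Iterable, List


def run_preflight(
    source_rows: List[Dict[str, Any]],
    vocab_keys: Iterable[str],
    existing_addresses: Iterable[str],
) -> List[str]:
    """Return a list of failure messages; an empty list means both checks pass."""
    failures: List[str] = []

    # PF-3: every DISTINCT section_type in the pilot set needs a vocab key.
    needed = {f"vocab.section_type.{row['section_type']}" for row in source_rows}
    missing = sorted(needed - set(vocab_keys))
    if missing:
        failures.append(f"PF-3 missing vocab keys: {missing}")

    # PF-6: no source address may already exist among target IU rows.
    source_addresses = {row["canonical_address"] for row in source_rows}
    collisions = sorted(source_addresses & set(existing_addresses))
    if collisions:
        failures.append(f"PF-6 address collisions: {collisions}")

    return failures
```

Returning messages instead of raising on the first failure lets the preflight report every blocker in a single read-only pass.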
I.2 Per-row migration
-- PATTERN ONLY — Agent resolves all column names via semantic field registry
source_set := live SELECT from source tables WHERE publication doc_code = pilot_doc_code
ORDER BY publication_render_order ASC;
source_count := count of source_set; -- live, NOT hardcoded
BEGIN;
FOR each row in source_set LOOP
-- Build identity_profile JSON (§D.2 pattern)
-- Call fn_iu_create with p_parent_ref := NULL (D3 LOCKED)
-- Capture returned IU id + UV id (shape verified on first row)
-- Patch identity_profile with D3a hierarchy keys
-- Patch unit_version.content_profile with TAC provenance (source_hashes.tac_v1)
-- Append ids to rollback capture arrays
END LOOP;
-- Post-loop assertions (ALL live-derived):
ASSERT inserted_iu_count = source_count;
ASSERT inserted_render_orders are contiguous with no duplicates (live min/max/distinct);
ASSERT 0 NULL species in birth_registry for inserted IU ids;
ASSERT fn_iu_verify_invariants returns OK for each inserted IU;
ASSERT content_hash = fn_content_hash(body) for each inserted UV;
-- Persist rollback capture to VPS log + KB report artifact
COMMIT;
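The count and contiguity assertions can be expressed without any hardcoded expectation. A minimal sketch, assuming the inserted render orders and the live source count have already been collected into Python values:

```python
from typing import Sequence


def assert_render_orders(render_orders: Sequence[int], source_count: int) -> None:
    """Check count, uniqueness and contiguity using only live-derived values."""
    if len(render_orders) != source_count:
        raise AssertionError(
            f"inserted {len(render_orders)} rows, source has {source_count}"
        )
    if not render_orders:
        return  # an empty pilot set is trivially contiguous
    distinct = set(render_orders)
    if len(distinct) != len(render_orders):
        raise AssertionError("duplicate render orders")
    # Contiguous means the span between min and max has no holes.
    if max(distinct) - min(distinct) + 1 != len(distinct):
        raise AssertionError("gap in render orders")
```

The check deliberately does not require the sequence to start at 0, since only min, max and distinct count are compared.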
I.3 Render fidelity (post-COMMIT, read-only)
Reconstruct the tree from the migrated IU identity_profile and compare it with the TAC source tree, including body text and render order. Expected drift: 0.
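One way to quantify drift is to compare both sides unit by unit. The sketch assumes a hypothetical representation in which each side maps a canonical address to a `(parent_address, render_order, body)` tuple.

```python
from typing import Dict, List, Optional, Tuple

Unit = Tuple[Optional[str], int, str]  # (parent_address, render_order, body)


def find_drift(source: Dict[str, Unit], migrated: Dict[str, Unit]) -> List[str]:
    """Return addresses that are missing on either side or differ in any field."""
    drifted = []
    for address in sorted(set(source) | set(migrated)):
        if source.get(address) != migrated.get(address):
            drifted.append(address)
    return drifted
```

An empty result corresponds to the expected drift of 0; any returned address points at a unit whose parent, order or body text needs inspection.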
J. Safety gates
(Unchanged from rev1 but with rev2 correction: all G-AS assertions use live counts, not hardcoded.)
K. Rollback model — GPT LOCKED (KB + VPS log)
K.1 Rollback capture persistence (LOCKED)
Agent writes captured key lists to:
- KB artifact: reports/p3d-pack1-phase5c{1|2}-rollback-keys-<date>.md
- VPS log: /opt/incomex/logs/p3d-pack1-phase5c{1|2}-rollback-keys-<date>.log
No new DB control table in this pack.
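A minimal sketch of the dual persistence step. The function and parameter names are hypothetical, the two target directories are passed in rather than hardcoded, and an ISO date is assumed for the `<date>` part of the file name.

```python
import json
from datetime import date
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def persist_rollback_keys(
    phase: str,                      # e.g. "5c1" or "5c2"
    captured: Dict[str, Any],        # captured key lists, JSON-serializable
    kb_dir: str,
    log_dir: str,
    on: Optional[date] = None,
) -> Tuple[Path, Path]:
    """Write the captured keys to a KB report artifact and append them to a log."""
    stamp = (on or date.today()).isoformat()
    name = f"p3d-pack1-phase{phase}-rollback-keys-{stamp}"
    payload = json.dumps(captured, sort_keys=True)

    kb_path = Path(kb_dir) / f"{name}.md"
    kb_path.write_text(
        f"# Rollback keys {phase} {stamp}\n\n{payload}\n", encoding="utf-8"
    )

    log_path = Path(log_dir) / f"{name}.log"
    # Append so repeated captures on the same day are all retained.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(payload + "\n")

    return kb_path, log_path
```

Writing both copies before COMMIT means a failed write can still abort the transaction, which keeps the database and the rollback record in step.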
K.2–K.4 Exact-key rollback patterns
(Unchanged from rev1 §K.1–K.4: per-step, per-pack rollback via captured keys. Pattern-matching rollback PROHIBITED.)
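To illustrate the difference between exact-key and pattern-matching rollback, the sketch below builds a DELETE that can only touch captured keys. The helper name, the `%s` placeholder style and the table and column names used in the usage note are assumptions for illustration.

```python
import re
from typing import Any, Iterable, List, Tuple

_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def build_exact_key_delete(
    table: str, key_column: str, captured_ids: Iterable[Any]
) -> Tuple[str, List[Any]]:
    """Build a parameterized DELETE restricted to an explicit list of captured keys."""
    for identifier in (table, key_column):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    ids = list(captured_ids)
    if not ids:
        # Without captured keys there is nothing that may be deleted.
        raise ValueError("no captured keys: refusing to build a rollback statement")
    if len(set(ids)) != len(ids):
        raise ValueError("captured keys contain duplicates")
    placeholders = ", ".join(["%s"] * len(ids))
    return f"DELETE FROM {table} WHERE {key_column} IN ({placeholders})", ids
```

For example, `build_exact_key_delete("information_unit", "id", [101, 102])` yields a statement with two placeholders and the two ids as parameters; there is no code path that produces a LIKE or range predicate.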
L. Post-implementation design requirement
(Unchanged from rev1 §L.)
M. Open questions / decisions status (rev2)
CLOSED (GPT LOCKED)
| # | Question | Decision | Rev2 action |
|---|---|---|---|
| M.1 | D3 sub-implementation | D3a primary, D3b deferred, D3c rejected | LOCKED |
| M.2 | Composition atom vs molecule | atom; UV ≠ contained entity | LOCKED |
| M.5 | 12 existing IU backfill | Yes, same species, via 5C1 QT-001 | LOCKED |
| M.6 | governance_role promotion | Defer, keep observed | LOCKED |
| M.7 | render_order carrier | identity_profile.publication_render_order | LOCKED |
| M.8 | universal_edges readiness | Deferred post-pilot | LOCKED |
STILL OPEN (require GPT/User resolution before dispatch)
| # | Question | Blocks | Opus proposal |
|---|---|---|---|
| M.3 | Species exact code/name/prefix/parent | 5C1 dispatch | information_unit_atom / SPE-IUA |
| M.4 | Single species E.2.A vs per-unit_kind E.2.B | 5C1 dispatch | E.2.A (single) |
| M.9 (new) | publication_authority_ref value/source | 5C2 dispatch only | Fixed scope constant or live lookup from source |
N. Phase boundaries (rev2: split 5C into 5C1 + 5C2)
| Sub-phase | Scope | Depends on | Allowed actions |
|---|---|---|---|
| 5B | THIS design (rev2) + 2 DRAFT prompts + report | Phase 5A PASS | Design only |
| 5C1 | Species seed (QT-005) + QT-001 backfill | 5B accepted + species identity locked by GPT | Agent dispatch; metadata INSERT/UPDATE only |
| 5C2 | DIEU-35 pilot migration (D3a hybrid) | 5C1 completed + publication_authority_ref locked | Agent dispatch; IU/UV migration |
| 5D | Full batch (DIEU-28 + DIEU-32) | 5C2 accepted | Reuse 5C2 algorithm |
| 5E | Post-implementation design + optional D3b edges + optional IU publication_member table | 5D accepted | Documentation + follow-on |
Phase 5B Design rev2 | D3a LOCKED | atom LOCKED | Split 5C1+5C2 | No hardcoded gates | Species identity PENDING | 2026-05-11