KB-4E52

P3D Pack 1 Phase 5C2 — DIEU-35 Hybrid Pilot Migration Prompt — DRAFT

19 min read Revision 1
p3dpack1phase5c2promptdraftmigrationd3-hybriddieu-35no-hardcodepilot

P3D Pack 1 Phase 5C2 — DIEU-35 Hybrid Pilot Migration Execution Prompt — DRAFT

Date: 2026-05-11 Author: Opus 4.7 Status: DRAFT — NOT DISPATCH-READY. Requires: (a) Phase 5C1 completed and accepted; (b) publication_authority_ref resolved (§9 placeholder #1); (c) GPT final review. Design: design/p3d-pack1-phase5b-hybrid-nesting-species-pilot-migration-design.md (rev2) GPT directive: directives/gpt-directive-opus-p3d-pack1-phase5b-rev2-split-prompts-2026-05-11.md Prerequisite: prompts/p3d-pack1-phase5c1-species-mapping-qt001-backfill-prompt-DRAFT.md — completed and accepted Scope: DIEU-35 pilot migration under D3a hybrid (atomic IU rows, hierarchy in identity_profile JSON, parent_or_container_ref NULL) Out of scope: Species creation, backfill (done in 5C1), DIEU-28, DIEU-32, universal_edges, IU publication_member table Mode (when dispatched): MIGRATION WRITE — IU/UV rows created via fn_iu_create; single-publication transaction DB target: VPS PostgreSQL (postgres container, db=directus, schema=public)


0. SCOPE BLOCK

# Target family
target_collection_primary       = 'information_unit'
target_collection_subordinate   = 'unit_version'
target_unit_kind                = 'law_unit'
target_vocab_key                = 'vocab.unit_kind.law_unit'

# Source family (READ-ONLY throughout)
source_table_logical            = 'tac_logical_unit'
source_table_version            = 'tac_unit_version'
source_table_publication        = 'tac_publication'
source_table_pub_member         = 'tac_publication_member'

# Pilot (design choice — live re-verify existence)
pilot_doc_code                  = 'DIEU-35'

# D3a hierarchy carrier (GPT LOCKED)
hierarchy_carrier               = 'identity_profile_json'
parent_or_container_ref_value   = NULL

# publication_authority_ref — BLOCKS DISPATCH:
publication_authority_ref_value = '<DECIDED_BY_GPT>'

No other literal identifiers. All column names → semantic field registry resolution + live introspection.


1. MISSION

Migrate all logical units belonging to pilot_doc_code from TAC source into native IU/UV rows. Each IU row is atomic (D3a); TAC parent-child hierarchy is preserved in identity_profile JSON keys. parent_or_container_ref stays NULL. IU content_hash is recomputed by fn_content_hash; original TAC hash preserved as provenance in unit_version.content_profile.source_hashes.tac_v1.

After migration: all pilot IU rows have non-NULL species (auto-assigned from 5C1 mapping), all invariants pass, render fidelity matches TAC source with 0 drift.


2. SEMANTIC FIELD REGISTRY (carried from Phase 5A rev6)

Agent loads the resolved field map from Phase 5A dry-run report (31 concepts, all RESOLVED except 1 informational AMBIGUOUS). The following concepts are used in this prompt:

address_field           → tac_logical_unit: canonical_address; information_unit: canonical_address
parent_ref              → tac_logical_unit: <resolved col>; information_unit: parent_or_container_ref
section_type            → tac_logical_unit: <resolved col>; information_unit: <resolved col>
sort_order              → tac_logical_unit: <resolved col>
content_body            → tac_unit_version: <resolved col>; unit_version: <resolved col>
content_hash            → tac_unit_version: <resolved col>; unit_version: <resolved col>
title_field             → tac_unit_version: <resolved col>; unit_version: <resolved col>
description_field       → tac_unit_version: <resolved col>; unit_version: <resolved col>
provenance_json_profile → unit_version: <resolved col> (jsonb)
publication_ref         → tac_publication_member: <resolved col>
logical_unit_ref        → tac_publication_member: <resolved col>
publication_render_order→ tac_publication_member: <resolved col>
species_identifier      → birth_registry: <resolved col>
collection_table_key    → birth_registry: <resolved col>

Agent MUST re-verify each concept→column resolution via information_schema.columns at prompt start, NOT assume cached names. If any resolution differs from Phase 5A report → ABORT and report the discrepancy.


3. GATE-0 — Read-only pre-state (must pass before any write)

G0-1  All source tables exist (source_table_logical, _version, _publication, _pub_member)
G0-2  All target tables exist (target_collection_primary, target_collection_subordinate)
G0-3  fn_iu_create exists with expected signature (live pg_catalog introspection)
G0-4  fn_content_hash exists
G0-5  fn_iu_verify_invariants exists
G0-6  pilot_doc_code exists in source_table_publication (live SELECT)
G0-7  Semantic field registry re-verified (§2) — all concept→column resolutions match Phase 5A

--- 5C1 completion gates ---
G0-8  species_collection_map has ≥1 is_primary=true row for target_collection_primary (live count)
G0-9  birth_registry has 0 NULL species rows for target_collection_primary (live count)
      If > 0 → 5C1 incomplete or rolled back → ABORT.

If any G0 fails → ABORT, report, return to GPT/User.


4. PREFLIGHT (read-only — live-derived, no hardcoded numbers)

PF-1  Source set discovery:
      SELECT count(*) AS source_count
      FROM <source_table_logical> lu
      JOIN <source_table_pub_member> pm ON pm.<logical_unit_ref> = lu.id
      JOIN <source_table_publication> p ON p.id = pm.<publication_ref>
      WHERE p.<doc_code col> = pilot_doc_code;

      → Store source_count. Report it.
      (Phase 5A evidence was 36 for DIEU-35 — reference only, NOT a gate.)

PF-2  Source section_type diversity (live):
      SELECT DISTINCT lu.<section_type col> AS st
      FROM same join
      WHERE p.<doc_code col> = pilot_doc_code;

      → Store distinct_section_types array. Report count.
      (Phase 5A evidence was 12 — reference only.)

PF-3  Vocab coverage for each live section_type:
      FOR each st in distinct_section_types:
        SELECT count(*) FROM <dot_config table>
        WHERE <config key col> = 'vocab.section_type.' || st;
        → must be ≥ 1.
      Also verify: 'vocab.publication_type.law' exists.
      Also verify: target_vocab_key exists.

PF-4  Address collision check (live):
      SELECT count(*) FROM <source_table_logical> lu
      JOIN <source_table_pub_member> pm ON pm.<logical_unit_ref> = lu.id
      JOIN <source_table_publication> p ON p.id = pm.<publication_ref>
      JOIN <target_collection_primary> iu ON iu.<address_field> = lu.<address_field>
      WHERE p.<doc_code col> = pilot_doc_code;
      → must be 0.

PF-5  Render_order metrics (live):
      SELECT min(pm.<publication_render_order>) AS ro_min,
             max(pm.<publication_render_order>) AS ro_max,
             count(DISTINCT pm.<publication_render_order>) AS ro_distinct
      FROM same join
      WHERE p.<doc_code col> = pilot_doc_code;
      → Store ro_min, ro_max, ro_distinct.
      ASSERT ro_distinct = source_count (contiguous, no duplicates).
      (Phase 5A evidence: min=0, max=35, distinct=36 — reference only.)

PF-6  Identity_profile dry-construct (one sample row):
      Build sample identity_profile JSON for the first source row (render_order = ro_min).
      Verify 5 birth-gate-required keys are present:
        title, owner_lookup_ref, primary_section_type_ref, publication_authority_ref, publication_type_ref.
      No INSERT — pure jsonb_build_object test.
      publication_authority_ref_value must be resolved from §9 placeholder #1.

PF-7  fn_iu_create return shape discovery:
      Call fn_iu_create_plan (the planner, not the writer) with the sample row's values.
      Inspect the returned jsonb to discover key names for IU id and UV id.
      Store key_iu_id and key_uv_id for use in the migration loop.
      If fn_iu_create_plan is not available or does not reveal shape → call fn_iu_create on first row
      inside an explicit SAVEPOINT and inspect RETURNING, then ROLLBACK TO SAVEPOINT if shape is wrong.

If any PF fails → ABORT, report, return to GPT/User.


5. MIGRATION TRANSACTION (single COMMIT for pilot_doc_code)

-- PATTERN ONLY — Agent resolves ALL column names via §2 registry + live introspection
-- Agent MUST NOT copy this SQL verbatim

source_set := SELECT lu.id, lu.<address_field>, lu.<section_type col>, lu.<sort_order col>,
                     lu.<owner col>, lu.<parent_ref col>,
                     uv.id AS tac_uv_id, uv.<title_field>, uv.<content_body>,
                     uv.<description_field>, uv.<content_hash col> AS tac_v1_hash,
                     pm.<publication_render_order> AS render_order,
                     p.id AS tac_pub_id, p.<doc_code col>, p.<publication_type col>
              FROM source tables (4-way join as in PF-1)
              WHERE p.<doc_code col> = pilot_doc_code
              ORDER BY pm.<publication_render_order> ASC;

capture_iu_ids := empty array;
capture_uv_ids := empty array;

BEGIN;

FOR each row in source_set LOOP

  -- A. Look up parent canonical_address (for D3a JSON)
  parent_canon := SELECT lu2.<address_field> FROM <source_table_logical> lu2
                  WHERE lu2.id = row.<parent_ref col>;  -- NULL for roots

  -- B. Build identity_profile JSON
  v_identity_profile := {
    "title"                       : row.<title_field>,
    "owner_lookup_ref"            : row.<owner col>,
    "primary_section_type_ref"    : 'vocab.section_type.' || row.<section_type col>,
    "publication_authority_ref"   : publication_authority_ref_value,   -- from §9
    "publication_type_ref"        : 'vocab.publication_type.' || row.<publication_type col>,
    "tac_parent_anchor"           : CASE WHEN row.<parent_ref col> IS NULL THEN null ELSE row.<parent_ref col>::text END,
    "tac_parent_canonical_address": parent_canon,
    "publication_anchor"          : row.<doc_code col>,
    "publication_render_order"    : row.render_order,
    "tac_sort_order"              : row.<sort_order col>
  };

  -- C. Call fn_iu_create (D3: p_parent_ref := NULL)
  v_result := fn_iu_create(
    p_canonical_address := row.<address_field>,
    p_title             := row.<title_field>,
    p_body              := row.<content_body>,
    p_actor             := 'phase5c2_migration_' || pilot_doc_code,
    p_unit_kind         := target_unit_kind,
    p_section_type      := row.<section_type col>,
    p_owner_ref         := row.<owner col>,
    p_publication_type  := row.<publication_type col>,
    p_parent_ref        := NULL
  );

  new_iu_id := (v_result->>key_iu_id)::uuid;   -- key_iu_id from PF-7
  new_uv_id := (v_result->>key_uv_id)::uuid;

  -- D. Merge identity_profile (preserve birth-gate defaults, add D3a keys)
  UPDATE <target_collection_primary>
    SET <identity_profile col> = COALESCE(<identity_profile col>, '{}'::jsonb) || v_identity_profile
    WHERE id = new_iu_id;

  -- E. Patch unit_version content_profile with TAC provenance
  UPDATE <target_collection_subordinate>
    SET <provenance_json_profile col> = COALESCE(<provenance_json_profile col>, '{}'::jsonb) || jsonb_build_object(
      'source_hashes', jsonb_build_object('tac_v1', row.tac_v1_hash),
      'source_refs',   jsonb_build_object(
        'tac_logical_unit_id', row.id,
        'tac_unit_version_id', row.tac_uv_id,
        'tac_publication_id' , row.tac_pub_id,
        'tac_doc_code'       , row.<doc_code col>,
        'tac_render_order'   , row.render_order
      )
    )
    WHERE id = new_uv_id;

  -- F. Capture for rollback
  capture_iu_ids := append(capture_iu_ids, new_iu_id);
  capture_uv_ids := append(capture_uv_ids, new_uv_id);

END LOOP;

-- G. POST-LOOP ASSERTIONS (all live-derived)

inserted_count := array_length(capture_iu_ids);
ASSERT inserted_count = source_count;   -- both live

-- Render_order contiguous check on inserted rows:
SELECT min((<identity_profile col>->>'publication_render_order')::int) AS iro_min,
       max((<identity_profile col>->>'publication_render_order')::int) AS iro_max,
       count(DISTINCT (<identity_profile col>->>'publication_render_order')::int) AS iro_distinct
FROM <target_collection_primary>
WHERE id = ANY(capture_iu_ids);
ASSERT iro_distinct = inserted_count;
ASSERT iro_min = ro_min;   -- from PF-5 (live)
ASSERT iro_max = ro_max;

-- Birth species non-null for all inserted:
SELECT count(*) FROM <birth_registry>
WHERE <collection_table_key col> = target_collection_primary
  AND <birth_entity col>::uuid = ANY(capture_iu_ids)
  AND <species_identifier col> IS NULL;
→ must be 0.

-- Invariants pass for all inserted:
FOR each addr in (SELECT <address_field> FROM <target_collection_primary> WHERE id = ANY(capture_iu_ids))
  ASSERT fn_iu_verify_invariants(addr)->>'status' = 'OK';
END LOOP;

-- Content hash correct for all inserted UVs:
SELECT count(*) FROM <target_collection_subordinate>
WHERE id = ANY(capture_uv_ids)
  AND <content_hash col> <> fn_content_hash(<content_body col>);
→ must be 0.

-- H. Persist rollback capture BEFORE COMMIT
-- Write capture_iu_ids + capture_uv_ids to:
--   KB: reports/p3d-pack1-phase5c2-rollback-keys-<date>.md
--   VPS: /opt/incomex/logs/p3d-pack1-phase5c2-rollback-keys-<date>.log

COMMIT;

6. RENDER FIDELITY CHECK (post-COMMIT, read-only)

RF-1  Tree structure:
      Reconstruct DIEU-35 tree from migrated IU rows by joining on
      identity_profile->>'tac_parent_canonical_address' = <address_field>.
      Compare parent→child pairs with TAC source tree.
      Expected: 0 mismatched pairs, 0 missing, 0 extra.

RF-2  Body content:
      For each migrated IU+UV, compare unit_version.<content_body> to
      corresponding tac_unit_version.<content_body>.
      Expected: 100% match.

RF-3  Render order:
      SELECT <address_field> FROM <target_collection_primary>
      WHERE (identity_profile->>'publication_anchor') = pilot_doc_code
      ORDER BY (identity_profile->>'publication_render_order')::int;

      Compare sequence against:
      SELECT lu.<address_field> FROM source 4-way join
      WHERE p.<doc_code col> = pilot_doc_code
      ORDER BY pm.<publication_render_order>;

      Expected: identical sequence.

RF-4  Source untouched:
      Re-count all 4 source tables. Compare to GATE-0 pre-state.
      Expected: identical counts.

If RF-1..4 PASS → pilot ACCEPTED; report; await GPT/User confirmation for Phase 5D.
If any RF fails → execute post-COMMIT rollback (§7.B).

7. ROLLBACK MODEL

7.A In-transaction rollback

Any assertion failure inside the transaction → automatic ROLLBACK. Pre-state restored.

7.B Post-COMMIT rollback (RF failure or GPT/User reject)

-- PATTERN ONLY — column names from registry

BEGIN;
DELETE FROM <birth_registry>
  WHERE <collection_table_key col> = target_collection_primary
    AND <birth_entity col>::uuid = ANY(<capture_iu_ids>);
DELETE FROM <target_collection_subordinate> WHERE id = ANY(<capture_uv_ids>);
DELETE FROM <target_collection_primary> WHERE id = ANY(<capture_iu_ids>);
COMMIT;

Pattern-matching rollback (e.g., WHERE publication_anchor='DIEU-35') is PROHIBITED. Exact captured keys only.

7.C Source tables

NEVER touched. No DELETE/UPDATE on source_table_* at any point.

7.D Capture persistence (GPT LOCKED)

KB artifact + VPS log. No new DB control table.


8. RÀNG BUỘC LUẬT (binding)

Điều 0-B     composition = atom under D3 (hierarchy in JSON, not containment)
Điều 0-G     species auto-assigned at IU birth (mapping from 5C1)
Điều 29      IU rows join existing classification system
Điều 44 §A   No code change; metadata-driven migration
§0-AU        All column names via semantic field registry / live introspection; all counts live
§FIX_GỐC    fn_iu_create is the canonical IU writer; migration uses it, not raw INSERT
§0-AF        Evidence = production-style live verification
NT-13        PG-first
NT-14        Prompt structured for executable dispatch (after §9 resolved + GPT review)

9. PLACEHOLDERS REQUIRING RESOLUTION BEFORE DISPATCH

#1  publication_authority_ref_value
    Birth-gate layer1 requires identity_profile key "publication_authority_ref".
    TAC publications don't expose an obvious authority field.
    Options:
      A. Fixed scope constant, e.g., 'vocab.publication_authority.incomex_council'
         (requires GPT/User to verify that this vocab entry exists or will be created)
      B. Live lookup from TAC publication metadata (if an authority-like column exists)
      C. Derive from doc_code or owner field
    Status: PENDING GPT/USER.

    NOTE: This placeholder blocks 5C2 dispatch ONLY. 5C1 (species/backfill) is independent.

When #1 is resolved, Opus produces prompt rev1 (dispatch-ready).


10. REPORT REQUIREMENTS (when dispatched, future)

Save to: knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c2-dieu35-pilot-migration-report.md

Required sections:

  • A. Status flags
  • B. GATE-0 + PF live evidence
  • C. Registry re-verification results
  • D. Migration: source_count, inserted_count, capture lists
  • E. Post-loop assertions (all PASS/FAIL)
  • F. Render fidelity (RF-1..4)
  • G. Source untouched verification
  • H. Rollback key artifact paths
  • I. Open issues / next phase (5D)

Status flag template:

phase5c2_status                  = PASS | PARTIAL | BLOCKED
pilot_doc_code                   = DIEU-35
source_count                     = <live>
inserted_iu_count                = <live>
inserted_uv_count                = <live>
render_order_contiguous          = true | false
birth_species_all_non_null       = true | false
invariants_all_ok                = true | false
content_hash_all_correct         = true | false
render_fidelity_drift            = 0 | <count>
source_tables_untouched          = true | false
rollback_keys_persisted          = true | false
post_implementation_design_due   = true
phase5d_unblocked                = true | false

11. SAFETY SUMMARY

Writes: IU rows + UV rows + auto-birth rows (via fn_iu_create).
No direct INSERT to information_unit or unit_version — fn_iu_create handles all.
identity_profile and content_profile patched via UPDATE after fn_iu_create returns.
Single transaction for pilot_doc_code; ROLLBACK on any assertion failure.
Post-COMMIT rollback via exact captured keys.
TAC source: NEVER written.
All column names from registry + live introspection.
All counts live-derived.
No hardcoded numeric gates.

Phase 5C2 Execution Prompt — DRAFT | D3a hybrid | DIEU-35 pilot | publication_authority_ref PENDING | No hardcoded gates | 2026-05-11

Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/prompts/p3d-pack1-phase5c2-dieu35-hybrid-pilot-migration-prompt-DRAFT.md