P3D Pack 1 Phase 5C2 — DIEU-35 Hybrid Pilot Migration Prompt — DRAFT
P3D Pack 1 Phase 5C2 — DIEU-35 Hybrid Pilot Migration Execution Prompt — DRAFT
Date: 2026-05-11 Author: Opus 4.7 Status: DRAFT — NOT DISPATCH-READY. Requires: (a) Phase 5C1 completed and accepted; (b) publication_authority_ref resolved (§9 placeholder #1); (c) GPT final review. Design:
design/p3d-pack1-phase5b-hybrid-nesting-species-pilot-migration-design.md(rev2) GPT directive:directives/gpt-directive-opus-p3d-pack1-phase5b-rev2-split-prompts-2026-05-11.mdPrerequisite:prompts/p3d-pack1-phase5c1-species-mapping-qt001-backfill-prompt-DRAFT.md— completed and accepted Scope: DIEU-35 pilot migration under D3a hybrid (atomic IU rows, hierarchy in identity_profile JSON, parent_or_container_ref NULL) Out of scope: Species creation, backfill (done in 5C1), DIEU-28, DIEU-32, universal_edges, IU publication_member table Mode (when dispatched): MIGRATION WRITE — IU/UV rows created via fn_iu_create; single-publication transaction DB target: VPS PostgreSQL (postgres container, db=directus, schema=public)
0. SCOPE BLOCK
# Target family
target_collection_primary = 'information_unit'
target_collection_subordinate = 'unit_version'
target_unit_kind = 'law_unit'
target_vocab_key = 'vocab.unit_kind.law_unit'
# Source family (READ-ONLY throughout)
source_table_logical = 'tac_logical_unit'
source_table_version = 'tac_unit_version'
source_table_publication = 'tac_publication'
source_table_pub_member = 'tac_publication_member'
# Pilot (design choice — live re-verify existence)
pilot_doc_code = 'DIEU-35'
# D3a hierarchy carrier (GPT LOCKED)
hierarchy_carrier = 'identity_profile_json'
parent_or_container_ref_value = NULL
# publication_authority_ref — BLOCKS DISPATCH:
publication_authority_ref_value = '<DECIDED_BY_GPT>'
No other literal identifiers. All column names → semantic field registry resolution + live introspection.
1. MISSION
Migrate all logical units belonging to pilot_doc_code from TAC source into native IU/UV rows. Each IU row is atomic (D3a); TAC parent-child hierarchy is preserved in identity_profile JSON keys. parent_or_container_ref stays NULL. IU content_hash is recomputed by fn_content_hash; original TAC hash preserved as provenance in unit_version.content_profile.source_hashes.tac_v1.
After migration: all pilot IU rows have non-NULL species (auto-assigned from 5C1 mapping), all invariants pass, render fidelity matches TAC source with 0 drift.
2. SEMANTIC FIELD REGISTRY (carried from Phase 5A rev6)
Agent loads the resolved field map from Phase 5A dry-run report (31 concepts, all RESOLVED except 1 informational AMBIGUOUS). The following concepts are used in this prompt:
address_field → tac_logical_unit: canonical_address; information_unit: canonical_address
parent_ref → tac_logical_unit: <resolved col>; information_unit: parent_or_container_ref
section_type → tac_logical_unit: <resolved col>; information_unit: <resolved col>
sort_order → tac_logical_unit: <resolved col>
content_body → tac_unit_version: <resolved col>; unit_version: <resolved col>
content_hash → tac_unit_version: <resolved col>; unit_version: <resolved col>
title_field → tac_unit_version: <resolved col>; unit_version: <resolved col>
description_field → tac_unit_version: <resolved col>; unit_version: <resolved col>
provenance_json_profile → unit_version: <resolved col> (jsonb)
publication_ref → tac_publication_member: <resolved col>
logical_unit_ref → tac_publication_member: <resolved col>
publication_render_order→ tac_publication_member: <resolved col>
species_identifier → birth_registry: <resolved col>
collection_table_key → birth_registry: <resolved col>
Agent MUST re-verify each concept→column resolution via information_schema.columns at prompt start, NOT assume cached names. If any resolution differs from Phase 5A report → ABORT and report the discrepancy.
3. GATE-0 — Read-only pre-state (must pass before any write)
G0-1 All source tables exist (source_table_logical, _version, _publication, _pub_member)
G0-2 All target tables exist (target_collection_primary, target_collection_subordinate)
G0-3 fn_iu_create exists with expected signature (live pg_catalog introspection)
G0-4 fn_content_hash exists
G0-5 fn_iu_verify_invariants exists
G0-6 pilot_doc_code exists in source_table_publication (live SELECT)
G0-7 Semantic field registry re-verified (§2) — all concept→column resolutions match Phase 5A
--- 5C1 completion gates ---
G0-8 species_collection_map has ≥1 is_primary=true row for target_collection_primary (live count)
G0-9 birth_registry has 0 NULL species rows for target_collection_primary (live count)
If > 0 → 5C1 incomplete or rolled back → ABORT.
If any G0 fails → ABORT, report, return to GPT/User.
4. PREFLIGHT (read-only — live-derived, no hardcoded numbers)
PF-1 Source set discovery:
SELECT count(*) AS source_count
FROM <source_table_logical> lu
JOIN <source_table_pub_member> pm ON pm.<logical_unit_ref> = lu.id
JOIN <source_table_publication> p ON p.id = pm.<publication_ref>
WHERE p.<doc_code col> = pilot_doc_code;
→ Store source_count. Report it.
(Phase 5A evidence was 36 for DIEU-35 — reference only, NOT a gate.)
PF-2 Source section_type diversity (live):
SELECT DISTINCT lu.<section_type col> AS st
FROM same join
WHERE p.<doc_code col> = pilot_doc_code;
→ Store distinct_section_types array. Report count.
(Phase 5A evidence was 12 — reference only.)
PF-3 Vocab coverage for each live section_type:
FOR each st in distinct_section_types:
SELECT count(*) FROM <dot_config table>
WHERE <config key col> = 'vocab.section_type.' || st;
→ must be ≥ 1.
Also verify: 'vocab.publication_type.law' exists.
Also verify: target_vocab_key exists.
PF-4 Address collision check (live):
SELECT count(*) FROM <source_table_logical> lu
JOIN <source_table_pub_member> pm ON pm.<logical_unit_ref> = lu.id
JOIN <source_table_publication> p ON p.id = pm.<publication_ref>
JOIN <target_collection_primary> iu ON iu.<address_field> = lu.<address_field>
WHERE p.<doc_code col> = pilot_doc_code;
→ must be 0.
PF-5 Render_order metrics (live):
SELECT min(pm.<publication_render_order>) AS ro_min,
max(pm.<publication_render_order>) AS ro_max,
count(DISTINCT pm.<publication_render_order>) AS ro_distinct
FROM same join
WHERE p.<doc_code col> = pilot_doc_code;
→ Store ro_min, ro_max, ro_distinct.
ASSERT ro_distinct = source_count (contiguous, no duplicates).
(Phase 5A evidence: min=0, max=35, distinct=36 — reference only.)
PF-6 Identity_profile dry-construct (one sample row):
Build sample identity_profile JSON for the first source row (render_order = ro_min).
Verify 5 birth-gate-required keys are present:
title, owner_lookup_ref, primary_section_type_ref, publication_authority_ref, publication_type_ref.
No INSERT — pure jsonb_build_object test.
publication_authority_ref_value must be resolved from §9 placeholder #1.
PF-7 fn_iu_create return shape discovery:
Call fn_iu_create_plan (the planner, not the writer) with the sample row's values.
Inspect the returned jsonb to discover key names for IU id and UV id.
Store key_iu_id and key_uv_id for use in the migration loop.
If fn_iu_create_plan is not available or does not reveal shape → call fn_iu_create on first row
inside an explicit SAVEPOINT and inspect RETURNING, then ROLLBACK TO SAVEPOINT if shape is wrong.
If any PF fails → ABORT, report, return to GPT/User.
5. MIGRATION TRANSACTION (single COMMIT for pilot_doc_code)
-- PATTERN ONLY — Agent resolves ALL column names via §2 registry + live introspection
-- Agent MUST NOT copy this SQL verbatim
source_set := SELECT lu.id, lu.<address_field>, lu.<section_type col>, lu.<sort_order col>,
lu.<owner col>, lu.<parent_ref col>,
uv.id AS tac_uv_id, uv.<title_field>, uv.<content_body>,
uv.<description_field>, uv.<content_hash col> AS tac_v1_hash,
pm.<publication_render_order> AS render_order,
p.id AS tac_pub_id, p.<doc_code col>, p.<publication_type col>
FROM source tables (4-way join as in PF-1)
WHERE p.<doc_code col> = pilot_doc_code
ORDER BY pm.<publication_render_order> ASC;
capture_iu_ids := empty array;
capture_uv_ids := empty array;
BEGIN;
FOR each row in source_set LOOP
-- A. Look up parent canonical_address (for D3a JSON)
parent_canon := SELECT lu2.<address_field> FROM <source_table_logical> lu2
WHERE lu2.id = row.<parent_ref col>; -- NULL for roots
-- B. Build identity_profile JSON
v_identity_profile := {
"title" : row.<title_field>,
"owner_lookup_ref" : row.<owner col>,
"primary_section_type_ref" : 'vocab.section_type.' || row.<section_type col>,
"publication_authority_ref" : publication_authority_ref_value, -- from §9
"publication_type_ref" : 'vocab.publication_type.' || row.<publication_type col>,
"tac_parent_anchor" : CASE WHEN row.<parent_ref col> IS NULL THEN null ELSE row.<parent_ref col>::text END,
"tac_parent_canonical_address": parent_canon,
"publication_anchor" : row.<doc_code col>,
"publication_render_order" : row.render_order,
"tac_sort_order" : row.<sort_order col>
};
-- C. Call fn_iu_create (D3: p_parent_ref := NULL)
v_result := fn_iu_create(
p_canonical_address := row.<address_field>,
p_title := row.<title_field>,
p_body := row.<content_body>,
p_actor := 'phase5c2_migration_' || pilot_doc_code,
p_unit_kind := target_unit_kind,
p_section_type := row.<section_type col>,
p_owner_ref := row.<owner col>,
p_publication_type := row.<publication_type col>,
p_parent_ref := NULL
);
new_iu_id := (v_result->>key_iu_id)::uuid; -- key_iu_id from PF-7
new_uv_id := (v_result->>key_uv_id)::uuid;
-- D. Merge identity_profile (preserve birth-gate defaults, add D3a keys)
UPDATE <target_collection_primary>
SET <identity_profile col> = COALESCE(<identity_profile col>, '{}'::jsonb) || v_identity_profile
WHERE id = new_iu_id;
-- E. Patch unit_version content_profile with TAC provenance
UPDATE <target_collection_subordinate>
SET <provenance_json_profile col> = COALESCE(<provenance_json_profile col>, '{}'::jsonb) || jsonb_build_object(
'source_hashes', jsonb_build_object('tac_v1', row.tac_v1_hash),
'source_refs', jsonb_build_object(
'tac_logical_unit_id', row.id,
'tac_unit_version_id', row.tac_uv_id,
'tac_publication_id' , row.tac_pub_id,
'tac_doc_code' , row.<doc_code col>,
'tac_render_order' , row.render_order
)
)
WHERE id = new_uv_id;
-- F. Capture for rollback
capture_iu_ids := append(capture_iu_ids, new_iu_id);
capture_uv_ids := append(capture_uv_ids, new_uv_id);
END LOOP;
-- G. POST-LOOP ASSERTIONS (all live-derived)
inserted_count := array_length(capture_iu_ids);
ASSERT inserted_count = source_count; -- both live
-- Render_order contiguous check on inserted rows:
SELECT min((<identity_profile col>->>'publication_render_order')::int) AS iro_min,
max((<identity_profile col>->>'publication_render_order')::int) AS iro_max,
count(DISTINCT (<identity_profile col>->>'publication_render_order')::int) AS iro_distinct
FROM <target_collection_primary>
WHERE id = ANY(capture_iu_ids);
ASSERT iro_distinct = inserted_count;
ASSERT iro_min = ro_min; -- from PF-5 (live)
ASSERT iro_max = ro_max;
-- Birth species non-null for all inserted:
SELECT count(*) FROM <birth_registry>
WHERE <collection_table_key col> = target_collection_primary
AND <birth_entity col>::uuid = ANY(capture_iu_ids)
AND <species_identifier col> IS NULL;
→ must be 0.
-- Invariants pass for all inserted:
FOR each addr in (SELECT <address_field> FROM <target_collection_primary> WHERE id = ANY(capture_iu_ids))
ASSERT fn_iu_verify_invariants(addr)->>'status' = 'OK';
END LOOP;
-- Content hash correct for all inserted UVs:
SELECT count(*) FROM <target_collection_subordinate>
WHERE id = ANY(capture_uv_ids)
AND <content_hash col> <> fn_content_hash(<content_body col>);
→ must be 0.
-- H. Persist rollback capture BEFORE COMMIT
-- Write capture_iu_ids + capture_uv_ids to:
-- KB: reports/p3d-pack1-phase5c2-rollback-keys-<date>.md
-- VPS: /opt/incomex/logs/p3d-pack1-phase5c2-rollback-keys-<date>.log
COMMIT;
6. RENDER FIDELITY CHECK (post-COMMIT, read-only)
RF-1 Tree structure:
Reconstruct DIEU-35 tree from migrated IU rows by joining on
identity_profile->>'tac_parent_canonical_address' = <address_field>.
Compare parent→child pairs with TAC source tree.
Expected: 0 mismatched pairs, 0 missing, 0 extra.
RF-2 Body content:
For each migrated IU+UV, compare unit_version.<content_body> to
corresponding tac_unit_version.<content_body>.
Expected: 100% match.
RF-3 Render order:
SELECT <address_field> FROM <target_collection_primary>
WHERE (identity_profile->>'publication_anchor') = pilot_doc_code
ORDER BY (identity_profile->>'publication_render_order')::int;
Compare sequence against:
SELECT lu.<address_field> FROM source 4-way join
WHERE p.<doc_code col> = pilot_doc_code
ORDER BY pm.<publication_render_order>;
Expected: identical sequence.
RF-4 Source untouched:
Re-count all 4 source tables. Compare to GATE-0 pre-state.
Expected: identical counts.
If RF-1..4 PASS → pilot ACCEPTED; report; await GPT/User confirmation for Phase 5D.
If any RF fails → execute post-COMMIT rollback (§7.B).
7. ROLLBACK MODEL
7.A In-transaction rollback
Any assertion failure inside the transaction → automatic ROLLBACK. Pre-state restored.
7.B Post-COMMIT rollback (RF failure or GPT/User reject)
-- PATTERN ONLY — column names from registry
BEGIN;
DELETE FROM <birth_registry>
WHERE <collection_table_key col> = target_collection_primary
AND <birth_entity col>::uuid = ANY(<capture_iu_ids>);
DELETE FROM <target_collection_subordinate> WHERE id = ANY(<capture_uv_ids>);
DELETE FROM <target_collection_primary> WHERE id = ANY(<capture_iu_ids>);
COMMIT;
Pattern-matching rollback (e.g., WHERE publication_anchor='DIEU-35') is PROHIBITED. Exact captured keys only.
7.C Source tables
NEVER touched. No DELETE/UPDATE on source_table_* at any point.
7.D Capture persistence (GPT LOCKED)
KB artifact + VPS log. No new DB control table.
8. RÀNG BUỘC LUẬT (binding)
Điều 0-B composition = atom under D3 (hierarchy in JSON, not containment)
Điều 0-G species auto-assigned at IU birth (mapping from 5C1)
Điều 29 IU rows join existing classification system
Điều 44 §A No code change; metadata-driven migration
§0-AU All column names via semantic field registry / live introspection; all counts live
§FIX_GỐC fn_iu_create is the canonical IU writer; migration uses it, not raw INSERT
§0-AF Evidence = production-style live verification
NT-13 PG-first
NT-14 Prompt structured for executable dispatch (after §9 resolved + GPT review)
9. PLACEHOLDERS REQUIRING RESOLUTION BEFORE DISPATCH
#1 publication_authority_ref_value
Birth-gate layer1 requires identity_profile key "publication_authority_ref".
TAC publications don't expose an obvious authority field.
Options:
A. Fixed scope constant, e.g., 'vocab.publication_authority.incomex_council'
(requires GPT/User to verify that this vocab entry exists or will be created)
B. Live lookup from TAC publication metadata (if an authority-like column exists)
C. Derive from doc_code or owner field
Status: PENDING GPT/USER.
NOTE: This placeholder blocks 5C2 dispatch ONLY. 5C1 (species/backfill) is independent.
When #1 is resolved, Opus produces prompt rev1 (dispatch-ready).
10. REPORT REQUIREMENTS (when dispatched, future)
Save to: knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c2-dieu35-pilot-migration-report.md
Required sections:
- A. Status flags
- B. GATE-0 + PF live evidence
- C. Registry re-verification results
- D. Migration: source_count, inserted_count, capture lists
- E. Post-loop assertions (all PASS/FAIL)
- F. Render fidelity (RF-1..4)
- G. Source untouched verification
- H. Rollback key artifact paths
- I. Open issues / next phase (5D)
Status flag template:
phase5c2_status = PASS | PARTIAL | BLOCKED
pilot_doc_code = DIEU-35
source_count = <live>
inserted_iu_count = <live>
inserted_uv_count = <live>
render_order_contiguous = true | false
birth_species_all_non_null = true | false
invariants_all_ok = true | false
content_hash_all_correct = true | false
render_fidelity_drift = 0 | <count>
source_tables_untouched = true | false
rollback_keys_persisted = true | false
post_implementation_design_due = true
phase5d_unblocked = true | false
11. SAFETY SUMMARY
Writes: IU rows + UV rows + auto-birth rows (via fn_iu_create).
No direct INSERT to information_unit or unit_version — fn_iu_create handles all.
identity_profile and content_profile patched via UPDATE after fn_iu_create returns.
Single transaction for pilot_doc_code; ROLLBACK on any assertion failure.
Post-COMMIT rollback via exact captured keys.
TAC source: NEVER written.
All column names from registry + live introspection.
All counts live-derived.
No hardcoded numeric gates.
Phase 5C2 Execution Prompt — DRAFT | D3a hybrid | DIEU-35 pilot | publication_authority_ref PENDING | No hardcoded gates | 2026-05-11