P3D Pack 1 Phase 5B → 5C — Hybrid Pilot Migration Execution Prompt — DRAFT
P3D Pack 1 Phase 5B → 5C — Hybrid Pilot Migration Execution Prompt — DRAFT
Date: 2026-05-11 Author: Opus 4.7 (drafter) Status: DRAFT — NOT DISPATCH-READY. Requires GPT/User review and resolution of §10 placeholders before any Agent dispatch. Design reference:
design/p3d-pack1-phase5b-hybrid-nesting-species-pilot-migration-design.mdStrategy: D3 HYBRID (per GPT review 2026-05-11) — D3a (JSON in identity_profile) primary, D3b (universal_edges) deferred Pilot: DIEU-35 Mode (when dispatched): MIGRATION — writes allowed under transaction guards; per-publication COMMIT DB target: VPS PostgreSQL (postgres container, db=directus, schema=public)
0. SCOPE BLOCK (no other literal identifiers as selectors)
# Target family (Pack 1)
target_collection_primary = 'information_unit'
target_collection_subordinate = 'unit_version'
target_unit_kind = 'law_unit'
target_vocab_key = 'vocab.unit_kind.law_unit'
# Source family (TAC) — read-only throughout
source_table_logical = 'tac_logical_unit'
source_table_version = 'tac_unit_version'
source_table_publication = 'tac_publication'
source_table_pub_member = 'tac_publication_member'
# Pilot publication
pilot_doc_code = 'DIEU-35' # design choice; live re-verify exists in tac_publication
# Species (PROPOSAL — locked by GPT/User in §10)
proposed_species_code = '<DECIDED_BY_GPT>' # Opus proposal: 'information_unit_atom'
proposed_species_entity_code = '<DECIDED_BY_GPT>' # Opus proposal: 'SPE-IUA'
proposed_composition_level = 'atom' # Opus rationale: D3 hybrid + Điều 0-B literal
proposed_species_management = 'observed' # matches IU governance_role
All column names, counts, vocab keys, species codes, composition values → derive live via the semantic field registry. No hardcoded selector outside the SCOPE BLOCK above.
1. GATE-0 — Read-only existence checks (must pass before any write)
GATE-0.A All Pack 1 tables exist (Phase 5A P1–P8)
GATE-0.B All Pack 1 functions exist (Phase 5A P9–P14)
GATE-0.C Semantic field registry (rev6, 31 concepts) loaded and resolution map complete
GATE-0.D Source row counts and per-publication metrics for DIEU-35 match Phase 5A evidence:
tac_logical_unit total = 86; DIEU-35 members = 36; section_types = 12; depth = 2
(Live re-count, not cached.)
GATE-0.E Canonical address collision count for DIEU-35 = 0 (re-verified live).
GATE-0.F No write/mutation has occurred up to this point.
If any GATE-0 check fails → ABORT, report, return to GPT/User.
2. PREFLIGHT (still read-only; precedes Step 1 of each phase action)
PF-1 §F species seed prerequisites (BEFORE Step 1):
entity_species does NOT yet contain proposed_species_entity_code (no pre-existing collision)
species_collection_map has 0 rows for collection_name = target_collection_primary
PF-2 §G backfill prerequisites (BEFORE Step 2):
Step 1 (species seed) COMMITTED;
species_collection_map has exactly 1 is_primary=true row for target_collection_primary;
birth_registry rows for target_collection_primary with species_code IS NULL = 12 (live count).
PF-3 Pilot migration prerequisites (BEFORE Step 3):
Step 2 (backfill) COMMITTED;
birth_registry rows with species_code IS NULL for target_collection_primary = 0;
vocab.section_type.* exists for each of the 12 distinct section_types in DIEU-35 (live);
vocab.publication_type.law exists in dot_config;
vocab.unit_kind.law_unit exists (Phase 4 confirmed);
fn_iu_create signature matches Phase 5A G8 (9 args including p_parent_ref uuid).
PF-4 Identity_profile dry-construct test:
Build one sample identity_profile JSON for DIEU-35's first row (render_order=0).
Confirm the 5 birth-gate-required keys are present
(title, owner_lookup_ref, primary_section_type_ref, publication_authority_ref, publication_type_ref).
No INSERT; pure jsonb_build_object test.
PF-5 No-collision re-verify (live, immediately before Step 3):
SELECT count(*) FROM tac_logical_unit lu
JOIN tac_publication_member pm ON pm.<logical_unit_ref column> = lu.id
JOIN tac_publication p ON p.id = pm.<publication_ref column>
JOIN information_unit iu ON iu.<address_field> = lu.<address_field>
WHERE p.<doc_code column> = pilot_doc_code;
Expected: 0.
(All column names from resolved field map, not hardcoded.)
If any PF-* fails → ABORT, report, do NOT proceed to the corresponding step.
3. STEP 1 — Species + mapping seed (QT-005 step 1–3) — requires §10 placeholder resolution
GOAL: Create the new species <proposed_species_entity_code> and primary mapping for target_collection_primary, so that subsequent IU INSERTs auto-assign species. No row of information_unit is touched in this step.
Operations (metadata-only, no DDL):
1. INSERT INTO entity_species
(species_code, display_name, composition_level, management_mode, status, depth, parent_id, prefix, kg_metadata, ...)
VALUES
(<decided_by_GPT>, <decided_by_GPT>, proposed_composition_level, proposed_species_management, 'active', <decided_by_GPT>, <decided_by_GPT>, <decided_by_GPT>, <decided_by_GPT>, ...)
RETURNING id; -- capture for rollback
2. INSERT INTO species_collection_map
(collection_name, species_code, is_primary, discriminator_field, discriminator_value, discriminator_operator, discriminator_config, ...)
VALUES
(target_collection_primary, <proposed_species_entity_code or species_code variant>, true, NULL, NULL, NULL, NULL, ...)
RETURNING id; -- capture for rollback
3. (Optional, per §F.3 of design and §10 placeholder #2.) Decide whether to seed species_collection_map row for target_collection_subordinate.
Opus default: do NOT seed; UV is subordinate with 0 birth rows.
PASS criteria for Step 1:
- Both INSERTs succeed.
- SELECT count(*) FROM species_collection_map WHERE collection_name = target_collection_primary AND is_primary = true → 1.
- No row in information_unit is modified.
Rollback for Step 1: DELETE rows by captured ids. See §6.
4. STEP 2 — QT-001 backfill of 12 existing IU birth_registry rows
GOAL: Backfill species_code and composition_level for the 12 existing IU birth records that were stranded NULL before §F seed.
Operations (metadata-only):
Count check (live):
SELECT count(*) FROM birth_registry
WHERE collection_name = target_collection_primary AND species_code IS NULL;
Expected: 12.
Backfill:
Prefer DOT tool if present: dot-birth-backfill --collection=information_unit
Fallback governed SQL:
WITH backfill AS (
UPDATE birth_registry
SET species_code = <decided_by_GPT>,
composition_level = proposed_composition_level
WHERE collection_name = target_collection_primary
AND species_code IS NULL
RETURNING entity_code
)
INSERT INTO <a write-once rollback capture table or transient artifact>
SELECT entity_code FROM backfill;
-- The RETURNING capture is mandatory for exact-key rollback.
PASS criteria for Step 2:
- Backfill RETURNING captured 12 exact entity_code values.
- Re-count: 0 NULL species rows for target_collection_primary.
- No row in information_unit is modified.
- Did NOT touch any row outside collection_name = target_collection_primary.
Rollback for Step 2: UPDATE the exact captured entity_codes back to NULL. See §6.
5. STEP 3 — DIEU-35 pilot migration (36 IU + 36 UV rows)
GOAL: Migrate all 36 logical units belonging to pilot_doc_code into native IU/UV rows under D3 hybrid (atomic IU rows, hierarchy in identity_profile JSON, parent_or_container_ref NULL).
TRANSACTION (single COMMIT or single ROLLBACK):
BEGIN;
-- Build source set from live JOIN; ORDER BY render_order ASC.
-- All column names from resolved field map. Sample column names below are illustrative only.
FOR each source_row in (
SELECT lu.id AS tac_lu_id,
lu.<address_field> AS canonical_address,
lu.<section_type col> AS section_type,
lu.<sort_order col> AS sort_order,
lu.<owner column> AS owner_text,
lu.<parent_ref col> AS tac_parent_id,
uv.id AS tac_uv_id,
uv.<title col> AS title,
uv.<body col> AS body,
uv.<description col> AS description,
uv.<content_hash col> AS tac_v1_hash,
pm.<publication_render_order col> AS render_order,
p.id AS tac_pub_id,
p.<doc_code col> AS doc_code,
p.<publication_type col> AS publication_type
FROM tac_logical_unit lu
JOIN tac_unit_version uv ON uv.<lu_fk col> = lu.id
JOIN tac_publication_member pm ON pm.<logical_unit_ref> = lu.id
JOIN tac_publication p ON p.id = pm.<publication_ref>
WHERE p.<doc_code col> = pilot_doc_code
ORDER BY pm.<publication_render_order col> ASC
) LOOP
-- Step A: build identity_profile JSON
v_parent_canon := (
SELECT lu2.<address_field>
FROM tac_logical_unit lu2
WHERE lu2.id = source_row.tac_parent_id
); -- NULL for roots
v_identity_profile := jsonb_build_object(
'title' , source_row.title,
'owner_lookup_ref' , source_row.owner_text,
'primary_section_type_ref' , 'vocab.section_type.' || source_row.section_type,
'publication_authority_ref' , <DECIDED_BY_GPT — placeholder #4>,
'publication_type_ref' , 'vocab.publication_type.' || source_row.publication_type,
'tac_parent_anchor' , CASE WHEN source_row.tac_parent_id IS NULL THEN NULL ELSE source_row.tac_parent_id::text END,
'tac_parent_canonical_address', v_parent_canon,
'publication_anchor' , source_row.doc_code,
'publication_render_order' , source_row.render_order,
'tac_sort_order' , source_row.sort_order
);
-- Step B: call fn_iu_create (signature from Phase 5A G8)
v_result := fn_iu_create(
p_canonical_address := source_row.canonical_address,
p_title := source_row.title,
p_body := source_row.body,
p_actor := 'phase5c_migration_' || pilot_doc_code,
p_unit_kind := target_unit_kind,
p_section_type := source_row.section_type,
p_owner_ref := source_row.owner_text,
p_publication_type := source_row.publication_type,
p_parent_ref := NULL -- D3 hybrid: no parent containment at row level
);
-- v_result jsonb shape: live-verify on first row; do NOT assume key names
-- Step C: merge identity_profile (override hierarchy keys; preserve birth-gate keys fn_iu_create set)
UPDATE information_unit
SET identity_profile = COALESCE(identity_profile, '{}'::jsonb) || v_identity_profile
WHERE id = (v_result->>'<iu id key from live shape>')::uuid;
-- Step D: patch unit_version.content_profile with TAC provenance (Phase 4B addendum §7)
UPDATE unit_version
SET content_profile = COALESCE(content_profile, '{}'::jsonb) || jsonb_build_object(
'source_hashes', jsonb_build_object('tac_v1', source_row.tac_v1_hash),
'source_refs', jsonb_build_object(
'tac_logical_unit_id', source_row.tac_lu_id,
'tac_unit_version_id', source_row.tac_uv_id,
'tac_publication_id' , source_row.tac_pub_id,
'tac_doc_code' , source_row.doc_code,
'tac_render_order' , source_row.render_order
)
)
WHERE id = (v_result->>'<uv id key from live shape>')::uuid;
-- Step E: capture inserted keys for rollback
v_rollback_iu := array_append(v_rollback_iu, (v_result->>'<iu id key>')::uuid);
v_rollback_uv := array_append(v_rollback_uv, (v_result->>'<uv id key>')::uuid);
END LOOP;
-- Step F: in-transaction assertions (must all PASS or ROLLBACK)
ASSERT (SELECT count(*) FROM information_unit
WHERE (identity_profile->>'publication_anchor') = pilot_doc_code) = 36;
ASSERT (SELECT count(DISTINCT (identity_profile->>'publication_render_order')::int)
FROM information_unit
WHERE (identity_profile->>'publication_anchor') = pilot_doc_code) = 36;
ASSERT (SELECT count(*) FROM birth_registry br
WHERE br.collection_name = target_collection_primary
AND br.entity_code = ANY (v_rollback_iu::text[])
AND br.species_code IS NULL) = 0;
ASSERT (SELECT count(*) FROM unnest(v_rollback_iu) AS x(id)
WHERE (fn_iu_verify_invariants(
(SELECT canonical_address FROM information_unit WHERE id = x.id)
)->>'status') <> 'OK') = 0;
ASSERT (SELECT count(*) FROM unit_version uv
WHERE uv.id = ANY (v_rollback_uv)
AND uv.<content_hash col> <> fn_content_hash(uv.<body col>)) = 0;
-- Persist captured rollback lists to an artifact path before COMMIT
-- (e.g., write to /tmp or to a control table; design choice for executable prompt finalization)
COMMIT;
PASS criteria for Step 3:
- All 5 in-transaction assertions PASS.
- Captured rollback lists persisted to a known location.
- TAC source untouched (re-verify counts on tac_* tables match pre-step counts).
6. STEP 4 — Render fidelity check (post-COMMIT, read-only)
RF-1 Reconstruct DIEU-35 tree from migrated IU rows by self-joining on
identity_profile->>'tac_parent_canonical_address'. Compare structure
(parent → child pairs) with TAC source tree for DIEU-35.
Expected drift: 0 mismatched pairs, 0 missing leaves, 0 extra leaves.
RF-2 Body comparison: for each migrated row, compare unit_version.body
to the corresponding tac_unit_version.body. Expected: 100% match
(no transformation should have altered body text).
RF-3 render_order range: SELECT min, max, count(DISTINCT render_order)
from migrated set. Expected: min=0, max=35, count=36.
RF-4 Publication ordering: SELECT canonical_address FROM information_unit
WHERE publication_anchor = pilot_doc_code ORDER BY publication_render_order;
Compare against the same query on TAC source.
Expected: identical address sequence.
If RF-1..4 PASS → pilot accepted; report; await GPT/User confirmation.
If any RF fails → execute post-COMMIT rollback (§7.B).
7. ROLLBACK MODEL
7.A In-transaction rollback (Step 3)
Any in-transaction assertion failure → automatic ROLLBACK. No further action needed; state returns to pre-Step-3.
7.B Post-COMMIT rollback (RF-* failure or GPT/User reject)
BEGIN;
DELETE FROM birth_registry WHERE collection_name = target_collection_primary AND entity_code = ANY (<captured iu_id list>::text[]);
DELETE FROM unit_version WHERE id = ANY (<captured uv_id list>);
DELETE FROM information_unit WHERE id = ANY (<captured iu_id list>);
COMMIT;
Pattern-matching rollback (e.g., WHERE publication_anchor='DIEU-35') is PROHIBITED — defense in depth against accidental over-deletion.
7.C Backfill rollback (§G.5 of design)
UPDATE birth_registry SET species_code = NULL, composition_level = NULL
WHERE collection_name = target_collection_primary AND entity_code = ANY (<captured backfill entity_code list>);
7.D Species seed rollback (§F.3 of design)
DELETE FROM species_collection_map WHERE id = <captured Step 1.2 id>;
DELETE FROM entity_species WHERE id = <captured Step 1.1 id>;
Order matters: mapping first, then species.
7.E TAC source
Never touched. No DELETE/UPDATE on tac_* tables at any point.
8. RÀNG BUỘC LUẬT (binding)
Điều 0-B composition_level = atom under D3 hybrid (literal "không chứa entity khác");
see open question M.2 in design re: UV-as-contained interpretation.
Điều 0-G species_code auto-fills via species_collection_map; this prompt seeds the mapping
BEFORE pilot migration so all 36 new IU rows auto-assign at birth.
Điều 29 "One classification system"; new species + mapping + backfill done in this single pack.
QT-001 5-step backfill procedure followed (§2 PF-2, §4).
QT-005 If new species created: §3 Step 1 follows QT-005 step 1 (species creation);
steps 2–4 (mapping, backfill, verify) covered in this prompt.
Step 5 (governance_role promotion) DEFERRED — keep observed for pilot.
Điều 44 §A No code change; all operations metadata-driven.
§FIX_GỐC New species creation is gốc-fix for IU species gap, not patch.
§0-AU No hardcode; all selectors via semantic field registry + SCOPE BLOCK only.
§0-AF Evidence captured for production-style verification. Report includes live counts.
9. REPORT REQUIREMENTS (when dispatched, future)
Save to:
knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c-dieu35-pilot-migration-report.md
Required sections:
A. Status flags (PASS/PARTIAL/BLOCKED, written + rolled-back booleans, per-step pass/fail)
B. GATE-0 + PF-* live evidence
C. Step 1 species seed (entity_species id, species_collection_map id)
D. Step 2 backfill (12 entity_code list, before/after counts)
E. Step 3 pilot (36 iu_id list, 36 uv_id list, render_order range, assertion results)
F. Step 4 render fidelity (RF-1..4 results)
G. Live re-verify of source unchanged (TAC table row counts)
H. Open issues / deferred work
I. Rollback artifact path (captured key lists)
Status flag template:
phase5c_status = PASS | PARTIAL | BLOCKED
step1_species_seed = COMMITTED | NOT_RUN | ROLLED_BACK
step2_backfill_12_rows = COMMITTED | NOT_RUN | ROLLED_BACK
step3_pilot_dieu35 = COMMITTED | NOT_RUN | ROLLED_BACK
step4_render_fidelity = PASS | FAIL | NOT_RUN
no_mutation_outside_targets = true | false
tac_source_untouched = true | false
rollback_keys_persisted = true | false
post_implementation_design_due = true
10. PLACEHOLDERS REQUIRING GPT/USER RESOLUTION BEFORE DISPATCH
The following decisions are NOT inside this DRAFT and must be locked before the prompt becomes dispatch-ready. The first dispatch-ready revision (rev1) shall fill these in based on GPT/User direction.
#1 Species naming
- species_code (Opus proposal: 'information_unit_atom')
- entity_species code/id (Opus proposal: 'SPE-IUA')
- display_name (Vietnamese)
- prefix
- depth + parent_id in taxonomy tree
- kg_metadata
Status: PENDING GPT/USER
#2 UV species_collection_map seed (yes/no)
Opus default: NO (Phase 4D recommendation G1)
Status: PENDING GPT/USER CONFIRMATION
#3 governance_role promotion now vs defer
Opus default: DEFER (keep observed for pilot)
Status: PENDING GPT/USER CONFIRMATION
#4 identity_profile.publication_authority_ref value
The vocab key for "publication_authority_ref" — TAC publications don't have
an obvious authority field. Need fixed value (e.g., 'vocab.publication_authority.incomex')
or live lookup mechanism.
Status: PENDING GPT/USER CHOICE
#5 Rollback capture persistence target
Option A: write to /tmp on VPS with timestamped filename
Option B: insert into a transient control table (e.g., migration_rollback_capture)
Option C: agent writes to KB artifact path
Status: PENDING GPT/USER CHOICE
#6 D3b universal_edges enrichment (in this pack or deferred)
Opus default: DEFER to Phase 5E (after pilot verified)
Status: PENDING GPT/USER CONFIRMATION
#7 IU publication_member table design (in this pack or deferred)
Opus default: DEFER (use identity_profile.publication_anchor + publication_render_order for pilot/Phase 5D)
Status: PENDING GPT/USER CONFIRMATION
11. POST-IMPLEMENTATION DESIGN REQUIREMENT
After Agent submits the Phase 5C report and GPT/User accepts it, the assigned drafter (per operating-notes/design-after-repair-implementation-rule-2026-05-11.md) writes the post-implementation design at:
knowledge/dev/laws/dieu44-trien-khai/design/p3d-pack1-phase5c-post-implementation-design-dieu35-pilot-migration-<date>.md
Sections A–G per the operating note. Opus default assignee unless GPT directs otherwise.
12. NO-HARDCODE DISCIPLINE (carried from Phase 5A)
Allowed hardcoded selectors:
- target_collection_primary, target_collection_subordinate, target_unit_kind
- source_table_logical, source_table_version, source_table_publication, source_table_pub_member
- pilot_doc_code (= 'DIEU-35') — design choice; live re-verify exists
- target_vocab_key (= 'vocab.unit_kind.law_unit')
- proposed_* (species code/composition) only AFTER §10 resolution; pre-resolution they are placeholders
Forbidden:
- column names outside the rev6 semantic field registry resolution
- row counts as gates (counts are evidence, not selectors)
- section_type literal values outside live SELECT
- fuzzy matchers ('like', 'or similar', 'equivalent')
- assumptions about jsonb shape returned by fn_iu_create (live-verify first row)
- any selector that says "the only existing X" without live count proof
13. SAFETY SUMMARY
Read-only checks done before any write: GATE-0 + PF-1..5 + dry-construct identity_profile (PF-4).
First write: §3 Step 1 species + mapping. Two rows. Reversible.
Second write: §4 Step 2 backfill 12 birth rows. Reversible.
Third write: §5 Step 3 pilot transaction. 36 IU + 36 UV + auto birth rows. Reversible.
TAC source: NEVER written.
Rollback keys: captured at every write step; pattern-matching rollback PROHIBITED.
Post-COMMIT rollback path: explicit (§7.B).
Render fidelity check: read-only (§6).
Phase 5C Execution Prompt — DRAFT | D3 hybrid (D3a) | Pilot DIEU-35 | QT-005 + QT-001 + Pilot migration | NOT DISPATCH-READY (§10 placeholders pending) | 2026-05-11