KB-77E8

P3D Pack 1 Phase 4 — Governed Migration Readiness + No-Hardcode Design

Revision 1

Date: 2026-05-11 | Author: Opus 4.7 | Mode: DESIGN ONLY


§A. Phase 3 Blocker Summary

| Blocker | Finding | Status |
|---|---|---|
| Hash divergence | TAC=sha256(title\|body\|desc\|profile), IU=sha256(body) | DECISION NEEDED |
| Vocab empty | IU has 1 unit_kind, 1 section_type, 1 pub_type vs TAC has N each | SEED NEEDED |
| Species not wired | species_collection_map=0 for IU/UV | SEED OR DEFER |
| Composition absent | parent_or_container_ref=NULL, edges=0 | STRATEGY NEEDED |
| Governance pilot | collection_registry=observed/pilot | PROMOTE DECISION |

§B. No-Hardcode Contract

Binding rule for all future prompts/code/design in this workstream:

  1. NO fixed counts: Never write "17 section types" or "10 pub types" as implementation constants. Always: SELECT DISTINCT ... FROM live_table.
  2. NO memory-typed lists: Never type vocab values from memory. Always: SELECT code FROM vocab_table WHERE lifecycle_status='active'.
  3. Discovery-first pattern: Before any seed/migration, run discovery query → compare with target → generate delta → execute delta.
  4. Snapshot disclaimer: Any count cited in design is "snapshot at [date]" — must re-verify before gating.
  5. Registry-driven: If a behavior depends on "how many species" or "how many layers", it must read from registry at runtime, not from code constants.

Verification: Every implementation prompt must include a preflight that discovers live values and compares with expected. If delta > 0, STOP and report — don't assume.
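
A preflight of this kind could look like the following minimal sketch. It uses an in-memory SQLite table as a stand-in for a live vocab table; the `preflight` helper, the `PreflightStop` exception, and the `demo_*` codes are hypothetical names introduced only for illustration.

```python
import sqlite3


class PreflightStop(Exception):
    """Raised when live values differ from what the prompt expected."""


def preflight(conn, query, expected):
    """Run a discovery query and compare the live values with `expected`."""
    live = {row[0] for row in conn.execute(query)}
    missing = expected - live      # expected, but not present live
    unexpected = live - expected   # present live, but not anticipated
    if missing or unexpected:
        # Any delta stops the run; nothing is assumed or auto-corrected.
        raise PreflightStop(
            f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )
    return live


# Stand-in for a live vocab table (demo data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vocab_table (code TEXT PRIMARY KEY, lifecycle_status TEXT)"
)
conn.executemany(
    "INSERT INTO vocab_table VALUES (?, ?)",
    [("demo_a", "active"), ("demo_b", "active"), ("demo_old", "retired")],
)

QUERY = "SELECT code FROM vocab_table WHERE lifecycle_status='active'"
live = preflight(conn, QUERY, expected={"demo_a", "demo_b"})
```

In practice the `expected` set would itself come from an earlier reviewed discovery snapshot, not from a list typed into the prompt.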

§C. Hash Reconciliation — Recommendation: RECOMPUTE + PROVENANCE

| Aspect | Decision |
|---|---|
| IU rule remains: sha256(body::UTF8) | Keep: clean, body-only, fn_content_hash IMMUTABLE |
| TAC hash on migration: recompute under IU rule | Migrated rows get new content_hash = sha256(body) |
| TAC original hash: store as provenance | Save in content_profile.source_hash for traceability |
| Drift verification post-migration: use IU rule | Compare content_hash = fn_content_hash(body); unified, no false alarms |
| Round-trip verification: body text comparison | Same P10A/P10B method: compare assembled text, not hash |

Why not preserve TAC hash: Mixed hash rules in one table mean every future drift check must know which rule each row was born under. That violates §0-AU (no hardcode) and complicates DOT checking.

Why store source_hash: Provenance. If someone asks "did the content change during migration?", compare the stored source_hash with the hash on the pre-migration TAC row.
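
The recompute-plus-provenance policy can be sketched in a few lines. This is an illustration only: it assumes fn_content_hash returns a hex-encoded SHA-256 digest, represents rows as plain dicts, and uses the hypothetical helper names `iu_content_hash`, `migrate_row`, and `has_drift`.

```python
import hashlib


def iu_content_hash(body: str) -> str:
    """IU rule: sha256 over the UTF-8 body only (hex output is an assumption)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def migrate_row(tac_row: dict) -> dict:
    """Recompute the hash under the IU rule; keep the TAC hash as provenance."""
    return {
        "body": tac_row["body"],
        "content_hash": iu_content_hash(tac_row["body"]),
        "content_profile": {"source_hash": tac_row["content_hash"]},
    }


def has_drift(iu_row: dict) -> bool:
    """Single drift rule for every row, regardless of where it was born."""
    return iu_row["content_hash"] != iu_content_hash(iu_row["body"])


# Demo row; the TAC hash value is a placeholder, not a real digest.
tac_row = {"body": "Sample body text", "content_hash": "tac-hash-placeholder"}
iu_row = migrate_row(tac_row)
```

Because every migrated row carries a hash computed the same way, `has_drift` needs no per-row knowledge of the original TAC rule; the TAC value survives only as `source_hash`.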

§D. Vocab Seeding — Discovery-First, No Hardcode

Pattern: Agent runs discovery query → generates INSERT statements → GPT reviews → agent executes.

Discovery queries (Phase 4 read-only prompt):

-- D1: Live TAC section_type values actually used
SELECT DISTINCT section_type FROM tac_logical_unit ORDER BY section_type;

-- D2: TAC section_type_vocab (may have more than used)
SELECT code, name, lifecycle_status FROM tac_section_type_vocab WHERE lifecycle_status='active' ORDER BY code;

-- D3: Live TAC publication_type values actually used
SELECT DISTINCT publication_type FROM tac_publication ORDER BY publication_type;

-- D4: TAC publication_type_vocab
SELECT code, name FROM tac_publication_type_vocab WHERE lifecycle_status='active' ORDER BY code;

-- D5: Current IU vocab in dot_config
SELECT key, value FROM dot_config WHERE key LIKE 'vocab.%' ORDER BY key;

-- D6: Delta = TAC values not in IU vocab
-- (computed by agent after D1-D5 results)
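
One way the agent might compute the D6 delta for section types is sketched below, using in-memory SQLite tables as stand-ins for tac_logical_unit and dot_config. The `vocab.section_type.` key prefix is an assumption extrapolated from the `vocab.%` pattern in D5, and the `demo_*` values are placeholder data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tac_logical_unit (id INTEGER PRIMARY KEY, section_type TEXT);
CREATE TABLE dot_config (key TEXT PRIMARY KEY, value TEXT);
INSERT INTO tac_logical_unit (section_type)
    VALUES ('demo_a'), ('demo_b'), ('demo_a');
INSERT INTO dot_config VALUES ('vocab.section_type.demo_a', 'demo_a');
""")


def section_type_delta(conn, prefix="vocab.section_type."):
    """Return TAC section types that have no matching IU vocab entry."""
    # D1: values actually used in TAC (NULLs skipped).
    used = {
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT section_type FROM tac_logical_unit"
        )
        if row[0] is not None
    }
    # D5, narrowed to one vocab family via the assumed key prefix.
    seeded = {
        row[0]
        for row in conn.execute(
            "SELECT value FROM dot_config WHERE key LIKE ?", (prefix + "%",)
        )
    }
    return sorted(used - seeded)


delta = section_type_delta(conn)
```

With the demo data, only `demo_b` is used in TAC but absent from the IU vocab, so it is the single entry in `delta`.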

Seed generation (implementation prompt):

Agent computes delta from D1-D5 results → generates:

INSERT INTO dot_config (key, value, ...) VALUES
  ('vocab.unit_kind.law_unit', 'law_unit', ...),
  -- ... remaining from discovery delta
ON CONFLICT DO NOTHING;

No values typed from memory. All from live query results.
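
A hedged sketch of generating the seed from a delta rather than from typed values follows. The stand-in dot_config table has only `key` and `value` (the real column list is elided above and is not reproduced here), the `vocab.<kind>.<code>` key shape is inferred from the single example row, and `seed_vocab` is a hypothetical helper.

```python
import sqlite3


def seed_vocab(conn, kind, codes):
    """Insert one dot_config row per discovered code; existing keys are kept."""
    rows = [(f"vocab.{kind}.{code}", code) for code in codes]
    conn.executemany(
        "INSERT INTO dot_config (key, value) VALUES (?, ?) "
        "ON CONFLICT (key) DO NOTHING",
        rows,
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dot_config (key TEXT PRIMARY KEY, value TEXT)")
conn.execute(
    "INSERT INTO dot_config VALUES ('vocab.unit_kind.law_unit', 'law_unit')"
)

# Demo stand-in: a real delta would be the output of the discovery step.
delta = ["law_unit", "demo_kind"]
seed_vocab(conn, "unit_kind", delta)
seed_vocab(conn, "unit_kind", delta)  # re-running changes nothing
count = conn.execute("SELECT count(*) FROM dot_config").fetchone()[0]
```

Binding the values as parameters keeps the statement text constant, so the reviewer only needs to check the delta itself.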

§E. Species/Composition/Governance Policy

Species

| Decision | Rationale |
|---|---|
| Seed species_collection_map for IU/UV | 2 rows. Low effort. Birth_registry will auto-populate species_code for new births. |
| Species_code for IU: discover from entity_species | Query: SELECT species_code FROM entity_species WHERE ... to find an appropriate species, or create a new one if none fits |
| Do NOT enforce species at birth gate | Keep Tier-0 birth gate. Species = DOT enrichment (Tier 2). Consistent with P38-XC §4. |

Composition

| Decision | Rationale |
|---|---|
| Map parent_id → parent_or_container_ref directly during migration | Column mapping, not new structure. TAC hierarchy preserved. |
| Do NOT materialize universal_edges at migration time | Edge materialization = DOT Tier 1. Migration just copies parent pointer. |
| Do NOT enforce parent at birth gate | Keep Tier-0 birth gate. Parent = optional (NULL valid for root units). |
| composition_role in identity_profile: set based on parent_id | Has children → 'molecular'. No children → 'atomic'. Discovery query determines. |
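
The composition_role rule in the last row could be derived with a query along these lines. This is a sketch against an in-memory stand-in table with only `id` and `parent_id`; the real tac_logical_unit has more columns, and the row ids are demo data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tac_logical_unit (id TEXT PRIMARY KEY, parent_id TEXT);
INSERT INTO tac_logical_unit VALUES
    ('root', NULL), ('child-1', 'root'), ('child-2', 'root');
""")

# A unit is 'molecular' if any other row points at it as parent, else 'atomic'.
ROLE_QUERY = """
SELECT u.id,
       CASE WHEN EXISTS (
                SELECT 1 FROM tac_logical_unit c WHERE c.parent_id = u.id
            )
            THEN 'molecular' ELSE 'atomic'
       END AS composition_role
FROM tac_logical_unit u
ORDER BY u.id
"""
roles = dict(conn.execute(ROLE_QUERY).fetchall())
```

The role is computed from the live parent pointers at discovery time, so no list of "container" units needs to be maintained by hand.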

Governance Promotion

| Decision | Rationale |
|---|---|
| After pilot migration PASS: promote IU from observed→governed, pilot→active | Staged. Don't promote before proving migration works. |
| Update collection_registry: governance_role, migration_state, species_code | Single UPDATE after pilot verify. |

§F. Parent-Child / Edge Policy

| Item | Strategy |
|---|---|
| TAC parent_id → IU parent_or_container_ref | Direct column map during migration INSERT |
| TAC sort_order → IU sort_order | Direct column map |
| universal_edges contains | DEFER: DOT enrichment post-migration. Not at migration time. |
| Hierarchy depth | DISCOVER from TAC: WITH RECURSIVE ... SELECT max(depth); no assumed "6 levels" |
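
Hierarchy depth discovery can be sketched with a recursive CTE. The example runs against an in-memory stand-in for tac_logical_unit and counts root units as depth 1, which is a convention chosen here for illustration, not something the design specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tac_logical_unit (id TEXT PRIMARY KEY, parent_id TEXT);
INSERT INTO tac_logical_unit VALUES
    ('a', NULL), ('b', 'a'), ('c', 'b'), ('d', 'a');
""")

# Walk down from the roots, carrying the depth of each visited unit.
DEPTH_QUERY = """
WITH RECURSIVE tree(id, depth) AS (
    SELECT id, 1 FROM tac_logical_unit WHERE parent_id IS NULL
    UNION ALL
    SELECT u.id, t.depth + 1
    FROM tac_logical_unit u
    JOIN tree t ON u.parent_id = t.id
)
SELECT max(depth) FROM tree
"""
max_depth = conn.execute(DEPTH_QUERY).fetchone()[0]
```

In the demo tree the longest chain is a → b → c, so the query reports three levels. Rows whose parent_id points at a missing unit are never reached by the walk, which makes a row-count comparison between `tree` and the base table a useful extra check.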

§G. Phase 4 Allowed Implementation Sequence

If GPT approves this design:

| Step | What | Scope | Gate |
|---|---|---|---|
| 4.0 | Discovery prompt (read-only) | Live query all vocab/species/registry/hierarchy | No mutation |
| 4.1 | Vocab seed | INSERT dot_config from discovery delta | GPT review discovery results |
| 4.2 | Species seed | INSERT species_collection_map 2 rows for IU/UV | GPT review |
| 4.3 | Governance prepare (no promote yet) | Document promote criteria | Design only |
| 4.4 | Pilot migration design | 1 doc (~23-36 rows) migration prompt DRAFT | GPT review |

Phase 5: Pilot migration execution → verify → promote → full migration → view creation → Nuxt adapt.

§H. Phase 5 Migration Gates (all must PASS before full migration)

| # | Gate | Verification |
|---|---|---|
| G1 | Vocab seeded: all TAC-used values present in IU dot_config | Discovery re-verify |
| G2 | Species_collection_map has IU/UV entries | SELECT count |
| G3 | Hash policy implemented (recompute + source_hash provenance) | Migration script review |
| G4 | Pilot: 1 doc migrated, round-trip 0 drift | P10A/P10B method |
| G5 | Pilot: publication_member FK valid after migration | SELECT verify |
| G6 | Pilot: gateway protects migrated rows | Test INSERT without marker → blocked |
| G7 | Pilot: birth_registry has entries for migrated rows | SELECT count |
| G8 | Pilot: parent_or_container_ref mapped correctly | Tree query verify |
| G9 | Pilot: rollback tested (DELETE migrated IU rows → TAC intact) | Rollback + TAC count verify |
| G10 | Governance: collection_registry promoted to governed/active | UPDATE after pilot PASS |
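
The "all must PASS" rule can be expressed as a small gate runner. This is a sketch only: each gate is modelled as a zero-argument callable, `failed_gates` and `may_run_full_migration` are hypothetical names, and the two demo gates are stubs rather than real checks.

```python
from typing import Callable, Dict, List


def failed_gates(gates: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every gate and return the ids that did not pass."""
    failed = []
    for gate_id, check in gates.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a gate that errors is treated as not passed
        if not ok:
            failed.append(gate_id)
    return failed


def may_run_full_migration(gates: Dict[str, Callable[[], bool]]) -> bool:
    """Full migration is allowed only when no gate failed."""
    return not failed_gates(gates)


# Stub gates for illustration; real ones would query the live database.
demo_gates = {
    "G2": lambda: True,   # stand-in for the species_collection_map count
    "G7": lambda: False,  # stand-in for a failing birth_registry count
}
blocked_by = failed_gates(demo_gates)
```

Running every gate before deciding, instead of stopping at the first failure, gives the reviewer the complete list of open items in one pass.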

§I. Risks, Rollback, Observability

| Risk | Mitigation |
|---|---|
| Vocab discovery misses edge case | Discovery queries cover both vocab tables AND live distinct values |
| Hash recompute loses TAC provenance | source_hash in content_profile preserves original |
| Migration creates duplicate canonical_address | Pre-check: 0 overlap confirmed (Phase 3 S11) |
| Parent_or_container_ref FK fails | TAC parent_id is self-ref within same table → after migration, same UUIDs exist in IU |
| Species seed wrong | Only 2 rows, easy to DELETE if wrong |

Rollback: Phase 4 is mostly vocab/species seed (small INSERTs). Rollback = DELETE those rows. Phase 5 pilot rollback = DELETE migrated IU/UV rows.
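
A pilot rollback with the G9 check (TAC intact afterwards) might be sketched as below. The IU table name `iu_unit_demo` and the `migration_batch` marker column are hypothetical; the design does not state how migrated rows are identified, so that part is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tac_logical_unit (id TEXT PRIMARY KEY, body TEXT);
CREATE TABLE iu_unit_demo (id TEXT PRIMARY KEY, body TEXT, migration_batch TEXT);
INSERT INTO tac_logical_unit VALUES ('u1', 'one'), ('u2', 'two');
INSERT INTO iu_unit_demo VALUES
    ('u1', 'one', 'pilot-demo'),
    ('u2', 'two', 'pilot-demo'),
    ('native', 'x', NULL);
""")


def tac_count(conn):
    return conn.execute("SELECT count(*) FROM tac_logical_unit").fetchone()[0]


def rollback_pilot(conn, batch):
    """Delete the migrated IU rows of one batch and confirm TAC is untouched."""
    before = tac_count(conn)
    with conn:  # commits on success, rolls back if an exception is raised
        deleted = conn.execute(
            "DELETE FROM iu_unit_demo WHERE migration_batch = ?", (batch,)
        ).rowcount
    if tac_count(conn) != before:
        raise RuntimeError("TAC row count changed during rollback")
    return deleted


deleted = rollback_pilot(conn, "pilot-demo")
remaining = conn.execute("SELECT count(*) FROM iu_unit_demo").fetchone()[0]
```

Rows that were born in IU (no batch marker) are left alone, so the rollback removes only what the pilot added.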

Observability: Canary test (dot-search-canary) unaffected. New canary cases for migration can be added after pilot.

§J. Recommendation

Proceed with Phase 4 discovery prompt → GPT reviews discovery results → seed vocab/species → design pilot migration prompt. No migration until Phase 5 gates G1-G10 all PASS.


Phase 4 Design | No-Hardcode | 2026-05-11

knowledge/dev/laws/dieu44-trien-khai/design/p3d-pack1-phase4-governed-migration-readiness-no-hardcode-design.md