P3D Pack 1 Phase 4 — Governance/Vocab Discovery Report
P3D Pack 1 Phase 4 — Governance/Vocab Discovery Report
Date: 2026-05-11 Executor: Claude Opus 4.7 (1M) agent Mode: READ-ONLY DISCOVERY (PGOPTIONS=-c default_transaction_read_only=on + SET confirmed
on) Source prompt: knowledge/dev/laws/dieu44-trien-khai/prompts/p3d-pack1-phase4-readonly-governance-vocab-discovery-prompt.md (rev2, GPT-approved 2026-05-11) Approval: knowledge/dev/laws/dieu44-trien-khai/reviews/gpt-final-approval-p3d-pack1-phase4-readonly-discovery-prompt-rev2-2026-05-11.md VPS: 38.242.240.89, container=postgres, db=directus, dbuser=directus Log: /tmp/p3d-phase4-discovery-20260511-064954.log (on VPS)
All numeric counts below are live snapshots from this discovery run on 2026-05-11, not constants.
Final response (machine-readable)
phase_status=PARTIAL
mode=READ_ONLY_DISCOVERY
no_mutation_performed=true
no_hardcode_compliance=true
pg_readonly_confirmed=true
vocab_delta_summary=14 missing codes vs TAC live use: 1 publication_type (law), 12 section_types (appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, paragraph, principle, process, technical_spec), 1 unit_kind (law_unit). All values are live-discovery candidates; no seed executed.
species_mapping_current_state=species_collection_map has 0 rows for information_unit/unit_version/any tac_*. entity_species has 40 rows including SPE-LAW (compound, governed, prefix=LAW) but no species_code is mapped to the IU/UV collections. Result: IU/UV born without species/composition; 12/12 IU birth_registry rows show species_code=NULL, composition_level=NULL.
collection_governance_current_state=information_unit (COL-176) and unit_version (COL-177) both governance_role=observed, migration_state=pilot, source_kind=native, species_code=NULL, description_policy=structured_exempt. No tac_* collection is in collection_registry at all. entity_labels has 0 rows for IU/UV. universal_edges has 0 rows referencing IU/UV.
hash_provenance_target=needs_design — no `source_hash` or `*_source_hash` column exists on information_unit or unit_version. Candidate jsonb carriers: unit_version.content_profile (jsonb) and unit_version.provenance (text). information_unit.identity_profile (jsonb) is the only IU profile column. No real field is yet reserved for TAC source-hash provenance; this must be design-locked before any TAC→IU migration.
seed_candidates_from_live_delta=[unit_kind: law_unit] + [publication_type: law] + [section_type: appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, paragraph, principle, process, technical_spec]. SUBJECT TO GPT/USER APPROVAL — not approved for execution.
implementation_allowed=false
next_recommended_action=Submit this discovery report to GPT/User for review. Request explicit approval of (a) the 14 vocab seed candidates above, (b) a species mapping strategy for information_unit/unit_version (seed species_collection_map vs accept NULL), (c) a parent_or_container_ref + composition_role policy for the TAC 3-root/83-child hierarchy snapshot, (d) a hash-provenance target field decision (extend unit_version.content_profile jsonb key vs new column vs traceability table). No Phase 5 migration until all four decisions land.
log_path=/tmp/p3d-phase4-discovery-20260511-064954.log
1. Read-only session confirmation
SET
default_transaction_read_only
-------------------------------
on
(1 row)
PGOPTIONS=-c default_transaction_read_only=on also passed at connection level for every docker exec call. No DDL/DML executed.
2. Schema introspection (I1, I2)
All 13 target tables exist as relkind='r'.
| table | exists | notes |
|---|---|---|
| birth_registry | yes | 19 cols incl. species_code, composition_level, governance_role |
| collection_registry | yes | 29 cols incl. governance_role, source_kind, migration_state, species_code |
| dot_config | yes | (key,value,description,updated_at) |
| entity_labels | yes | (id, entity_code, label_code, assigned_by, rule_id, assigned_at) |
| entity_species | yes | 18 cols incl. code, species_code, composition_level, management_mode, prefix |
| information_unit | yes | 19 cols. No source_hash column. identity_profile is jsonb |
| species_collection_map | yes | 13 cols incl. species_code, collection_name, is_primary, discriminator_* |
| tac_logical_unit | yes | 13 cols (id, canonical_address, doc_code, parent_id, sort_order, section_type, section_code, owner, identity_profile, tier, lifecycle_status, *_at) |
| tac_publication | yes | 15 cols incl. publication_type, publication_profile (jsonb) |
| tac_publication_member | yes | 6 cols |
| tac_unit_version | yes | 20 cols incl. content_hash, content_profile, provenance |
| unit_version | yes | 16 cols incl. content_hash, content_profile (jsonb), provenance (text) |
| universal_edges | yes | 25 cols incl. source_collection, target_collection, edge_type, metadata (jsonb), provenance (jsonb) |
3. Vocab discovery (V1–V7)
V1 — TAC section_type live usage on tac_logical_unit (snapshot 86 LU rows total)
section_type | used_count
-------------------+-----------
technical_spec | 19
governance_process | 16
paragraph | 14
heading | 10
principle | 8
process | 7
checklist | 6
changelog | 2
appendix | 1
article | 1
definition | 1
instruction_block | 1
12 distinct codes in live use.
V2 — tac_section_type_vocab (17 rows)
Codes: appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, invariant_list, matrix, open_decision_list, paragraph, principle, process, rationale, reference_mapping, technical_spec. All lifecycle_status=active.
V3 — TAC publication_type live usage on tac_publication
publication_type | used_count
-----------------+-----------
law | 3
V4 — tac_publication_type_vocab (10 rows)
Codes: constitution, design_note, draft, knowledge, law, memo, policy, report, sop, working. All lifecycle_status=active.
V5 — IU dot_config vocab snapshot
key | value
-----------------------------------+-------------------
vocab.publication_type.design_doc | design_doc
vocab.section_type.section | section
vocab.unit_kind.design_doc_section | design_doc_section
Only 3 vocab entries on the IU side. Matches Phase 3 finding.
V6 — Vocab delta candidates (live-computed, no seed executed)
delta_type | code
-------------------------------+--------------------
publication_type_missing_in_iu | law
section_type_missing_in_iu | appendix
section_type_missing_in_iu | article
section_type_missing_in_iu | changelog
section_type_missing_in_iu | checklist
section_type_missing_in_iu | definition
section_type_missing_in_iu | governance_process
section_type_missing_in_iu | heading
section_type_missing_in_iu | instruction_block
section_type_missing_in_iu | paragraph
section_type_missing_in_iu | principle
section_type_missing_in_iu | process
section_type_missing_in_iu | technical_spec
unit_kind_missing_in_iu | law_unit
Total = 14 missing vocab codes. law_unit appears here as a discovered requirement from the active migration design; it is not an approved constant. All 14 entries are seed candidates pending GPT/User approval, not approved mutations.
V7 — TAC lifecycle / review vocab snapshots
tac_lu_lifecycle_vocab: 3 rows (active, draft_only, retired)tac_uv_lifecycle_vocab: 4 rows (draft, enacted, retired, superseded)tac_pub_lifecycle_vocab: 4 rows (proposed, enacted, superseded, retired)tac_review_state_vocab: 5 rows (unreviewed, in_review, review_passed, review_failed, needs_re_review)
No equivalent vocab tables exist for IU/UV; IU/UV lifecycle is currently free-text columns gated by fn_iu_birth_gate_layer1 only.
4. Species / composition / governance discovery (S1–S5)
S1 — entity_species (40 rows snapshot)
Includes governed law-domain species:
| code | species_code | composition_level | management_mode | prefix |
|---|---|---|---|---|
| SPE-LAW | law | compound | governed | LAW |
| SPE-JUR | jurisdiction | atom | governed | JUR |
| SPE-ENF | law_enforcement | atom | governed | ENF |
| SPE-GRL | governance_relation | atom | governed | GRL |
| SPE-GAG | governance_agency | compound | governed | GOV |
| SPE-APP | approval_request | compound | governed | APR- |
No species row exists whose code or species_code is information_unit/unit_version. The IU/UV collections are not wired into the species graph.
S2 — collection_registry rows for target/source collections
id | code | name | governance_role | source_kind | migration_state | species_code | description_policy | birth_code_strategy
----+---------+------------------+-----------------+-------------+-----------------+--------------+---------------------+--------------------
184 | COL-176 | information_unit | observed | native | pilot | (NULL) | structured_exempt | synthetic_id
185 | COL-177 | unit_version | observed | native | pilot | (NULL) | structured_exempt | subordinate
No tac_* collection appears in collection_registry (search by jsonb text returned 0 tac_logical_unit / tac_unit_version / tac_publication rows).
S3 — species_collection_map: 0 rows
species_collection_map has 0 rows for information_unit, unit_version, tac_logical_unit, tac_unit_version, tac_publication. The map exists (153 rows total per Phase 3) but contains nothing for these collections.
S4 — birth_registry evidence
collection_name | rows | rows_with_species | rows_with_composition
-----------------+------+-------------------+----------------------
information_unit | 12 | 0 | 0
Zero rows for unit_version, tac_logical_unit, tac_unit_version, tac_publication, tac_publication_member. All 12 IU births registered without species/composition.
S5 — entity_labels for IU/UV: 0 rows
labels_for_iu_uv
-----------------
0
5. Parent-child hierarchy snapshot (P1, P2)
max_depth_snapshot | reachable_nodes | all_nodes | root_count | child_count
-------------------+-----------------+-----------+------------+------------
2 | 86 | 86 | 3 | 83
- All 86 logical units are reachable from the 3 roots (no orphans).
- Max depth = 2 snapshot — 3 roots → children → grandchildren.
orphan_parent_refs= 0 (every non-null parent_id resolves to an existing row).
Counts are a snapshot of the current TAC corpus only; they will change as new TAC rows are written.
6. universal_edges discovery (E1–E3)
E1 — 25 columns
Confirmed column names: source_collection, source_id, source_code, source_composition_level, target_collection, target_id, target_code, target_composition_level, edge_type, edge_subtype, weight, source_info, is_auto_managed, symmetry_group_id, metadata (jsonb), valid_from, valid_to, status, _dot_origin, date_created, date_updated, confidence, valid_time (tstzrange), provenance (jsonb), plus id.
(Older reports speculated about source_entity/source_ref style names — confirmed not present; the live schema uses source_collection/source_id/source_code triplets.)
E2 — edge_type distribution
edge_type | count
-----------+------
USES | 1486
BELONGS_TO | 431
CONTAINS | 282
Total = 2199 edges. (The DO-block notice probe surfaced a benign PL/pgSQL RAISE NOTICE quoting error which did not affect the subsequent SELECT; the distribution result above is valid.)
E3 — IU/UV edge text-search count
iu_uv_edges_text_search
-----------------------
0
Confirms Phase 3: no universal_edges rows reference information_unit or unit_version by any of the jsonb-serialised columns.
7. Hash provenance target discovery (H1)
table_name | column_name | data_type
-----------------+------------------+----------
information_unit | identity_profile | jsonb
unit_version | content_hash | text
unit_version | content_profile | jsonb
unit_version | provenance | text
- No column named
source_hash,tac_source_hash,origin_hash, or similar exists oninformation_unitorunit_version. unit_version.content_hashalready carries the SHA-256 of the IU body (per Phase 3 findingfn_content_hash). Reusing it for TAC source provenance would conflate two different hash semantics.- The two available jsonb carriers are
unit_version.content_profileand (less suitable)information_unit.identity_profile.unit_version.provenanceis a freetextcolumn. - The Phase 4 design proposal
content_profile.source_hashis design-only: no schema migration has reserved the key, no constraint enforces it, no trigger writes it. Result:hash_provenance_target=needs_design.
8. Critical blocker notes
- Vocab gap is the migration blocker, not a planner bug. IU vocab on the live system has 1 unit_kind / 1 section_type / 1 publication_type. To migrate the 86 TAC LU rows + 3 TAC publications without losing semantics, the IU side needs (at minimum) the 14 live-delta codes above seeded into
dot_config. None of these are approved for execution by this discovery run. - Species/composition pipeline is not wired for IU/UV.
species_collection_mapis empty for both COL-176 and COL-177; all 12 existing IU births carryspecies_code=NULL,composition_level=NULL. SPE-LAW (compound, governed, prefix=LAW) is available as a target species, but no mapping row links it toinformation_unit/unit_version. Migration must either seed the map or explicitly accept post-migration NULL species (with governance consequences). - TAC tables are entirely outside the governance universe.
tac_logical_unit,tac_unit_version,tac_publication,tac_publication_memberhave 0 rows incollection_registry, 0 rows inbirth_registry, 0 rows inspecies_collection_map, and 0 rows inentity_labels/universal_edges. TAC→IU is not a transparent rebadge; it requires explicit governance pipeline wiring on the IU side or a documented deferral. - Hash provenance target is not a real field. The design assumption
content_profile.source_hashis not yet a reserved jsonb key or column. Phase 5 must lock the target (column, jsonb key path, or new traceability table) before any TAC→IU body copy is performed, or the TAC source-hash continuity will be lost. - Parent-child policy required. TAC hierarchy snapshot is depth=2 with 3 roots / 83 children / 0 orphans. IU has
parent_or_container_ref(uuid) but Phase 3 confirmed it is NULL on all 12 existing IU rows and not enforced at birth. Phase 5 must decide: replicate the parent_id structure in IU, materialise asuniversal_edgesCONTAINS rows, both, or neither. - collection_registry promotion gate.
information_unit/unit_versionaregovernance_role=observed,migration_state=pilot. If Phase 5 migration is to be authoritative,governance_rolemust be elevated togovernedfirst — which currently is blocked by the species/composition/labels gaps above.
9. What is NOT in scope of this discovery
- No DDL was executed. No vocab seeds, no species_collection_map seeds, no collection_registry updates, no trigger/function changes, no Directus or Qdrant mutations, no DOT-119 execution.
- All counts in this report are snapshots taken on 2026-05-11 during the discovery run, not constants. They will change with system activity.
law_unitis reported as a proposed targetunit_kinddiscovered from the active migration design (Phase 3 blocker), not as an approved seed value. Implementation prompt (DRAFT) remains BLOCKED until GPT/User approve the seed set explicitly.
10. Logs and evidence
- Discovery log on VPS:
/tmp/p3d-phase4-discovery-20260511-064954.log - Script on VPS:
/tmp/p3d-phase4-discovery.sh - V6 delta SQL (clean re-run after shell-escape issue):
/tmp/v6-delta.sql(host + container)
Two cosmetic execution issues in the initial pass — neither affected results:
- V6 inline heredoc had backslashes consumed by the outer shell; the query was re-executed from a file (results above).
- E2 DO-block
RAISE NOTICEraised a quoting error (unrecognized exception condition) but the subsequentSELECT edge_type, count(*)ran successfully and produced the distribution shown.
11. Status block
phase4_discovery_status=PARTIAL
implementation_allowed=false
migration_allowed=false
seed_executed=false
mutation_performed=false
readonly_discovery_completed=true
no_hardcode_compliance=true
next_recommended_action=Route this report to GPT/User for explicit approval of (a) 14 vocab seed candidates, (b) species mapping for COL-176/COL-177, (c) parent_or_container_ref + composition_role policy, (d) hash-provenance target field. Implementation prompt (DRAFT) stays blocked until all four decisions land.