KB-1C9B

P3D Pack 1 Phase 4 — Governance/Vocab Discovery Report

18 min read Revision 1
p3dpack1phase4discoveryreadonlyvocabspeciesgovernanceno-hardcode2026-05-11

P3D Pack 1 Phase 4 — Governance/Vocab Discovery Report

Date: 2026-05-11 Executor: Claude Opus 4.7 (1M) agent Mode: READ-ONLY DISCOVERY (PGOPTIONS=-c default_transaction_read_only=on + SET confirmed on) Source prompt: knowledge/dev/laws/dieu44-trien-khai/prompts/p3d-pack1-phase4-readonly-governance-vocab-discovery-prompt.md (rev2, GPT-approved 2026-05-11) Approval: knowledge/dev/laws/dieu44-trien-khai/reviews/gpt-final-approval-p3d-pack1-phase4-readonly-discovery-prompt-rev2-2026-05-11.md VPS: 38.242.240.89, container=postgres, db=directus, dbuser=directus Log: /tmp/p3d-phase4-discovery-20260511-064954.log (on VPS)

All numeric counts below are live snapshots from this discovery run on 2026-05-11, not constants.


Final response (machine-readable)

phase_status=PARTIAL
mode=READ_ONLY_DISCOVERY
no_mutation_performed=true
no_hardcode_compliance=true
pg_readonly_confirmed=true
vocab_delta_summary=14 missing codes vs TAC live use: 1 publication_type (law), 12 section_types (appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, paragraph, principle, process, technical_spec), 1 unit_kind (law_unit). All values are live-discovery candidates; no seed executed.
species_mapping_current_state=species_collection_map has 0 rows for information_unit/unit_version/any tac_*. entity_species has 40 rows including SPE-LAW (compound, governed, prefix=LAW) but no species_code is mapped to the IU/UV collections. Result: IU/UV born without species/composition; 12/12 IU birth_registry rows show species_code=NULL, composition_level=NULL.
collection_governance_current_state=information_unit (COL-176) and unit_version (COL-177) both governance_role=observed, migration_state=pilot, source_kind=native, species_code=NULL, description_policy=structured_exempt. No tac_* collection is in collection_registry at all. entity_labels has 0 rows for IU/UV. universal_edges has 0 rows referencing IU/UV.
hash_provenance_target=needs_design — no `source_hash` or `*_source_hash` column exists on information_unit or unit_version. Candidate jsonb carriers: unit_version.content_profile (jsonb) and unit_version.provenance (text). information_unit.identity_profile (jsonb) is the only IU profile column. No real field is yet reserved for TAC source-hash provenance; this must be design-locked before any TAC→IU migration.
seed_candidates_from_live_delta=[unit_kind: law_unit] + [publication_type: law] + [section_type: appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, paragraph, principle, process, technical_spec]. SUBJECT TO GPT/USER APPROVAL — not approved for execution.
implementation_allowed=false
next_recommended_action=Submit this discovery report to GPT/User for review. Request explicit approval of (a) the 14 vocab seed candidates above, (b) a species mapping strategy for information_unit/unit_version (seed species_collection_map vs accept NULL), (c) a parent_or_container_ref + composition_role policy for the TAC 3-root/83-child hierarchy snapshot, (d) a hash-provenance target field decision (extend unit_version.content_profile jsonb key vs new column vs traceability table). No Phase 5 migration until all four decisions land.
log_path=/tmp/p3d-phase4-discovery-20260511-064954.log

1. Read-only session confirmation

SET
 default_transaction_read_only
-------------------------------
 on
(1 row)

PGOPTIONS=-c default_transaction_read_only=on also passed at connection level for every docker exec call. No DDL/DML executed.

2. Schema introspection (I1, I2)

All 13 target tables exist as relkind='r'.

table exists notes
birth_registry yes 19 cols incl. species_code, composition_level, governance_role
collection_registry yes 29 cols incl. governance_role, source_kind, migration_state, species_code
dot_config yes (key,value,description,updated_at)
entity_labels yes (id, entity_code, label_code, assigned_by, rule_id, assigned_at)
entity_species yes 18 cols incl. code, species_code, composition_level, management_mode, prefix
information_unit yes 19 cols. No source_hash column. identity_profile is jsonb
species_collection_map yes 13 cols incl. species_code, collection_name, is_primary, discriminator_*
tac_logical_unit yes 13 cols (id, canonical_address, doc_code, parent_id, sort_order, section_type, section_code, owner, identity_profile, tier, lifecycle_status, *_at)
tac_publication yes 15 cols incl. publication_type, publication_profile (jsonb)
tac_publication_member yes 6 cols
tac_unit_version yes 20 cols incl. content_hash, content_profile, provenance
unit_version yes 16 cols incl. content_hash, content_profile (jsonb), provenance (text)
universal_edges yes 25 cols incl. source_collection, target_collection, edge_type, metadata (jsonb), provenance (jsonb)

3. Vocab discovery (V1–V7)

V1 — TAC section_type live usage on tac_logical_unit (snapshot 86 LU rows total)

section_type       | used_count
-------------------+-----------
technical_spec     |        19
governance_process |        16
paragraph          |        14
heading            |        10
principle          |         8
process            |         7
checklist          |         6
changelog          |         2
appendix           |         1
article            |         1
definition         |         1
instruction_block  |         1

12 distinct codes in live use.

V2 — tac_section_type_vocab (17 rows)

Codes: appendix, article, changelog, checklist, definition, governance_process, heading, instruction_block, invariant_list, matrix, open_decision_list, paragraph, principle, process, rationale, reference_mapping, technical_spec. All lifecycle_status=active.

V3 — TAC publication_type live usage on tac_publication

publication_type | used_count
-----------------+-----------
law              |          3

V4 — tac_publication_type_vocab (10 rows)

Codes: constitution, design_note, draft, knowledge, law, memo, policy, report, sop, working. All lifecycle_status=active.

V5 — IU dot_config vocab snapshot

key                                | value
-----------------------------------+-------------------
vocab.publication_type.design_doc  | design_doc
vocab.section_type.section         | section
vocab.unit_kind.design_doc_section | design_doc_section

Only 3 vocab entries on the IU side. Matches Phase 3 finding.

V6 — Vocab delta candidates (live-computed, no seed executed)

delta_type                     | code
-------------------------------+--------------------
publication_type_missing_in_iu | law
section_type_missing_in_iu     | appendix
section_type_missing_in_iu     | article
section_type_missing_in_iu     | changelog
section_type_missing_in_iu     | checklist
section_type_missing_in_iu     | definition
section_type_missing_in_iu     | governance_process
section_type_missing_in_iu     | heading
section_type_missing_in_iu     | instruction_block
section_type_missing_in_iu     | paragraph
section_type_missing_in_iu     | principle
section_type_missing_in_iu     | process
section_type_missing_in_iu     | technical_spec
unit_kind_missing_in_iu        | law_unit

Total = 14 missing vocab codes. law_unit appears here as a discovered requirement from the active migration design; it is not an approved constant. All 14 entries are seed candidates pending GPT/User approval, not approved mutations.

V7 — TAC lifecycle / review vocab snapshots

  • tac_lu_lifecycle_vocab: 3 rows (active, draft_only, retired)
  • tac_uv_lifecycle_vocab: 4 rows (draft, enacted, retired, superseded)
  • tac_pub_lifecycle_vocab: 4 rows (proposed, enacted, superseded, retired)
  • tac_review_state_vocab: 5 rows (unreviewed, in_review, review_passed, review_failed, needs_re_review)

No equivalent vocab tables exist for IU/UV; IU/UV lifecycle is currently free-text columns gated by fn_iu_birth_gate_layer1 only.

4. Species / composition / governance discovery (S1–S5)

S1 — entity_species (40 rows snapshot)

Includes governed law-domain species:

code species_code composition_level management_mode prefix
SPE-LAW law compound governed LAW
SPE-JUR jurisdiction atom governed JUR
SPE-ENF law_enforcement atom governed ENF
SPE-GRL governance_relation atom governed GRL
SPE-GAG governance_agency compound governed GOV
SPE-APP approval_request compound governed APR-

No species row exists whose code or species_code is information_unit/unit_version. The IU/UV collections are not wired into the species graph.

S2 — collection_registry rows for target/source collections

id  | code    | name             | governance_role | source_kind | migration_state | species_code | description_policy  | birth_code_strategy
----+---------+------------------+-----------------+-------------+-----------------+--------------+---------------------+--------------------
184 | COL-176 | information_unit | observed        | native      | pilot           | (NULL)       | structured_exempt   | synthetic_id
185 | COL-177 | unit_version     | observed        | native      | pilot           | (NULL)       | structured_exempt   | subordinate

No tac_* collection appears in collection_registry (search by jsonb text returned 0 tac_logical_unit / tac_unit_version / tac_publication rows).

S3 — species_collection_map: 0 rows

species_collection_map has 0 rows for information_unit, unit_version, tac_logical_unit, tac_unit_version, tac_publication. The map exists (153 rows total per Phase 3) but contains nothing for these collections.

S4 — birth_registry evidence

collection_name  | rows | rows_with_species | rows_with_composition
-----------------+------+-------------------+----------------------
information_unit |   12 |                 0 |                     0

Zero rows for unit_version, tac_logical_unit, tac_unit_version, tac_publication, tac_publication_member. All 12 IU births registered without species/composition.

S5 — entity_labels for IU/UV: 0 rows

labels_for_iu_uv
-----------------
              0

5. Parent-child hierarchy snapshot (P1, P2)

max_depth_snapshot | reachable_nodes | all_nodes | root_count | child_count
-------------------+-----------------+-----------+------------+------------
                 2 |              86 |        86 |          3 |         83
  • All 86 logical units are reachable from the 3 roots (no orphans).
  • Max depth = 2 snapshot — 3 roots → children → grandchildren.
  • orphan_parent_refs = 0 (every non-null parent_id resolves to an existing row).

Counts are a snapshot of the current TAC corpus only; they will change as new TAC rows are written.

6. universal_edges discovery (E1–E3)

E1 — 25 columns

Confirmed column names: source_collection, source_id, source_code, source_composition_level, target_collection, target_id, target_code, target_composition_level, edge_type, edge_subtype, weight, source_info, is_auto_managed, symmetry_group_id, metadata (jsonb), valid_from, valid_to, status, _dot_origin, date_created, date_updated, confidence, valid_time (tstzrange), provenance (jsonb), plus id.

(Older reports speculated about source_entity/source_ref style names — confirmed not present; the live schema uses source_collection/source_id/source_code triplets.)

E2 — edge_type distribution

edge_type  | count
-----------+------
USES       | 1486
BELONGS_TO |  431
CONTAINS   |  282

Total = 2199 edges. (The DO-block notice probe surfaced a benign PL/pgSQL RAISE NOTICE quoting error which did not affect the subsequent SELECT; the distribution result above is valid.)

E3 — IU/UV edge text-search count

iu_uv_edges_text_search
-----------------------
                      0

Confirms Phase 3: no universal_edges rows reference information_unit or unit_version by any of the jsonb-serialised columns.

7. Hash provenance target discovery (H1)

table_name       | column_name      | data_type
-----------------+------------------+----------
information_unit | identity_profile | jsonb
unit_version     | content_hash     | text
unit_version     | content_profile  | jsonb
unit_version     | provenance       | text
  • No column named source_hash, tac_source_hash, origin_hash, or similar exists on information_unit or unit_version.
  • unit_version.content_hash already carries the SHA-256 of the IU body (per Phase 3 finding fn_content_hash). Reusing it for TAC source provenance would conflate two different hash semantics.
  • The two available jsonb carriers are unit_version.content_profile and (less suitable) information_unit.identity_profile. unit_version.provenance is a free text column.
  • The Phase 4 design proposal content_profile.source_hash is design-only: no schema migration has reserved the key, no constraint enforces it, no trigger writes it. Result: hash_provenance_target=needs_design.

8. Critical blocker notes

  1. Vocab gap is the migration blocker, not a planner bug. IU vocab on the live system has 1 unit_kind / 1 section_type / 1 publication_type. To migrate the 86 TAC LU rows + 3 TAC publications without losing semantics, the IU side needs (at minimum) the 14 live-delta codes above seeded into dot_config. None of these are approved for execution by this discovery run.
  2. Species/composition pipeline is not wired for IU/UV. species_collection_map is empty for both COL-176 and COL-177; all 12 existing IU births carry species_code=NULL, composition_level=NULL. SPE-LAW (compound, governed, prefix=LAW) is available as a target species, but no mapping row links it to information_unit/unit_version. Migration must either seed the map or explicitly accept post-migration NULL species (with governance consequences).
  3. TAC tables are entirely outside the governance universe. tac_logical_unit, tac_unit_version, tac_publication, tac_publication_member have 0 rows in collection_registry, 0 rows in birth_registry, 0 rows in species_collection_map, and 0 rows in entity_labels / universal_edges. TAC→IU is not a transparent rebadge; it requires explicit governance pipeline wiring on the IU side or a documented deferral.
  4. Hash provenance target is not a real field. The design assumption content_profile.source_hash is not yet a reserved jsonb key or column. Phase 5 must lock the target (column, jsonb key path, or new traceability table) before any TAC→IU body copy is performed, or the TAC source-hash continuity will be lost.
  5. Parent-child policy required. TAC hierarchy snapshot is depth=2 with 3 roots / 83 children / 0 orphans. IU has parent_or_container_ref (uuid) but Phase 3 confirmed it is NULL on all 12 existing IU rows and not enforced at birth. Phase 5 must decide: replicate the parent_id structure in IU, materialise as universal_edges CONTAINS rows, both, or neither.
  6. collection_registry promotion gate. information_unit / unit_version are governance_role=observed, migration_state=pilot. If Phase 5 migration is to be authoritative, governance_role must be elevated to governed first — which currently is blocked by the species/composition/labels gaps above.

9. What is NOT in scope of this discovery

  • No DDL was executed. No vocab seeds, no species_collection_map seeds, no collection_registry updates, no trigger/function changes, no Directus or Qdrant mutations, no DOT-119 execution.
  • All counts in this report are snapshots taken on 2026-05-11 during the discovery run, not constants. They will change with system activity.
  • law_unit is reported as a proposed target unit_kind discovered from the active migration design (Phase 3 blocker), not as an approved seed value. Implementation prompt (DRAFT) remains BLOCKED until GPT/User approve the seed set explicitly.

10. Logs and evidence

  • Discovery log on VPS: /tmp/p3d-phase4-discovery-20260511-064954.log
  • Script on VPS: /tmp/p3d-phase4-discovery.sh
  • V6 delta SQL (clean re-run after shell-escape issue): /tmp/v6-delta.sql (host + container)

Two cosmetic execution issues in the initial pass — neither affected results:

  • V6 inline heredoc had backslashes consumed by the outer shell; the query was re-executed from a file (results above).
  • E2 DO-block RAISE NOTICE raised a quoting error (unrecognized exception condition) but the subsequent SELECT edge_type, count(*) ran successfully and produced the distribution shown.

11. Status block

phase4_discovery_status=PARTIAL
implementation_allowed=false
migration_allowed=false
seed_executed=false
mutation_performed=false
readonly_discovery_completed=true
no_hardcode_compliance=true
next_recommended_action=Route this report to GPT/User for explicit approval of (a) 14 vocab seed candidates, (b) species mapping for COL-176/COL-177, (c) parent_or_container_ref + composition_role policy, (d) hash-provenance target field. Implementation prompt (DRAFT) stays blocked until all four decisions land.
Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase4-governance-vocab-discovery-report.md