KB-6CF7

P3D Pack 1 Phase 5A — Semantic Registry Disambiguation Addendum

Revision 1 | Date: 2026-05-11 | Author: Opus 4.7


A. Why Phase 5 dry-run returned PARTIAL

Three concepts in the semantic registry were defined too broadly: each matched more than one column on the same table. The deterministic resolution rule (more than one match = AMBIGUOUS_FIELD) correctly blocked the join-dependent queries instead of silently choosing the wrong column.
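The resolution rule can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual implementation: the function name `resolve_concept` and the status string `RESOLVED` are assumptions, while `AMBIGUOUS_FIELD` and `FIELD_ABSENT` are the statuses named in this addendum. The column list holds only the tac_publication_member columns mentioned here.

```python
from typing import List, Tuple


def resolve_concept(candidates: List[str], table_columns: List[str]) -> Tuple[str, List[str]]:
    """Return (status, matched_columns) for one concept on one table.

    Hypothetical helper: the name and the "RESOLVED" status are illustrative.
    """
    columns = set(table_columns)
    # Keep candidate order so the report is stable between runs.
    matches = [name for name in candidates if name in columns]
    if len(matches) == 1:
        return "RESOLVED", matches
    if len(matches) > 1:
        # More than one match: stop instead of guessing a column.
        return "AMBIGUOUS_FIELD", matches
    return "FIELD_ABSENT", matches


# Original publication_link candidates against tac_publication_member.
status, matched = resolve_concept(
    ["publication_id", "publication_ref", "pub_id", "document_id", "logical_unit_id", "unit_id"],
    ["publication_id", "logical_unit_id", "render_order"],
)
print(status, matched)  # AMBIGUOUS_FIELD ['publication_id', 'logical_unit_id']
```

The rule never ranks candidates, so a concept is usable only when exactly one of its labels exists on the table.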

B. Why this is a success

Before the semantic registry, the agent would have picked publication_id or logical_unit_id from memory or by guessing. A wrong pick would have put silently bad data into metrics and rollback. With the registry, the ambiguity was caught and reported, and the agent stopped so that GPT or the user could resolve it. No wrong-column choice was made silently.

C. Ambiguity 1: publication_link overloaded

tac_publication_member is a JOIN TABLE with TWO FK-style ID columns:

  • publication_id → points to tac_publication
  • logical_unit_id → points to tac_logical_unit

Original concept publication_link with candidates publication_id, publication_ref, pub_id, document_id, logical_unit_id, unit_id matched BOTH. These are semantically different: one is "which publication" and the other is "which logical unit."

Fix: Split into publication_ref (candidates: publication_id, publication_ref, pub_id, document_id) and logical_unit_ref (candidates: logical_unit_id, unit_id, lu_id, logical_unit_ref).

D. Ambiguity 2: provenance_profile overloaded

unit_version has TWO columns matching provenance-like candidates:

  • content_profile (jsonb) — structured metadata container
  • provenance (text) — free-text provenance note

Phase 4B addendum §7 contract says content_profile.source_hashes.tac_v1 is the carrier for structured TAC hash provenance. The text provenance field serves a different purpose (editor notes or origin description).

Fix: Split into provenance_json_profile (candidates: content_profile, profile, metadata_profile) and provenance_text_note (candidates: provenance, provenance_note, origin).
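To show why the two columns are not interchangeable, here is a hedged sketch of reading the structured hash carrier from content_profile while leaving the text note untouched. The helper name `get_tac_v1_hash`, the sample row, and the hash value are all illustrative assumptions; the addendum does not specify the shape of the value stored under tac_v1.

```python
import json
from typing import Any, Optional


def get_tac_v1_hash(content_profile: Any) -> Optional[Any]:
    """Return content_profile.source_hashes.tac_v1, or None if any level is missing."""
    # A jsonb value may arrive as a JSON string or as a decoded dict,
    # depending on the database driver; handle both.
    if isinstance(content_profile, str):
        content_profile = json.loads(content_profile)
    if not isinstance(content_profile, dict):
        return None
    source_hashes = content_profile.get("source_hashes")
    if not isinstance(source_hashes, dict):
        return None
    return source_hashes.get("tac_v1")


# Hypothetical unit_version row; both values are made up for illustration.
row = {
    "content_profile": '{"source_hashes": {"tac_v1": "example-hash"}}',
    "provenance": "Imported from legacy editor notes",
}

structured_hash = get_tac_v1_hash(row["content_profile"])  # provenance_json_profile
text_note = row["provenance"]                              # provenance_text_note
print(structured_hash, "|", text_note)
```

A query that needs the hash has to resolve provenance_json_profile. Resolving it to the text column would return a free-text string with no source_hashes path at all.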

E. Absent sort_order → render_order

tac_publication_member has render_order for intra-publication ordering, but the sort_order concept's candidate list didn't include it. Agent correctly reported FIELD_ABSENT + UNREGISTERED_FIELD.

Fix: Add new concept publication_render_order (candidates: render_order, pub_order, display_order, render_sequence).
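The UNREGISTERED_FIELD side of this report can be sketched as a set difference between a table's columns and every candidate label in the registry. The function name `unregistered_columns` is hypothetical, and the `sort_order` candidate list below is a placeholder: this addendum states only that the list lacked render_order, not what it contained.

```python
from typing import Dict, List, Set


def unregistered_columns(table_columns: List[str], registry: Dict[str, List[str]]) -> List[str]:
    """Return the columns that no concept in the registry lists as a candidate."""
    registered: Set[str] = {label for labels in registry.values() for label in labels}
    return [column for column in table_columns if column not in registered]


# Only the tac_publication_member columns named in this addendum.
columns = ["publication_id", "logical_unit_id", "render_order"]

registry_before = {
    "publication_link": ["publication_id", "publication_ref", "pub_id",
                         "document_id", "logical_unit_id", "unit_id"],
    "sort_order": ["sort_order"],  # placeholder; real candidate list not given
}
registry_after = dict(
    registry_before,
    publication_render_order=["render_order", "pub_order", "display_order", "render_sequence"],
)

print(unregistered_columns(columns, registry_before))  # ['render_order']
print(unregistered_columns(columns, registry_after))   # []
```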

F. Collection key ambiguity (informational)

collection_registry has both name (display label) and collection_name (PG table name / FK key). The collection_key concept matched both. This didn't block any critical goal but should be disambiguated for correctness.

Fix: Split into collection_table_key (candidates: collection_name, table_name, collection) and collection_display_name (candidates: name, display_name, title, name_en).

G. Revised semantic concepts

| Old concept | New concept | Candidate labels |
|---|---|---|
| publication_link | publication_ref | publication_id, publication_ref, pub_id, document_id |
| publication_link | logical_unit_ref | logical_unit_id, unit_id, lu_id, logical_unit_ref |
| (new) | publication_render_order | render_order, pub_order, display_order, render_sequence |
| provenance_profile | provenance_json_profile | content_profile, profile, metadata_profile |
| provenance_profile | provenance_text_note | provenance, provenance_note, origin |
| collection_key | collection_table_key | collection_name, table_name, collection |
| collection_key | collection_display_name | name, display_name, title, name_en |
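The revised concepts listed above can be written down as data and checked against the tables discussed in this addendum. The sketch below is a possible pre-rerun sanity check, not part of the Phase 5 tooling: `find_ambiguities` is a hypothetical name, and `KNOWN_COLUMNS` contains only the columns this addendum mentions, not the full schemas.

```python
from typing import Dict, List, Tuple

REVISED_CONCEPTS: Dict[str, List[str]] = {
    "publication_ref": ["publication_id", "publication_ref", "pub_id", "document_id"],
    "logical_unit_ref": ["logical_unit_id", "unit_id", "lu_id", "logical_unit_ref"],
    "publication_render_order": ["render_order", "pub_order", "display_order", "render_sequence"],
    "provenance_json_profile": ["content_profile", "profile", "metadata_profile"],
    "provenance_text_note": ["provenance", "provenance_note", "origin"],
    "collection_table_key": ["collection_name", "table_name", "collection"],
    "collection_display_name": ["name", "display_name", "title", "name_en"],
}

# Partial column lists: only the columns named in this addendum.
KNOWN_COLUMNS: Dict[str, List[str]] = {
    "tac_publication_member": ["publication_id", "logical_unit_id", "render_order"],
    "unit_version": ["content_profile", "provenance"],
    "collection_registry": ["name", "collection_name"],
}


def find_ambiguities(
    concepts: Dict[str, List[str]], tables: Dict[str, List[str]]
) -> List[Tuple[str, str, List[str]]]:
    """List every (table, concept, columns) where a concept matches more than one column."""
    problems = []
    for table, columns in tables.items():
        column_set = set(columns)
        for concept, candidates in concepts.items():
            hits = [label for label in candidates if label in column_set]
            if len(hits) > 1:
                problems.append((table, concept, hits))
    return problems


print(find_ambiguities(REVISED_CONCEPTS, KNOWN_COLUMNS))  # []
```

Running the same check with the original publication_link candidate list reports tac_publication_member as ambiguous, which is the behaviour the dry run observed.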

H. Rerun scope after patch

Recommend full G1-G11 rerun (not selective):

  1. Read-only → zero risk
  2. Registry change may affect resolutions on other tables (e.g., collection_table_key replaces collection_key in birth_registry/species_collection_map queries)
  3. One clean pass produces a complete report, not a patchwork of old + new evidence
  4. Simpler for agent, cleaner for GPT review

Phase 5A Addendum | Registry disambiguation | 3 overloaded concepts split, 1 concept added | 2026-05-11
