P3D Pack 1 Phase 5A — Semantic Registry Disambiguation Addendum
Date: 2026-05-11 | Author: Opus 4.7
A. Why Phase 5 dry-run returned PARTIAL
3 concepts in the semantic registry were defined too broadly, matching multiple columns on the same table. The deterministic resolution rule (>1 match = AMBIGUOUS_FIELD) correctly blocked join-dependent queries instead of silently choosing the wrong column.
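A minimal sketch of the resolution rule described above, for illustration only. The names `Status`, `resolve`, and the `RESOLVED` status are assumptions and are not taken from the actual agent code; `AMBIGUOUS_FIELD` and `FIELD_ABSENT` are the statuses named in this addendum. The column list holds only the tac_publication_member columns mentioned here, and the real table may have more.

```python
from enum import Enum


class Status(Enum):
    RESOLVED = "RESOLVED"                # assumed name for the single-match case
    AMBIGUOUS_FIELD = "AMBIGUOUS_FIELD"
    FIELD_ABSENT = "FIELD_ABSENT"


def resolve(candidates, columns):
    """Resolve one concept against one table's columns by exact label match."""
    wanted = set(candidates)
    matches = [col for col in columns if col in wanted]
    if len(matches) > 1:
        # More than one match: block instead of picking a column.
        return Status.AMBIGUOUS_FIELD, matches
    if len(matches) == 1:
        return Status.RESOLVED, matches
    return Status.FIELD_ABSENT, matches


# Original (pre-patch) candidate list for publication_link.
publication_link = [
    "publication_id", "publication_ref", "pub_id",
    "document_id", "logical_unit_id", "unit_id",
]
# Columns of tac_publication_member named in this addendum.
columns = ["publication_id", "logical_unit_id", "render_order"]

status, matches = resolve(publication_link, columns)
print(status.value, matches)
# AMBIGUOUS_FIELD ['publication_id', 'logical_unit_id']
```

The rule never ranks candidates: any concept that hits two columns on one table is reported, which is what stopped the join-dependent queries in the dry-run.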
B. Why this is a success
Before the semantic registry: the agent would have picked publication_id or logical_unit_id from memory or by guessing. A wrong pick → silent bad data in metrics/rollback. With the registry: the ambiguity was caught and reported, and the agent stopped. GPT/User resolves. No wrong-column pick can slip through silently.
C. Ambiguity 1: publication_link overloaded
tac_publication_member is a JOIN TABLE with TWO FK-style ID columns:
- publication_id → points to tac_publication
- logical_unit_id → points to tac_logical_unit
Original concept publication_link with candidates publication_id, publication_ref, pub_id, document_id, logical_unit_id, unit_id matched BOTH. These are semantically different: one is "which publication" and the other is "which logical unit."
Fix: Split into publication_ref (candidates: publication_id, publication_ref, pub_id, document_id) and logical_unit_ref (candidates: logical_unit_id, unit_id, lu_id, logical_unit_ref).
D. Ambiguity 2: provenance_profile overloaded
unit_version has TWO columns matching provenance-like candidates:
- content_profile (jsonb): structured metadata container
- provenance (text): free-text provenance note
Phase 4B addendum §7 contract says content_profile.source_hashes.tac_v1 is the carrier for structured TAC hash provenance. The text provenance field serves a different purpose (editor notes or origin description).
Fix: Split into provenance_json_profile (candidates: content_profile, profile, metadata_profile) and provenance_text_note (candidates: provenance, provenance_note, origin).
E. Absent sort_order → render_order
tac_publication_member has render_order for intra-publication ordering, but the sort_order concept's candidate list didn't include it. Agent correctly reported FIELD_ABSENT + UNREGISTERED_FIELD.
Fix: Add new concept publication_render_order (candidates: render_order, pub_order, display_order, render_sequence).
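The FIELD_ABSENT + UNREGISTERED_FIELD report can be sketched as follows. This is an assumption-laden illustration: `diagnose` is a hypothetical helper, UNREGISTERED_FIELD is read here as "a column that no concept lists as a candidate", and the `sort_order` candidate list is invented for the example because the real one is not given in this addendum.

```python
def diagnose(concept, registry, columns):
    """Return findings for one concept on one table.

    FIELD_ABSENT: none of the concept's candidates is a column of the table.
    UNREGISTERED_FIELD: a column that no concept in the registry lists.
    """
    findings = []
    if not any(col in registry[concept] for col in columns):
        findings.append(("FIELD_ABSENT", concept))
    registered = {label for labels in registry.values() for label in labels}
    for col in columns:
        if col not in registered:
            findings.append(("UNREGISTERED_FIELD", col))
    return findings


# Pre-patch registry fragment. The sort_order candidates are hypothetical.
before = {
    "sort_order": ["sort_order", "position"],
    "publication_link": [
        "publication_id", "publication_ref", "pub_id",
        "document_id", "logical_unit_id", "unit_id",
    ],
}
# Columns of tac_publication_member named in this addendum.
columns = ["publication_id", "logical_unit_id", "render_order"]

print(diagnose("sort_order", before, columns))
# [('FIELD_ABSENT', 'sort_order'), ('UNREGISTERED_FIELD', 'render_order')]

# After the fix, render_order is covered by the new concept.
after = dict(before, publication_render_order=[
    "render_order", "pub_order", "display_order", "render_sequence",
])
print(diagnose("publication_render_order", after, columns))
# []
```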
F. Collection key ambiguity (informational)
collection_registry has both name (display label) and collection_name (PG table name / FK key). The collection_key concept matched both. This didn't block any critical goal but should be disambiguated for correctness.
Fix: Split into collection_table_key (candidates: collection_name, table_name, collection) and collection_display_name (candidates: name, display_name, title, name_en).
G. Revised semantic concepts
| Old concept | Split into | Candidate labels |
|---|---|---|
| publication_link | publication_ref | publication_id, publication_ref, pub_id, document_id |
| publication_link | logical_unit_ref | logical_unit_id, unit_id, lu_id, logical_unit_ref |
| (new) | publication_render_order | render_order, pub_order, display_order, render_sequence |
| provenance_profile | provenance_json_profile | content_profile, profile, metadata_profile |
| provenance_profile | provenance_text_note | provenance, provenance_note, origin |
| collection_key | collection_table_key | collection_name, table_name, collection |
| collection_key | collection_display_name | name, display_name, title, name_en |
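As a sanity check on the revised concepts, the sketch below encodes them as a plain dictionary and verifies two properties: no candidate label is shared between concepts, and no concept matches more than one column on the tables discussed above. The dictionary layout and the helper names `shared_labels` and `ambiguous_pairs` are illustrative assumptions, not the registry's real storage format, and `TABLES` lists only the columns named in this addendum.

```python
from collections import Counter

REGISTRY = {
    "publication_ref": ["publication_id", "publication_ref", "pub_id", "document_id"],
    "logical_unit_ref": ["logical_unit_id", "unit_id", "lu_id", "logical_unit_ref"],
    "publication_render_order": ["render_order", "pub_order", "display_order", "render_sequence"],
    "provenance_json_profile": ["content_profile", "profile", "metadata_profile"],
    "provenance_text_note": ["provenance", "provenance_note", "origin"],
    "collection_table_key": ["collection_name", "table_name", "collection"],
    "collection_display_name": ["name", "display_name", "title", "name_en"],
}

# Only the columns mentioned in this addendum; the real tables may have more.
TABLES = {
    "tac_publication_member": ["publication_id", "logical_unit_id", "render_order"],
    "unit_version": ["content_profile", "provenance"],
    "collection_registry": ["name", "collection_name"],
}


def shared_labels(registry):
    """Candidate labels that appear under more than one concept."""
    counts = Counter(label for labels in registry.values() for label in labels)
    return sorted(label for label, n in counts.items() if n > 1)


def ambiguous_pairs(registry, tables):
    """(table, concept, columns) triples where a concept hits more than one column."""
    out = []
    for table, columns in tables.items():
        for concept, labels in registry.items():
            hits = [col for col in columns if col in labels]
            if len(hits) > 1:
                out.append((table, concept, hits))
    return out


print(shared_labels(REGISTRY))            # []
print(ambiguous_pairs(REGISTRY, TABLES))  # []
```

Both checks coming back empty only covers the columns listed here; the full rerun is still what confirms the behaviour on the other tables.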
H. Rerun scope after patch
Recommend full G1-G11 rerun (not selective):
- Read-only → zero risk
- Registry change may affect resolutions on other tables (e.g., collection_table_key replaces collection_key in birth_registry/species_collection_map queries)
- One clean pass produces a complete report, not a patchwork of old + new evidence
- Simpler for agent, cleaner for GPT review
Phase 5A Addendum | Registry disambiguation | 3 overloaded concepts split, 1 concept added | 2026-05-11