KB-5878

04 — Axis Storage Model: PG-First (born vs candidate, axis_assignment, reuse map, design-only, 2026-06-02)

Revision 1
Tags: one-roof-governance, axis, pg-first, birth, candidate, axis_assignment, entity_labels, taxonomy, universal_edges, confidence, provenance, 3-zone, design-only, 2026-06-01

04 — Axis Storage Model: PG-First (Branch D)

Branch D. Where does truth live? PG is truth and runtime. Born objects are the stable, referenced, governed nodes; candidates are uncertain proposals; projections (envelope/tree/Qdrant) are read-only derivations. Minimal new substrate: axis_registry (doc 03) plus axis_assignment (which closes the G3 confidence/evidence/zone gap); everything else is reused. Verdict: RECOMMENDED.


04.0 Principle (PG-first, projection-last)

  • PG (PostgreSQL 16) is the single source of truth and the runtime (Điều 39 NT8: "PG→AGE→Neo4j NEVER"; vector/KG are projections). Directus is the API/edit surface, Nuxt the screen (PG→Directus→Nuxt; no direct PG from Nuxt).
  • Born ≠ candidate ≠ projection. A born object has a Birth-Registry identity and is referenceable; a candidate is an uncertain proposal living in a working zone; a projection (iu_three_axis_envelope, iu_tree_path, Qdrant, v_ui_*) is a derived read model that must be recomputable and never authored directly.

04.1 Which pieces go through Birth (Điều 0-G / Điều 19)

| Object | Through Birth? | Why |
|---|---|---|
| Axis (a row in axis_registry) | Yes: birth → register → active (M-DEF-9 lifecycle) | it is a governed registry object; other things reference axis_code |
| Topic node (a taxonomy row, FAC-08) once approved & active | Yes | it is a stable referenced classifier; IUs/entities point to it; it must be governed & coverable |
| Information Unit | Yes (already born) | first-class governed object |
| Topic candidate (proposed, unapproved) | No | uncertain; lives as taxonomy.status='candidate' or inside an approval_requests proposal until promoted |
| Axis relation (iu_relation/universal_edges edge) | No (not a birth object) | a relation is provenance+valid_time, not a born entity (Điều 38 §4.2) |
| Axis assignment (entity↔node) | No | membership fact, carries confidence/zone; not a born identity |
| Projection rows (envelope/tree/vector) | No | derived; recomputable |

Rule: identity-bearing, referenced classifiers are born; facts/edges/memberships/projections are not.
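
The rule can be read as a small predicate over the rows of the table above. The following is a minimal Python sketch, not an implementation: `Nature` and `goes_through_birth` are hypothetical names introduced for illustration, and the real decision is made under Điều 0-G / Điều 19 governance, not in application code.

```python
from enum import Enum


class Nature(Enum):
    IDENTITY = "identity"      # axis, topic node, Information Unit
    EDGE = "edge"              # iu_relation / universal_edges rows
    MEMBERSHIP = "membership"  # axis assignment (entity to node)
    PROJECTION = "projection"  # envelope / tree / vector rows


def goes_through_birth(nature: Nature, approved: bool) -> bool:
    """Born only if the object bears identity AND is approved (not a candidate)."""
    return nature is Nature.IDENTITY and approved


# Rows from the table, as (object, nature, approved) triples.
EXAMPLES = [
    ("axis", Nature.IDENTITY, True),
    ("active topic node", Nature.IDENTITY, True),
    ("topic candidate", Nature.IDENTITY, False),
    ("axis relation", Nature.EDGE, True),
    ("axis assignment", Nature.MEMBERSHIP, True),
    ("projection row", Nature.PROJECTION, True),
]

born = [name for name, nature, ok in EXAMPLES if goes_through_birth(nature, ok)]
```

Only the axis and the active topic node end up in `born`; the candidate fails on approval, and the other three fail because they are facts or derivations rather than identities.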

04.2 Candidate → born promotion

A node moves candidate → provisional → active/born when:

  1. it has passed review (human-requested = highest priority; AI-proposed needs human/governance approval; KG-provisional stays AI-facing) — doc 06;
  2. an L2 approval is recorded (register_topic_node / register_axis) — doc 02;
  3. it is to be made UI-visible / referenceable (the governance-coverage trigger — doc 09).

Promotion = an L3 build (it COMMITs a born taxonomy/axis_registry row + a birth_registry entry). Candidates that never promote remain in the working zone and are governed only as input quality (doc 09), never as governance orphans.
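
A promotion gate along these lines might look like the sketch below. It assumes the three conditions are jointly required, which the list does not state explicitly, and it models the L3 build as a single in-memory step. `TopicCandidate`, `can_promote`, `promote`, and the dictionary standing in for birth_registry are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TopicCandidate:
    code: str
    status: str = "candidate"              # candidate | provisional | active
    review_passed: bool = False            # condition 1: review outcome
    l2_approval_ref: Optional[str] = None  # condition 2: recorded L2 approval
    ui_visible_requested: bool = False     # condition 3: coverage trigger


def can_promote(c: TopicCandidate) -> bool:
    """True only when all three conditions hold for a not-yet-born node."""
    return (
        c.status in {"candidate", "provisional"}
        and c.review_passed
        and c.l2_approval_ref is not None
        and c.ui_visible_requested
    )


def promote(c: TopicCandidate, birth_registry: Dict[str, str]) -> None:
    """Record the birth entry and flip the node to active, or refuse."""
    if not can_promote(c):
        raise ValueError(f"{c.code}: promotion conditions not met")
    birth_registry[c.code] = c.l2_approval_ref
    c.status = "active"
```

In PG the two writes would belong to one transaction, so that a born taxonomy row never exists without its birth_registry entry.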

04.3 Is the official topic object born? — Yes

An approved, UI-governed topic is a born object: a taxonomy row (facet_id=FAC-08, status='active') plus a birth_registry entry, owned (GOV-KG-SYS substrate / GOV-COUNCIL policy), covered, and APR-gated for edits. A provisional/candidate topic is not born — it is status∈{candidate,provisional}, AI-/review-facing only, and excluded from the official UI tree until promoted.

04.4 How topic relations are stored

  • Topic hierarchy (broader/narrower): taxonomy.parent_id + parent_facet (DAG-capable) — deterministic vocabulary structure, APR-gated.
  • Topic↔topic semantic (related / see-also / cross-facet): universal_edges (source_collection='taxonomy', edge_type∈{RELATED,BROADER,NARROWER}, with confidence/provenance/valid_time) — and, once built, entity_relations for soft semantic relations (synonym/subsumption/contradiction, NĐ-36-01 §MT2).
  • Supersession (merge/split): taxonomy.replaced_by + iu_merge_set/iu_split_set-style change records (doc 06).

04.5 How IU↔topic assignment is stored

Topic assignment is uncertain, so it must carry confidence, evidence, and zone, none of which entity_labels can store. Options:

  • (a) New axis_assignment (recommended SoT for semantic axes) — §04.10.
  • (b) iu_metadata_tag — already has confidence+enrichment_source but is IU-only and lacks zone/evidence/valid_time.
  • (c) entity_labels — keep for the existing rule-driven deterministic label facets; it has no confidence/zone.

Recommendation: axis_assignment is the source-of-truth working layer for uncertain axis membership (topic, expertise) with the 3-zone discipline; entity_labels remains the materialized approved projection for rule-labels (and is reconciled/backfilled from axis_assignment for the topic facet). iu_metadata_tag is folded into axis_assignment over time (a patch — doc 13).
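
The reconcile/backfill direction can be sketched as a pure function from working-layer rows to the approved projection. This is an illustration only: `project_entity_labels` is a hypothetical name, rows are plain dictionaries keyed by the column names proposed for axis_assignment, and the axis code "topic" is a placeholder.

```python
from typing import Dict, Iterable, List, Tuple


def project_entity_labels(
    assignments: Iterable[Dict[str, str]], axis_code: str
) -> List[Tuple[str, str]]:
    """Return the (entity_ref, node_ref) pairs that may be materialized.

    Only approved, active rows of the given axis reach the projection;
    candidate and quarantine rows stay in the working layer.
    """
    pairs = {
        (a["entity_ref"], a["node_ref"])
        for a in assignments
        if a["axis_code"] == axis_code
        and a["zone"] == "approved"
        and a["status"] == "active"
    }
    return sorted(pairs)


rows = [
    {"axis_code": "topic", "entity_ref": "iu-1", "node_ref": "T-A",
     "zone": "approved", "status": "active"},
    {"axis_code": "topic", "entity_ref": "iu-1", "node_ref": "T-B",
     "zone": "candidate", "status": "active"},
    {"axis_code": "topic", "entity_ref": "iu-2", "node_ref": "T-A",
     "zone": "approved", "status": "superseded"},
]
labels = project_entity_labels(rows, "topic")
```

Because the function is deterministic over its input, the projection can be dropped and recomputed at any time, which is the property the PG-first principle asks of every projection.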

04.6 How containment / reconstruction assignment is stored

These are deterministic — assignment is intrinsic, not a separate uncertain fact:

  • Containment: primary parent = information_unit.parent_or_container_ref; full graph = iu_relation (relation_type='contains', multi-parent allowed); materialized closure = iu_tree_path (path_ids[], depth, sibling_order, path_hash).
  • Reconstruction / source order: information_unit.doc_code + section_code + sort_order; rebuilt by fn_iu_reconstruct_source(doc_code) (non-exemptable invariant, GOV-SIV) — not stored as a separate assignment; it is derivable and fingerprintable.
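
To illustrate why source order is derivable and fingerprintable rather than stored, here is a minimal Python sketch. It is not fn_iu_reconstruct_source: the ordering key (section_code, then sort_order) and the SHA-256 fingerprint are assumptions made for the example, and the helper names are hypothetical.

```python
import hashlib
from typing import Dict, List


def reconstruct_source(units: List[Dict], doc_code: str) -> List[str]:
    """Return IU ids of one document in source order (assumed ordering key)."""
    rows = [u for u in units if u["doc_code"] == doc_code]
    rows.sort(key=lambda u: (u["section_code"], u["sort_order"]))
    return [u["id"] for u in rows]


def fingerprint(ordered_ids: List[str]) -> str:
    """Stable hash of the reconstructed order, usable as an invariant check."""
    return hashlib.sha256("\n".join(ordered_ids).encode("utf-8")).hexdigest()


units = [
    {"id": "iu-3", "doc_code": "DOC-A", "section_code": "02", "sort_order": 1},
    {"id": "iu-1", "doc_code": "DOC-A", "section_code": "01", "sort_order": 1},
    {"id": "iu-2", "doc_code": "DOC-A", "section_code": "01", "sort_order": 2},
    {"id": "iu-9", "doc_code": "DOC-B", "section_code": "01", "sort_order": 1},
]
order = reconstruct_source(units, "DOC-A")
```

The result depends only on the stored columns, not on row arrival order, so the same fingerprint is obtained however the rows are read.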

04.7 Many-to-many & multi-parent

  • Containment multi-parent: the graph (iu_relation contains) permits multiple parents; the tree projection (iu_tree_path) designates one primary parent (information_unit.parent_or_container_ref) for UI. Graph = truth; tree = projection (doc 07).
  • Topic many-to-many: an entity may carry multiple topic assignments; facet cardinality (single/multiple) + max_labels_per_entity (governed row) bound it. FAC-08 is currently single — a council decision (L2) can widen it; the cardinality is data, not code.
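
Both points can be sketched in a few lines of Python. The sketch assumes a mapping of each node to its primary parent (standing in for information_unit.parent_or_container_ref) and treats cardinality and max_labels_per_entity as values read from governed rows. The function names and the exact limit semantics are hypothetical.

```python
from typing import Dict, List, Optional, Set


def tree_path(node: str, primary_parent: Dict[str, Optional[str]]) -> List[str]:
    """Root-first path built from primary parents only (the tree projection)."""
    path = [node]
    while True:
        parent = primary_parent.get(path[0])
        if parent is None:
            return path
        if parent in path:
            raise ValueError(f"cycle through {parent}")
        path.insert(0, parent)


def may_assign(current: Set[str], new_node: str,
               cardinality: str, max_labels: int) -> bool:
    """Check a new topic assignment against the facet's governed limits."""
    if new_node in current:
        return True                 # re-asserting an existing label is a no-op
    if cardinality == "single":
        return len(current) == 0
    return len(current) < max_labels


# The graph may give iu-c two parents; the tree keeps only the primary one.
graph_parents = {"iu-c": {"iu-a", "iu-b"}}
primary_parent = {"iu-c": "iu-a", "iu-a": "root", "root": None}
path_ids = tree_path("iu-c", primary_parent)
depth = len(path_ids) - 1
```

Widening FAC-08 from single to multiple would change only the arguments passed to `may_assign`, which mirrors the point that cardinality is data rather than code.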

04.8 Evidence / confidence / provenance

Carried natively where it matters:

  • iu_relation: confidence, evidence jsonb, provenance jsonb, assertion_mode, valid_time.
  • universal_edges: confidence, weight, provenance jsonb, valid_time.
  • iu_metadata_tag: confidence, enrichment_source.
  • axis_assignment (new): match_score (technical) is kept separate from approval_state/zone (governance) per NĐ-36-01 §MT4 ("a high technical score does not automatically mean approved"); it also carries calibrated confidence, evidence, provenance, and bitemporal valid_time + revision. Missing provenance ⇒ quarantine (Điều 39 A8). Confidence below threshold ⇒ candidate (NĐ-36-01: "uncertain = candidate").
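
The zone rules for axis_assignment can be sketched as an ordered decision. This is a hedged illustration: `decide_zone` is a hypothetical helper, the threshold is passed in because it is meant to be a governed row, and treating any empty provenance key as "missing" is an assumption. Note that match_score is deliberately not an input.

```python
from typing import Dict, Optional

# Keys taken from the provenance shape described for axis_assignment.
REQUIRED_PROVENANCE_KEYS = ("source", "extraction_method", "resolved_by", "timestamp")


def decide_zone(confidence: Optional[float],
                provenance: Optional[Dict[str, str]],
                threshold: float,
                approved_by_review: bool) -> str:
    """Return approved | candidate | quarantine for one assignment."""
    # 1. Missing provenance always quarantines, whatever the score.
    if not provenance or any(not provenance.get(k) for k in REQUIRED_PROVENANCE_KEYS):
        return "quarantine"
    # 2. Unknown or sub-threshold confidence stays a candidate.
    if confidence is None or confidence < threshold:
        return "candidate"
    # 3. A high score alone is not approval; a review decision is still needed.
    if not approved_by_review:
        return "candidate"
    return "approved"


prov = {"source": "doc", "extraction_method": "agent",
        "resolved_by": "human", "timestamp": "2026-06-01T00:00:00Z"}
```

The order of the checks matters: provenance is tested first so that a confident but untraceable assignment can never reach the approved zone.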

04.9 Reuse map (existing relation/KG/taxonomy/label tables)

| Need | Reuse (live) | Notes |
|---|---|---|
| IU↔IU axis relations | iu_relation | already evidence/provenance/bitemporal; only relation_type='contains' is populated |
| any↔any graph relations | universal_edges | live KG (USES/BELONGS_TO/CONTAINS); the cross-collection store |
| soft semantic relations | entity_relations | unbuilt (NĐ-36-01); interim on universal_edges |
| topic vocabulary + hierarchy | taxonomy + taxonomy_facets + taxonomy_matrix | FAC-08 topic facet exists; status + replaced_by |
| rule-driven labels | label_rules + entity_labels | keep for deterministic facets |
| IU tags w/ confidence | iu_metadata_tag + registry | fold into axis_assignment (patch) |
| containment closure / reconstruction | iu_tree_path, envelope axis_a/axis_c | projections |
| pivot axes | pivot_definitions | Điều 26 |
| KG config/quality | kg_* family | trust, auto-approve rules, quality |
| coverage ledger | collection_registry.coverage_*, meta_catalog | doc 09 |

04.10 Minimal new substrate (the only two new tables)

  1. axis_registry — doc 03 §03.2 (the M-DEF-9 nine-attribute Registry; the critical inventory_gap fix).
  2. axis_assignment — the semantic/uncertain membership store (G3):
CREATE TABLE axis_assignment (   -- uncertain axis membership; 3-zone; bitemporal
  id                  uuid PRIMARY KEY,
  axis_code           text NOT NULL,          -- → axis_registry
  entity_kind         text NOT NULL,          -- information_unit | born_entity | collection
  entity_ref          text NOT NULL,          -- canonical_address / code / uuid
  node_ref            text NOT NULL,          -- → taxonomy.code (the axis node)
  zone                text NOT NULL,          -- approved | candidate | quarantine  (NĐ-36-01 MT4)
  match_score         numeric,                -- technical score
  confidence          numeric,                -- calibrated; < threshold ⇒ candidate
  evidence            jsonb,
  provenance          jsonb,                  -- {source, extraction_method, resolved_by, timestamp}; missing ⇒ quarantine
  assigned_by         text,                   -- dot | agent | human | rule
  rule_ref            integer NULL,           -- → label_rules
  review_decision_ref text NULL,              -- → approval_requests / doc_reviews when promoted
  valid_time          tstzrange,              -- effective_from / effective_to (temporal, NĐ-36-01)
  revision            integer NOT NULL DEFAULT 1,
  superseded_by       uuid NULL,
  status              text NOT NULL,          -- active | superseded | retracted
  created_by          text NOT NULL,
  created_at          timestamptz NOT NULL
);

Both tables are additive, governed (registry edits = L2/L3 per doc 02), no-hardcode (vocabularies/thresholds are rows), and non-island (referenced by axis_registry, governance_registry owner, coverage ledger). Note: the GCOS substrates SB-10/SB-12/SB-13 (implementation-index docs 38–44) handle candidate-scan-state, snapshot/ruleset, and worker-cursor — axis_assignment is the content layer those operate over.
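
One way the revision, superseded_by, and status columns of axis_assignment could work together is append-only supersession: a change never edits the current row, it closes it and writes a successor. The sketch below is an assumption about that mechanic, not something the design specifies. `Assignment` and `supersede` are hypothetical names, and valid_time handling is omitted for brevity.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Assignment:
    id: str
    entity_ref: str
    node_ref: str
    zone: str                           # approved | candidate | quarantine
    revision: int = 1
    superseded_by: Optional[str] = None
    status: str = "active"              # active | superseded | retracted


def supersede(old: Assignment, **changes) -> Tuple[Assignment, Assignment]:
    """Return (closed old row, new active row); neither mutates the input."""
    if old.status != "active":
        raise ValueError("only an active revision can be superseded")
    new = replace(old, **changes, id=str(uuid.uuid4()), revision=old.revision + 1)
    closed = replace(old, status="superseded", superseded_by=new.id)
    return closed, new


old = Assignment(id="a1", entity_ref="iu-1", node_ref="TOPIC-X", zone="candidate")
closed, new = supersede(old, zone="approved")
```

Keeping the closed row preserves the history of how an assignment moved between zones, which is what makes the store auditable.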

04.11 Verdict

RECOMMENDED. Truth lives in PG: born = axis_registry rows + active taxonomy nodes + IUs; candidate = taxonomy.status∈{candidate,provisional} and axis_assignment.zone∈{candidate,quarantine}; projection = envelope/tree/Qdrant/v_ui_*. Two new additive tables, everything else reuse. Promotion is an L3 build (doc 02). Sync/birth-promotion mechanics in doc 11.

knowledge/dev/reports/architecture/one-roof-axis-proposal-authorization-operating-substrate-design-2026-06-01/04-axis-storage-model-pg-first.md