dot-iu-cutter v0.5 — Information-Unit Label/Metadata Registry Master Design (DESIGN ONLY) (2026-05-17)
Date: 2026-05-17
Phase: v0_5_constitution_hardtest_and_information_unit_factory_master_plan
Nature: DESIGN ONLY. No registry table is created. No label is registered.
Parent: dot-iu-cutter-v0.5-constitution-hardtest-master-plan-2026-05-17.md
1. Why this layer exists
At millions of Information Units (IUs), labels and metadata must grow without schema migrations and without label values or metadata keys hardcoded in runtime code. The prior foundation review already ruled that four components are required before large-scale labeling: a dictionary registry, append-only assignment, a metadata-key registry, and a hot-key promotion ledger. It also ruled that SQL stays the single source of truth (SSOT), and that JSONB is allowed for sparse metadata but must not become a hidden authority. This document is the master design for that layer.
2. Entities (design — NOT created)
label_dictionary:
  label_id: text PK
  label_namespace: text # e.g. "legal","status","topic" (registry, not literal)
  label_key: text # canonical key
  display_vi / display_en: text
  cardinality_policy: text # single | multi (per IU)
  mutability_policy: text # immutable | append_only | reassignable
  lifecycle: text # proposed | active | deprecated
  created_by / created_at
label_assignment: # APPEND-ONLY
  assignment_id: text PK
  iu_id: text # -> IU identity (canonicalization doc)
  label_id: text FK
  assigned_by: text # human|agent|rule_ref
  assigned_at: timestamptz
  retracted_at: timestamptz NULL # retraction is a new row state, not a delete
  provenance_ref: text # rule/decision that produced it
  # mutation = new row; never UPDATE/DELETE history (matches append-only ledger P2)
metadata_key_registry:
  metadata_key: text PK
  value_type: text # text|int|bool|ts|enum_ref|json
  cardinality_policy: text
  mutability_policy: text
  index_policy: text # none | promoted_sql | promoted_index | gin
  hot_threshold_hint: text # guidance for promotion
  lifecycle: text
metadata_value: # sparse, may live as JSONB on IU OR as rows
  iu_id: text
  metadata_key: text FK
  value: (typed) # JSONB-backed for cold/sparse keys
  set_at: timestamptz
hot_key_promotion_ledger: # APPEND-ONLY governance of promotions
  promotion_id: text PK
  metadata_key: text FK
  from_storage: text # jsonb
  to_storage: text # promoted_sql_column | promoted_index | registry_assignment
  reason: text # query frequency / latency evidence
  approved_by: text # sovereign gate
  promoted_at: timestamptz NULL # null until executed in a later authorized cycle
All of the above are proposals for a later schema-design + DDL cycle (Q5). Nothing is created here.
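To make the proposed shapes concrete without creating anything, the following is a minimal in-memory sketch of two of the entities as frozen Python dataclasses. It covers only a subset of the listed fields. The class names, the helper `_require`, the default `lifecycle="proposed"`, and all example values are assumptions for illustration, not part of the design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Allowed values, copied from the field comments in the entity list above.
CARDINALITY = frozenset({"single", "multi"})
MUTABILITY = frozenset({"immutable", "append_only", "reassignable"})
LIFECYCLE = frozenset({"proposed", "active", "deprecated"})


def _require(value: str, allowed: frozenset, field_name: str) -> None:
    """Reject any value outside the enumeration given for that field."""
    if value not in allowed:
        raise ValueError(f"{field_name}={value!r} not in {sorted(allowed)}")


@dataclass(frozen=True)
class LabelDictionaryEntry:
    label_id: str
    label_namespace: str
    label_key: str
    cardinality_policy: str
    mutability_policy: str
    lifecycle: str = "proposed"  # assumption: new entries start as proposed

    def __post_init__(self) -> None:
        _require(self.cardinality_policy, CARDINALITY, "cardinality_policy")
        _require(self.mutability_policy, MUTABILITY, "mutability_policy")
        _require(self.lifecycle, LIFECYCLE, "lifecycle")


@dataclass(frozen=True)  # frozen: a row object cannot be mutated after creation
class LabelAssignment:
    assignment_id: str
    iu_id: str
    label_id: str
    assigned_by: str
    assigned_at: datetime
    provenance_ref: str
    retracted_at: Optional[datetime] = None  # None means not a retraction row


# Hypothetical example values; none of these labels exist in any registry.
entry = LabelDictionaryEntry(
    "lbl-topic-example", "topic", "example", "multi", "append_only"
)
row = LabelAssignment(
    assignment_id="asg-1",
    iu_id="iu-1",
    label_id=entry.label_id,
    assigned_by="human",
    assigned_at=datetime(2026, 5, 17, tzinfo=timezone.utc),
    provenance_ref="decision-example",
)
```

Frozen dataclasses are used here only to mirror the append-only intent at the object level; the actual storage and constraints remain a matter for the later schema-design + DDL cycle.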
3. Assignment model
assignment_model:
  storage: append-only label_assignment rows (no UPDATE/DELETE)
  retraction: insert a new row with retracted_at set; an assignment is current
    when its latest row for that (iu_id, label_id) is not a retraction
  cardinality: enforced by label_dictionary.cardinality_policy at write-time
    (single => at most one active assignment per (iu_id, label_namespace))
  provenance: every assignment records the rule/decision that produced it
  determinism: rule-produced labels must be reproducible from (iu_id, rule_ref, version)
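The storage, retraction, and cardinality rules above can be sketched in memory as follows. This is an illustration under stated assumptions, not the eventual implementation: `AssignmentLog` and `Row` are hypothetical names, a boolean `retracted` stands in for `retracted_at`, list append order stands in for `assigned_at`, and the dictionary argument is a stand-in for `label_dictionary` rows.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Row:
    iu_id: str
    label_id: str
    retracted: bool  # True stands in for retracted_at being set


class AssignmentLog:
    """In-memory stand-in for label_assignment: rows are appended, never changed."""

    def __init__(self, dictionary: Dict[str, Tuple[str, str]]) -> None:
        # dictionary: label_id -> (label_namespace, cardinality_policy)
        self._dictionary = dictionary
        self._rows: List[Row] = []

    @property
    def rows(self) -> Tuple[Row, ...]:
        return tuple(self._rows)  # read-only view of the full history

    def current(self, iu_id: str) -> Set[str]:
        """Labels whose latest row for this IU is not a retraction."""
        latest: Dict[str, Row] = {}
        for row in self._rows:  # append order doubles as time order
            if row.iu_id == iu_id:
                latest[row.label_id] = row
        return {lid for lid, row in latest.items() if not row.retracted}

    def assign(self, iu_id: str, label_id: str) -> None:
        namespace, cardinality = self._dictionary[label_id]
        if cardinality == "single":
            # single => at most one active assignment per (iu_id, namespace)
            for active in self.current(iu_id):
                if self._dictionary[active][0] == namespace:
                    raise ValueError(
                        f"{iu_id} already has active {active} in {namespace}"
                    )
        self._rows.append(Row(iu_id, label_id, retracted=False))

    def retract(self, iu_id: str, label_id: str) -> None:
        if label_id not in self.current(iu_id):
            raise ValueError(f"{label_id} is not active on {iu_id}")
        # Retraction is a new row; the original assignment row is kept.
        self._rows.append(Row(iu_id, label_id, retracted=True))
```

Replacing a single-cardinality label therefore takes two appended rows (a retraction, then the new assignment), and the earlier rows remain in `rows` for audit.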
4. Mutability & cardinality policy
mutability:
  immutable: identity-like labels (e.g. source_document_ref-derived) never change
  append_only: status/topic labels evolve by new rows, history preserved
  reassignable: only via retract+reassign, both rows kept (audit)
cardinality:
  single: e.g. authority_status (one active)
  multi: e.g. topic tags
policy lives in label_dictionary, NOT in runtime code
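One way to read the mutability policies as a write-time check is sketched below. It assumes that `immutable` permits only the first row for a given (IU, label) pair and that `append_only` and `reassignable` both permit further rows; the finer distinction between those two is left open by this design and is not modelled here. The function name `may_append`, the mapping, and the label ids are hypothetical.

```python
from typing import Dict

# Hypothetical stand-in for label_dictionary rows: label_id -> mutability_policy.
# In the design these values come from the registry, not from code.
MUTABILITY_BY_LABEL: Dict[str, str] = {
    "lineage.source_document_ref": "immutable",
    "topic.example": "append_only",
    "status.example": "reassignable",
}


def may_append(label_id: str, existing_rows: int, dictionary: Dict[str, str]) -> bool:
    """Return True if one more row may be appended for this (IU, label) pair.

    existing_rows is the number of rows already recorded for the pair.
    """
    policy = dictionary[label_id]  # KeyError if the label is not registered
    if policy == "immutable":
        # Assumption: the first assignment is allowed, nothing after it.
        return existing_rows == 0
    if policy in ("append_only", "reassignable"):
        # History grows by new rows; earlier rows are never touched.
        return True
    raise ValueError(f"unknown mutability_policy: {policy!r}")
```

The check itself is generic: it branches on the policy value read from the dictionary, never on a specific label, so adding a label does not require a code change.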
5. Hot-key promotion (JSONB → SQL/indexed)
promotion_flow:
  cold: new/sparse metadata keys live in JSONB metadata_value (no migration)
  observe: query frequency + latency tracked (read-only telemetry, design)
  propose: when a key crosses hot_threshold_hint, emit hot_key_promotion_ledger
    row (state=proposed); NO automatic DDL
  sovereign_gate: GPT/User approves promotion
  execute_later: a SEPARATE authorized index/DDL cycle promotes the key to a
    typed column or index; ledger.promoted_at set on success
invariants:
  - JSONB is never the hidden authority: a promoted key's SQL column/index becomes
    the queried path; JSONB retained only as cold mirror or dropped per policy
  - promotion is forward-only and audited; no silent schema drift
This directly satisfies the foundation-review requirement frequently_queried_keys_must_be_promoted_to_indexed_SQL_or_registry.
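The propose step of the flow above might look like the following sketch. It only builds proposal records in memory and performs no DDL. Several things here are assumptions: that `hot_threshold_hint` can be read as an integer query count, that "crosses" means greater than or equal, that the target storage is `promoted_index`, and the names `PromotionProposal` and `propose_promotions`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class PromotionProposal:
    metadata_key: str
    from_storage: str
    to_storage: str
    reason: str
    approved_by: Optional[str] = None   # filled in only by the sovereign gate
    promoted_at: Optional[str] = None   # stays None until a later authorized cycle


def propose_promotions(
    query_counts: Dict[str, int],    # read-only telemetry: key -> observed queries
    thresholds: Dict[str, int],      # from metadata_key_registry hints (assumed int)
    already_proposed: Set[str],      # keys that already have a ledger row
) -> List[PromotionProposal]:
    """Emit proposals for hot keys. Never approves and never runs DDL."""
    proposals: List[PromotionProposal] = []
    for key in sorted(query_counts):
        threshold = thresholds.get(key)
        if threshold is None or key in already_proposed:
            continue  # unregistered keys and duplicates are skipped
        count = query_counts[key]
        if count >= threshold:
            proposals.append(
                PromotionProposal(
                    metadata_key=key,
                    from_storage="jsonb",
                    to_storage="promoted_index",
                    reason=f"{count} queries observed, threshold {threshold}",
                )
            )
    return proposals
```

Every proposal leaves `approved_by` and `promoted_at` empty, which keeps the approval and execution steps outside this function, in line with the sovereign gate and the separate DDL cycle.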
6. Anti-hardcoding (binding)
no_hardcode:
  - label values: resolved via label_dictionary, never literal strings in runtime
  - metadata keys: resolved via metadata_key_registry, never literal keys in runtime
  - cardinality/mutability/index policy: registry columns, not code branches
  - promotion thresholds: registry hints + sovereign approval, not inline constants
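As an illustration of resolving labels through the registry instead of literals, the sketch below looks up a `label_id` by (namespace, key) and refuses anything unregistered or not active. The class name `LabelRegistry`, the row layout, and the rule that only `active` labels resolve are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple


class LabelRegistry:
    """Read-only view over label_dictionary rows, keyed by (namespace, key)."""

    def __init__(self, rows: Iterable[Tuple[str, str, str, str]]) -> None:
        # Each row: (label_id, label_namespace, label_key, lifecycle)
        self._by_name: Dict[Tuple[str, str], Tuple[str, str]] = {
            (namespace, key): (label_id, lifecycle)
            for label_id, namespace, key, lifecycle in rows
        }

    def resolve(self, namespace: str, key: str) -> str:
        """Return the label_id for an active registered label, else raise."""
        try:
            label_id, lifecycle = self._by_name[(namespace, key)]
        except KeyError:
            raise LookupError(f"label {namespace}/{key} is not registered") from None
        if lifecycle != "active":
            # Assumption: proposed and deprecated labels cannot be assigned.
            raise ValueError(f"label {namespace}/{key} is {lifecycle}, not active")
        return label_id
```

In this reading, runtime code receives the namespace and key from rule or decision data and passes them to `resolve`; an unknown label fails loudly instead of being silently created from a string.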
7. Constitution hardtest application
constitution_labels (illustrative, registry-driven, NOT created):
  - namespace=authority: enacted | controlled_draft (from ✅/📋 markers)
  - namespace=structure: nguyen_tac | kien_truc_section | dieu
  - namespace=lineage: source_document_ref, document_version_id (immutable)
note: these become rows in label_dictionary in a later cycle; here they only
  validate that the registry shape covers the hardtest's needs
8. Open decisions
open_decisions:
  OD-L1: label_assignment as table vs reuse append-only ledger pattern in cutter_governance
  OD-L2: metadata_value JSONB-on-IU vs separate EAV-style rows (or both, per key)
  OD-L3: cardinality enforcement mechanism (app-level vs partial unique index later)
  OD-L4: hot-key promotion threshold policy + telemetry source
  OD-L5: ownership schema (cutter_governance vs new schema) + role grants
9. Do not run yet
No registry table created, no label registered, no metadata key registered, no JSONB write, no promotion executed, no index DDL, no schema migration, no code change. Design only. Forbidden list = master plan §10.
10. Git
git: { branch: main, HEAD: e93424b5ff7fa5e4b8406131977ce4339cd0856a,
status_short_iu_cutter: clean, code_changed: false, commit_made: false }