05 — Scalable Governance Detection Architecture (2026-06-01)
Branch E. Design only — these views/models are proposed, not created. No DDL in this mission. Names are design handles.
5.1 Design constraints (from the user requirement)
- Scale to 10⁶–10⁸ objects without full-table scans in the UI.
- Memory-independent: rules live in PG/registry/DOT, never in agent recall.
- Incremental where possible (rescan only changed sources).
- Typed by object_type / source_model so a finding is attributable.
- Pivot/count-summarisable so Registries-Pivot shows coverage without loading rows (Điều 26 5-layer: L1 summary is cheap, L3 rows are on-demand).
- Feeds Registries-Pivot + system_issues (Điều 31) + event_outbox (Điều 45).
- No hardcoded per-object checks in UI (Điều 28 NT-D1).
Architectural reuse: this is the same shape as the live Điều 31 contract/runner model and the Registries-Pivot count-integrity views (v_registry_leaf_set → v_count_integrity → v_count_drift → pivot). The governance layer reuses that pipeline with a governance lens.
5.2 Six detection layers
L1 Governance source inventory (registry of registries — what sources exist)
│ feeds
L2 v_governed_object_candidates (every governed object, typed, from all sources)
│ resolve owner/links
L3 v_governance_coverage (each candidate + resolved link set + covered?)
│ filter missing links
L4 v_governance_orphans (only the gaps, typed + severity)
│ aggregate
L5 v_governance_coverage_summary (pivot: by type/source/owner/gap/severity)
│ route
L6 system_issues + event_outbox (issue per orphan; event per Đ45 register-before-emit)
Layer 1 — Governance source inventory
- Purpose: the registry of registries — the authoritative list of where governed objects live. Without it, "what is everything?" depends on memory (the exact failure mode to kill).
- Source: meta_catalog, collection_registry / directus_collections, table registry / information_schema.tables, pivot_definitions, dot_tools, label_rules, taxonomy_facets, event_type_registry, governance_registry, normative_registry, design_templates, approval_requests / apr_action_types, routes/API list (from Nuxt/nginx config or a route registry if/when one exists), workflow/task tables, and future object registries (added as rows, not code; Điều 26 §0-AU: "adding a row = INSERT, no code edits").
- Key columns: source_id, source_kind, object_type_produced, extraction_rule_ref, owner_resolution_rule_ref, last_scanned_at, row_estimate.
- Scale: small (tens to hundreds of sources). This table is the scale lever: adding a new object class = adding one inventory row, and the rest of the pipeline covers it automatically.
- Refresh: on source-set change (rare); a tier-A DOT verifies no live table/registry is missing from the inventory (itself an orphan-of-inventory check).
- Owner: GOV-COUNCIL (it is cross-system policy). Approval/gate: new inventory row = APR (medium).
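
The orphan-of-inventory check described in the Refresh bullet can be sketched as a set difference. This is a minimal illustration, not the proposed DOT: the `InventoryRow` class, the `missing_from_inventory` helper, and the sample rule references are hypothetical; only the field names come from the Key columns list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class InventoryRow:
    """One L1 inventory row; fields mirror the 'Key columns' list."""
    source_id: str
    source_kind: str
    object_type_produced: str
    extraction_rule_ref: str
    owner_resolution_rule_ref: str
    last_scanned_at: Optional[datetime] = None
    row_estimate: int = 0


def missing_from_inventory(live_sources: set[str],
                           inventory: list[InventoryRow]) -> list[str]:
    """Return live tables/registries that have no inventory row."""
    registered = {row.source_id for row in inventory}
    return sorted(live_sources - registered)


# Hypothetical sample data: two registered sources, three live ones.
inventory = [
    InventoryRow("meta_catalog", "registry", "catalog_entry",
                 "rule:extract/meta_catalog", "rule:owner/meta_catalog"),
    InventoryRow("dot_tools", "registry", "dot_tool",
                 "rule:extract/dot_tools", "rule:owner/dot_tools"),
]
live = {"meta_catalog", "dot_tools", "label_rules"}

print(missing_from_inventory(live, inventory))  # ['label_rules']
```

Any name returned here is a source that exists but that the pipeline cannot see, which is why the check belongs at L1 rather than downstream.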
Layer 2 — v_governed_object_candidates
- Purpose: normalize every governed object from every L1 source into one typed stream.
- Source: L1 inventory drives per-source extraction (UNION ALL of per-source SELECTs, each tagged with source_id).
- Key columns: governed_object_type, governed_object_ref, source_id, source_model (A/B/file/pg/registry), parent_ref, risk_class, lifecycle_status, born_at.
- Scale: this is the largest relation (≈ sum of all governed-object rows). Never materialised to the UI. Use:
  - incremental: a WHERE changed_since(last_scanned_at) predicate per source where the source has a date_updated / last_seen_at (most do);
  - partition by source_model / governed_object_type so policy objects (hundreds) are scanned every cycle while record-grade objects (millions) are sampled/aggregated.
- Refresh: incremental per source cadence (doc 11 §cadence).
- Owner: GOV-SIV. Gate: read-only view; no approval.
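
The combination of per-source extraction and the incremental predicate can be pictured as follows. This is a sketch in plain Python rather than the proposed view: `extract_changed` and `governed_object_candidates` are hypothetical names, the concatenation stands in for UNION ALL, and it assumes each source row carries an `id` and a `date_updated`.

```python
from datetime import datetime
from itertools import chain
from typing import Iterable, Iterator, Optional


def extract_changed(source_id: str, object_type: str, rows: Iterable[dict],
                    last_scanned_at: Optional[datetime]) -> Iterator[dict]:
    """Yield typed candidates for rows changed since the last scan.

    A source that has never been scanned (last_scanned_at is None)
    is extracted in full.
    """
    for row in rows:
        if last_scanned_at is None or row["date_updated"] > last_scanned_at:
            yield {
                "governed_object_type": object_type,
                "governed_object_ref": f"{source_id}:{row['id']}",
                "source_id": source_id,
            }


def governed_object_candidates(sources: list[dict]) -> Iterator[dict]:
    """Concatenate the per-source streams (the UNION ALL analogue)."""
    return chain.from_iterable(extract_changed(**s) for s in sources)


# Hypothetical sample data.
sources = [
    {"source_id": "dot_tools", "object_type": "dot_tool",
     "last_scanned_at": datetime(2026, 3, 1),
     "rows": [{"id": 1, "date_updated": datetime(2026, 1, 1)},
              {"id": 2, "date_updated": datetime(2026, 5, 1)}]},
    {"source_id": "label_rules", "object_type": "label_rule",
     "last_scanned_at": None,
     "rows": [{"id": 7, "date_updated": datetime(2025, 12, 1)}]},
]

refs = [c["governed_object_ref"] for c in governed_object_candidates(sources)]
print(refs)  # ['dot_tools:2', 'label_rules:7']
```

Because the stream is lazy, nothing is held in memory beyond the row being processed, which mirrors the requirement that L2 is never materialised to the UI.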
Layer 3 — v_governance_coverage
- Purpose: for each candidate, resolve the governance link set (doc 02 §2.2) and compute covered (doc 04 §4.3).
- Source: L2 ⋈ governance_relations ⋈ law_jurisdiction ⋈ governance_registry ⋈ approval_requests (exceptions) ⋈ dot_tools (dot_authority) ⋈ parent-inheritance walk.
- Key columns: governed_object_ref, governed_object_type, owner_gov_code, owner_path_kind (direct/relation/jurisdiction/exception/delegated/inherited), capability_ok, law_ref, approval_ref, audit_ref, rollback_ref, dot_authority_ref, covered (bool), required_links_missing (array).
- Scale critical: owner resolution uses scalar/EXISTS lookups, never fan-out joins (lesson from RP: a naive LEFT JOIN pivot_definitions ON source_object fanned 160→172 rows and broke the invariant; Điều 28 double-count via SQL). Inheritance walk is bounded-depth recursive (the RP tree walks 37 nodes depth-3 cycle-free; the bound prevents runaway on 10⁸).
- Refresh: follows L2.
- Owner: GOV-SIV. Gate: read-only.
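
Both points in the Scale bullet can be reproduced on toy data. The sketch below uses an in-memory SQLite database only so that it runs anywhere; the `candidates` and `owner_links` tables and the `resolve_inherited_owner` helper are hypothetical stand-ins, not the proposed PG relations.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidates (object_ref TEXT PRIMARY KEY);
    CREATE TABLE owner_links (object_ref TEXT, owner_gov_code TEXT);
    INSERT INTO candidates VALUES ('obj-1'), ('obj-2'), ('obj-3');
    -- obj-1 has two owner links, obj-3 has none
    INSERT INTO owner_links VALUES
        ('obj-1', 'GOV-SIV'), ('obj-1', 'GOV-MOUT'), ('obj-2', 'GOV-SIV');
""")

# Naive join: one output row per link, so obj-1 is counted twice.
fan_out = conn.execute("""
    SELECT count(*) FROM candidates c
    LEFT JOIN owner_links o ON o.object_ref = c.object_ref
""").fetchone()[0]

# EXISTS lookup: exactly one output row per candidate.
coverage = conn.execute("""
    SELECT c.object_ref,
           EXISTS (SELECT 1 FROM owner_links o
                   WHERE o.object_ref = c.object_ref) AS covered
    FROM candidates c ORDER BY c.object_ref
""").fetchall()

print(fan_out)   # 4, although there are only 3 candidates
print(coverage)  # [('obj-1', 1), ('obj-2', 1), ('obj-3', 0)]


def resolve_inherited_owner(ref: str, parent_of: dict[str, str],
                            owner_of: dict[str, str],
                            max_depth: int = 3) -> Optional[str]:
    """Walk up at most max_depth ancestors; the bound also stops cycles."""
    current: Optional[str] = ref
    for _ in range(max_depth + 1):
        if current in owner_of:
            return owner_of[current]
        current = parent_of.get(current)
        if current is None:
            return None
    return None
```

For example, `resolve_inherited_owner("leaf", {"leaf": "mid", "mid": "root"}, {"root": "GOV-COUNCIL"})` returns `"GOV-COUNCIL"`, while the same call with `max_depth=1` returns `None` because the owner sits two levels up.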
Layer 4 — v_governance_orphans
- Purpose: the gap set, i.e. only candidates where covered = false, each typed (doc 03 §3.2) and severity-graded (doc 03 §3.3).
- Source: SELECT … FROM v_governance_coverage WHERE NOT covered.
- Key columns: governed_object_ref, governed_object_type, gap_type, severity, source_id, source_model, owner_path_kind, detected_at, coalesce_key.
- coalesce_key: stable hash of (object_ref, gap_type); reuses the live system_issues.coalesce_key idempotency pattern so a persistent orphan does not create duplicate issues across scans (it bumps occurrence_count / last_seen_at).
- Scale: small relative to L2 (the goal is for this to trend to zero for truth-class objects). Capped row materialization (doc 11 §threshold): above N orphans of one type, materialise the summary + top-N exemplars, and log() the truncation; never silently cap.
- Owner: GOV-SIV. Gate: read-only.
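
The coalesce and capping behaviour can be sketched as below. Everything here is illustrative: the source does not say which hash the live coalesce_key uses, so SHA-256 is an assumption, as are the helper names, the in-memory `issues` dict standing in for system_issues, and the choice to take the first N rows as exemplars.

```python
import hashlib
import logging
from datetime import datetime

logger = logging.getLogger("governance.orphans")


def coalesce_key(object_ref: str, gap_type: str) -> str:
    """Stable key for (object_ref, gap_type). SHA-256 is an assumption."""
    raw = f"{object_ref}\x1f{gap_type}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def record_orphan(issues: dict, object_ref: str, gap_type: str,
                  seen_at: datetime) -> str:
    """Create the issue once; later scans only bump the counters."""
    key = coalesce_key(object_ref, gap_type)
    issue = issues.get(key)
    if issue is None:
        issues[key] = {"object_ref": object_ref, "gap_type": gap_type,
                       "occurrence_count": 1,
                       "first_seen_at": seen_at, "last_seen_at": seen_at}
    else:
        issue["occurrence_count"] += 1
        issue["last_seen_at"] = seen_at
    return key


def cap_exemplars(orphans: list, limit: int) -> dict:
    """Keep at most `limit` exemplars and log when rows are dropped."""
    if len(orphans) > limit:
        logger.warning("orphan exemplars truncated: kept %d of %d",
                       limit, len(orphans))
    return {"total": len(orphans), "exemplars": orphans[:limit]}


issues: dict = {}
first = record_orphan(issues, "obj-1", "no_owner", datetime(2026, 6, 1))
second = record_orphan(issues, "obj-1", "no_owner", datetime(2026, 6, 2))
print(first == second, len(issues), issues[first]["occurrence_count"])
# True 1 2
```

The total is always reported alongside the exemplars, so a capped result never understates how many orphans exist.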
Layer 5 — v_governance_coverage_summary
- Purpose: the pivot that Registries-Pivot renders, i.e. coverage by governed_object_type × source × owner × gap_type × severity, plus the four invariant terms (doc 04 §4.2) per scope.
- Source: v_governance_coverage + v_governance_orphans, grouped. Backed by pivot_definitions (Điều 26: counting is pivot_count() only; a grand-total bucket must be a constant-bucket VIEW, never an un-grouped count(*); RP PIV-500 lesson).
- Key columns: scope, group_values (jsonb), total_governed, covered, orphans, approved_exceptions, retired, coverage_pct, max_severity.
- Scale: L1 summary is a few thousand grouped rows max → cheap to render. This is what makes "reflect coverage without loading millions of rows" true (Điều 26 L1/L2 cheap, L3 on demand).
- Owner: GOV-SIV (health) for the numbers; GOV-MOUT (render) for display. Gate: read-only; pivot_definitions INSERT = APR.
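
To show the shape of the grouped rows the UI would receive, here is a small sketch. It is only an illustration of the output: in the design the counting itself stays in pivot_definitions / pivot_count(). The `coverage_summary` helper is hypothetical, coverage_pct is assumed to be covered / total_governed, and approved_exceptions, retired and max_severity are left out because their definitions live in doc 04 and doc 03.

```python
from collections import defaultdict


def coverage_summary(coverage_rows: list[dict],
                     group_by: list[str]) -> list[dict]:
    """Group coverage rows and count covered vs orphaned per bucket."""
    buckets = defaultdict(
        lambda: {"total_governed": 0, "covered": 0, "orphans": 0})
    for row in coverage_rows:
        bucket = buckets[tuple(row[col] for col in group_by)]
        bucket["total_governed"] += 1
        if row["covered"]:
            bucket["covered"] += 1
        else:
            bucket["orphans"] += 1

    summary = []
    for key in sorted(buckets):
        bucket = buckets[key]
        # Assumed formula: share of covered objects in the bucket.
        pct = round(100.0 * bucket["covered"] / bucket["total_governed"], 1)
        summary.append({"group_values": dict(zip(group_by, key)),
                        **bucket, "coverage_pct": pct})
    return summary


# Hypothetical sample data.
rows = [
    {"governed_object_type": "dot_tool", "covered": True},
    {"governed_object_type": "dot_tool", "covered": True},
    {"governed_object_type": "dot_tool", "covered": False},
    {"governed_object_type": "label_rule", "covered": False},
]
for line in coverage_summary(rows, ["governed_object_type"]):
    print(line)
```

Four input rows collapse into two summary rows here; the same ratio at scale is what keeps the rendered result in the low thousands of rows regardless of how large L2 grows.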
Layer 6 — Issue / event routing
- Purpose: turn orphans into governed signals.
- Source: v_governance_orphans → system_issues (one row per coalesce_key, severity-routed) → event_outbox (one event per registered event_type under Điều 45 §3.2 register-before-emit; signal-not-data, Điều 45 §4: the event carries governed_object_ref, never payload).
- Scale: issue creation is idempotent (coalesce); event emission is throttled (doc 11 §throttle) and aggregated (one governance_coverage_degraded summary event per scope per cycle, not one per orphan, for the high-volume case).
- Owner: GOV-SIV emits; routing types owned per Điều 45. Gate: event types must be registered first (doc 07); this is design, no emit happens in this mission.
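
The register-before-emit gate and the per-scope aggregation can be sketched together. This is an assumption-laden illustration, not the Điều 45 contract: `build_summary_events`, `UnregisteredEventType`, the `scope` field on orphans and the `cycle_id` parameter are all hypothetical, and the event body is limited to references in keeping with signal-not-data.

```python
class UnregisteredEventType(Exception):
    """Raised when an event type has not been registered before emit."""


def build_summary_events(orphans: list[dict],
                         registered_event_types: set[str],
                         cycle_id: str,
                         event_type: str = "governance_coverage_degraded",
                         ) -> list[dict]:
    """Build one summary event per scope per cycle, not one per orphan."""
    if event_type not in registered_event_types:
        raise UnregisteredEventType(event_type)
    scopes = sorted({orphan["scope"] for orphan in orphans})
    # Each event carries only references (scope, cycle), never row data.
    return [{"event_type": event_type, "scope": scope, "cycle_id": cycle_id}
            for scope in scopes]


# Hypothetical sample data: three orphans across two scopes.
orphans = [
    {"governed_object_ref": "dot_tools:2", "scope": "registry"},
    {"governed_object_ref": "dot_tools:9", "scope": "registry"},
    {"governed_object_ref": "routes:/api/x", "scope": "routes"},
]
events = build_summary_events(
    orphans, {"governance_coverage_degraded"}, cycle_id="cycle-001")
print(len(events))  # 2
```

Calling the function with an empty registry raises `UnregisteredEventType` before any event is built, so an unregistered type cannot reach the outbox by accident.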
5.3 Why this is not a per-object UI check (anti-Đ28-violation)
The UI (Registries-Pivot) renders only L5 (a pivot the backend computed), never L2/L3/L4 logic. There is no if governed_object_type == 'x' branch in Nuxt — the screen reads a registered design_template + a pivot_definitions-backed result (Điều 28 NT-D1/NT-D3, Điều 26 1C/1E). Drill-down from L5 → L4 exemplars → L3 object → its DB substrate is the same recursive contract as the RP drill-down (doc 09 / 04-dynamic-drilldown-layer-model.md).
5.4 Mapping to existing pipeline (reuse ledger)
| New layer | Reuses (live) |
|---|---|
| L1 inventory | meta_catalog (169) + check_registry_coverage + fn_birth_onboarding_full_scan |
| L2 candidates | RP v_registry_leaf_set shape (leaf-scoped, no meta double-count) |
| L3 coverage | RP v_count_integrity shape (scalar-subquery resolution) |
| L4 orphans | RP v_count_drift shape + system_issues.coalesce_key |
| L5 summary | pivot_definitions + refresh_pivot_results (statement-trigger) |
| L6 routing | system_issues + event_outbox + event_type_registry (Đ45) |
Cross-refs: doc 06 (the DOTs that run these views), doc 07 (the issue/event types L6 needs), doc 11 (scale/cadence/throttle/partition detail).