KB-7A27

11 — Scale Strategy for Governance Coverage (10^8 objects) (2026-06-01)

Revision 1

Tags: one-roof-governance, scale-strategy, incremental-scan, partitioning, pivot-summary, throttling, cadence, audit-retention, dieu26, dieu45, 2026-06-01

Branch K. This section describes how One-Roof governance coverage scales from millions to hundreds of millions of objects without full-table scans in the UI and without depending on any agent's memory (§11.13).

11.1 Source partitioning by object_type / source_model

The L1 source inventory (doc 05) partitions the universe by (governed_object_type, source_model). Each partition is scanned independently with its own cadence and its own owner-resolution rule.
Two cost tiers:
- Policy/authority tier: agencies, laws, DOTs, pivots, routes, policy tables, event types. Cardinality 10²–10⁴. Scanned fully on every cycle, because this is where truth/authority lives and where GOVERNANCE_COVERAGE_PASS must be exact.
- Record/substrate tier: registry rows, entities, labels. Cardinality 10⁶–10⁸. Never scanned row by row for governance; coverage is inherited from the parent registry/collection (doc 04 §4.4). A record is governance-covered iff its registry is covered, so 10⁸ records collapse to a few thousand registry-coverage checks.
This is the central scale lever: governance coverage is resolved at the owning-container grain, not the leaf-record grain. (The same leaf-vs-meta discipline that kept RP count-integrity correct.)
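The container-grain rule can be illustrated with a minimal sketch. The `Registry` class, the field names, and the reading of "covered" as "has a resolved owner" are assumptions made for the example, not the actual One-Roof schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registry:
    name: str
    owner: Optional[str]  # None = no owner resolved (assumed meaning of "not covered")


def covered_registries(registries):
    """Container-grain check: one verdict per registry, not per record."""
    return {r.name for r in registries if r.owner is not None}


def record_is_covered(parent_registry, covered):
    """A record inherits its verdict from its parent registry; no row scan."""
    return parent_registry in covered


# Hypothetical data: two registries, one without an owner.
registries = [Registry("entities", "team-a"), Registry("labels", None)]
covered = covered_registries(registries)
print(record_is_covered("entities", covered))  # True
print(record_is_covered("labels", covered))    # False
```

The cost of the check grows with the number of registries, not with the number of records they contain.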

11.2 Incremental scan

Each source that has a date_updated or last_seen_at column (most live tables have one) is scanned incrementally with a predicate of the form WHERE changed_since(last_scanned_at). Partitions with no changes are skipped, and their prior coverage verdict stands.
A full reconciliation runs on a slow cadence (e.g. weekly) to catch deletes/renames the incremental pass misses — the same reason Điều 31 keeps a periodic full audit alongside event-driven checks.
Event-driven option: subscribe to existing iu.*/piece.*/staging.*/mother.* events (Điều 45) so a governed object's birth/retire triggers a targeted re-scan of just that object — incremental at the object grain.
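A minimal sketch of the incremental selection step, assuming each partition exposes the timestamp of its newest change and the scanner keeps a per-partition `last_scanned_at`. The function name and the dictionary shapes are hypothetical.

```python
from datetime import datetime


def partitions_to_scan(newest_change, last_scanned_at):
    """Return partitions changed since their last scan (or never scanned).

    newest_change:   partition name -> newest date_updated seen in the source
    last_scanned_at: partition name -> time of the previous scan
    """
    due = []
    for name, changed_at in newest_change.items():
        previous = last_scanned_at.get(name)
        if previous is None or changed_at > previous:
            due.append(name)
    return sorted(due)


# Hypothetical state: "routes" is unchanged, "laws" was never scanned.
newest_change = {
    "agencies": datetime(2026, 6, 1),
    "routes": datetime(2026, 5, 1),
    "laws": datetime(2026, 5, 20),
}
last_scanned_at = {
    "agencies": datetime(2026, 5, 30),
    "routes": datetime(2026, 5, 30),
}
print(partitions_to_scan(newest_change, last_scanned_at))  # ['agencies', 'laws']
```

This selection cannot see rows that were deleted or renamed, which is why the slower full reconciliation is still needed.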

11.3 Scan cadence

| Tier | Cadence | Rationale |
| --- | --- | --- |
| policy/authority full scan | daily (cron) | small set; must be exact for the gate (reuse `DOT-GOV-LAW-HEALTH` "cron daily" pattern) |
| record/substrate inherited check | per registry change + weekly full | inheritance makes per-record scanning unnecessary |
| WATCHDOG self-audit | per cycle | Điều 31 Nguyên tắc 6 (Principle 6); silence = critical |
| full reconciliation | weekly | catch deletes/renames (reuse `DOT-GOV-CONFLICT` "cron weekly") |

11.4 Pivot summaries (the render contract)

Registries-Pivot renders only L5 v_governance_coverage_summary — a grouped pivot (pivot_definitions-backed, Điều 26). Grouped result cardinality is bounded (object_type × source × owner × gap_type × severity ≈ low thousands), independent of the 10⁸ population.
Grand-total / per-group counts are constant-bucket VIEWs, never un-grouped count(*) over the base population (RP PIV-500 engine finding: the no-group branch hardcodes count(*) and ignores the metric — a grand total must be a grouped constant bucket).
L1 SUMMARY + L2 GROUP LIST are CỨNG (hard: always present, cheap); L3 ENTITY LIST + L4 EXEMPLARS are MỀM (soft: loaded on drill-down only) (Điều 26 §II-QUATER).
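The grouped-summary and constant-bucket ideas can be sketched as follows. This is an in-memory illustration only; the real summary is a database view, and the row shape, the `object_count` field, and the `"ALL"` bucket label are assumptions.

```python
from collections import Counter


def summarize(rows, group_keys):
    """Group container-level coverage rows into a bounded pivot summary."""
    counts = Counter()
    for row in rows:
        counts[tuple(row[k] for k in group_keys)] += row["object_count"]
    return dict(counts)


def grand_total(rows):
    """Grand total as a grouped constant bucket.

    Every row is tagged with the same bucket value, so the total goes through
    the same grouped path as any other pivot instead of a separate
    un-grouped count.
    """
    tagged = [{**row, "bucket": "ALL"} for row in rows]
    return summarize(tagged, ["bucket"]).get(("ALL",), 0)


# Hypothetical container-level rows.
rows = [
    {"object_type": "registry", "gap_type": "OWNER_GAP", "object_count": 3},
    {"object_type": "registry", "gap_type": "OWNER_GAP", "object_count": 2},
    {"object_type": "pivot", "gap_type": "NONE", "object_count": 5},
]
summary = summarize(rows, ["object_type", "gap_type"])
total = grand_total(rows)
```

The number of keys in `summary` is bounded by the number of distinct group combinations, regardless of how many objects each row represents.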

11.5 Issue aggregation

One system_issues row per (object_ref, gap_type) via coalesce_key — persistent orphans bump occurrence_count/last_seen_at, they do not multiply rows.
High-cardinality gap classes (a source with millions of uncovered records) produce one aggregate issue (pivot_coverage_unowned for the source) + top-N exemplars, never one issue per record. (Lesson: template_gap already has 181,378 rows — governance must not replicate that volume; it summarizes at the container grain.)
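The coalescing behaviour can be sketched as an upsert keyed on (object_ref, gap_type). The key format and the dictionary standing in for system_issues are assumptions for illustration; the real coalesce_key derivation is not shown in this document.

```python
def record_issue(issues, object_ref, gap_type, seen_at):
    """Upsert one issue per (object_ref, gap_type); repeats bump a counter."""
    key = f"{object_ref}|{gap_type}"  # hypothetical coalesce_key format
    issue = issues.get(key)
    if issue is None:
        issue = {
            "object_ref": object_ref,
            "gap_type": gap_type,
            "occurrence_count": 1,
            "first_seen_at": seen_at,
            "last_seen_at": seen_at,
        }
        issues[key] = issue
    else:
        issue["occurrence_count"] += 1
        issue["last_seen_at"] = seen_at
    return issue


# A persistent orphan seen in three cycles still yields a single row.
issues = {}
for cycle in ("2026-06-01", "2026-06-02", "2026-06-03"):
    record_issue(issues, "registry:labels", "OWNER_GAP", cycle)
record_issue(issues, "registry:entities", "OWNER_GAP", "2026-06-03")
```

For a high-cardinality gap class, the `object_ref` passed in would be the source or container, so the issue count follows the number of containers rather than the number of records.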

11.6 Threshold for detailed row materialization

Below a governed threshold N (e.g. N=500, matching the live query_pg hard LIMIT and a sane UI page), L4 materializes actual orphan rows.
Above N, L4 materializes a summary plus top-N exemplars and calls log() to record the truncation, including the number of rows dropped. It never caps silently, because a silent cap reads as "all covered" when it is not. The summary count remains exact (from L5 aggregation); only the row enumeration is bounded.
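A minimal sketch of the threshold rule, assuming the exact count arrives from the summary layer and rows are fetched through a bounded callback. The function name, the `fetch_rows` callback, the `top_n` default, and the treatment of exactly N rows as "below" are all assumptions.

```python
import logging

logger = logging.getLogger("governance.l4")


def materialize_l4(exact_count, fetch_rows, n=500, top_n=10):
    """Return full rows at or below n; otherwise exemplars plus a logged truncation.

    exact_count: exact orphan count taken from the summary aggregation
    fetch_rows:  callable(limit) -> list of at most `limit` orphan rows
    """
    if exact_count <= n:
        return {"count": exact_count, "rows": fetch_rows(exact_count),
                "truncated": False, "dropped": 0}
    rows = fetch_rows(top_n)
    dropped = exact_count - len(rows)
    # The truncation is recorded explicitly so a bounded list is never
    # mistaken for the complete set.
    logger.warning("L4 truncated: count=%d kept=%d dropped=%d",
                   exact_count, len(rows), dropped)
    return {"count": exact_count, "rows": rows,
            "truncated": True, "dropped": dropped}


# Hypothetical orphan population of 1200 objects.
orphans = list(range(1200))
large = materialize_l4(len(orphans), lambda limit: orphans[:limit])
small = materialize_l4(3, lambda limit: orphans[:limit])
```

The returned `count` is always the exact figure, so the UI can show "10 of 1200" rather than implying the list is complete.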

11.7 Top-level summary vs drill-down

Default view = summary only (L5). No base-population rows loaded.
Drill-down on demand: summary cell → L4 exemplars (bounded) → L3 single object → its DB substrate. Each step is a separate bounded query (the proven RP recursive drill contract). The UI never holds more than one page of rows.

11.8 Avoiding full table scans in the UI

The UI issues no aggregate query over base tables — it reads pre-computed L5 (refreshed by the scanner DOT via statement-trigger refresh_pivot_results, as RP already does). UI → API → pivot_results/summary view; never UI → base table count.
This also enforces Điều 28 NT-D1 (no business logic / no DB query truth in Nuxt) — the UI literally cannot scan.

11.9 Stale scan handling

Every summary row carries last_scanned_at. If a partition's scan is older than k×cadence (Điều 45 §15.4: >3× = warning, >10× = critical), the UI shows the data as STALE and the WATCHDOG raises governance_scan_stale. Stale ≠ covered: a stale partition cannot satisfy the production gate.
silent_gap_is_a_health_violation (Điều 45 §15.5) — a partition that simply stopped being scanned is a named, attributable fault, not invisible.
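The staleness thresholds can be expressed as a small classifier. This is a sketch under the thresholds quoted above (more than 3× cadence is a warning, more than 10× is critical); the function names and the "fresh" label are hypothetical.

```python
from datetime import datetime, timedelta


def staleness(last_scanned_at, cadence, now):
    """Classify a partition by scan age relative to its cadence."""
    age = now - last_scanned_at
    if age > 10 * cadence:
        return "critical"
    if age > 3 * cadence:
        return "warning"
    return "fresh"


def satisfies_gate(is_covered, status):
    """Stale is not covered: only a fresh, covered partition passes."""
    return is_covered and status == "fresh"


now = datetime(2026, 6, 1)
daily = timedelta(days=1)
print(staleness(now - timedelta(days=2), daily, now))   # fresh
print(staleness(now - timedelta(days=4), daily, now))   # warning
print(staleness(now - timedelta(days=11), daily, now))  # critical
```

A partition whose last verdict was "covered" still fails `satisfies_gate` once its scan ages past the warning threshold.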

11.10 Audit retention

Coverage scan runs and assignment applications are written append-only to vps_deploy_log / governance_audit_log / system_issues (Luật Bảo toàn, the Conservation Law: retire, don't delete).
Retention policy: keep per-run summaries indefinitely (small); prune raw per-object exemplar snapshots after a governed window (they are reconstructible from a re-scan). Resolved issues are archived (issue_archived), not deleted.

11.11 Notification throttling

Per Điều 45 §15.4 thresholds and doc 07 §7.4, routing depends on severity: info sends no notification; warning goes into a batched digest; high goes to a per-owner queue; critical is sent immediately. A mass-orphan event emits one summary event per scope per cycle (governance_coverage_degraded), not one per orphan. Notification fan-out goes through event_subscription (3 live rows), so events reach subscribers only and are not broadcast.
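The routing and the per-scope collapse can be sketched together. The channel names, the `scope` field on each finding, and the event payload shape are assumptions; only the severity levels and the governance_coverage_degraded event name come from the text above.

```python
# Severity -> delivery channel; None means no notification is sent.
ROUTES = {
    "info": None,
    "warning": "digest",
    "high": "owner_queue",
    "critical": "immediate",
}


def route(severity):
    """Look up the delivery channel for a severity level."""
    return ROUTES[severity]


def summary_events(orphans, cycle_id):
    """Collapse per-orphan findings into one event per scope for this cycle."""
    by_scope = {}
    for orphan in orphans:
        by_scope[orphan["scope"]] = by_scope.get(orphan["scope"], 0) + 1
    return [
        {"event": "governance_coverage_degraded", "scope": scope,
         "cycle": cycle_id, "orphan_count": count}
        for scope, count in sorted(by_scope.items())
    ]


# Hypothetical mass-orphan cycle: 1000 findings in one scope, 3 in another.
findings = [{"scope": "labels"}] * 1000 + [{"scope": "entities"}] * 3
events = summary_events(findings, cycle_id="2026-06-01")
print(len(events))  # 2
```

The number of emitted events follows the number of affected scopes, so a flood of orphans in one source produces a single notification carrying the count.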

11.12 How Registries-Pivot reflects coverage without loading millions of rows

RP loads exactly one L5 summary payload (low-thousands grouped rows, often far fewer) plus drill-down pages on demand. The coverage percentage, orphan counts, and gap breakdown are all in the summary. The 10⁸ base population is never transferred to the browser, in the same way that RP already serves a 1.73 MB page from pivot summaries rather than raw rows.

11.13 How future object types are auto-covered (the memory-independence guarantee)

A new object class is covered by adding one L1 source-inventory row (source_kind, object_type_produced, extraction_rule_ref, owner_resolution_rule_ref). This is data, not code (Điều 26 §0-AU: "thêm dòng = INSERT, không sửa code", i.e. "adding a row = INSERT, no code change").
From that row, L2 extracts it, L3 resolves coverage, L4 surfaces gaps, L5 pivots it, L6 routes issues — with no edit to the pipeline, the UI, or any agent's memory. The first instance of a new class with no owner is immediately a visible OWNER_GAP.
This is the concrete mechanism behind the user requirement "future expansion must be automatically covered" and "detect anything outside governance automatically, like orphan detection." The detector does not need to be told a new thing exists; it discovers it from the source inventory and demands its owner.
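A minimal sketch of the inventory-driven scan, assuming rules are looked up by reference and shared across object classes. The `source_name` field, the rule names, and the in-memory `TABLES` and `OWNERS` stand-ins are hypothetical; the other row fields are the ones listed above.

```python
def scan(inventory, extractors, owner_rules):
    """Walk every inventory row; objects with no resolved owner become gaps."""
    gaps = []
    for row in inventory:
        extract = extractors[row["extraction_rule_ref"]]
        resolve_owner = owner_rules[row["owner_resolution_rule_ref"]]
        for obj in extract(row):
            if resolve_owner(obj) is None:
                gaps.append((row["object_type_produced"], obj, "OWNER_GAP"))
    return gaps


# Stand-ins for live sources and owner assignments (hypothetical data).
TABLES = {"agencies": ["agency:1"], "widgets": ["widget:9"]}
OWNERS = {"agency:1": "team-a"}

# Generic rules, referenced by name from inventory rows.
extractors = {"all_rows": lambda row: TABLES[row["source_name"]]}
owner_rules = {"direct_owner": lambda obj: OWNERS.get(obj)}

inventory = [
    {"source_kind": "table", "source_name": "agencies",
     "object_type_produced": "agency",
     "extraction_rule_ref": "all_rows",
     "owner_resolution_rule_ref": "direct_owner"},
]
before = scan(inventory, extractors, owner_rules)

# Covering a new class is one more inventory row; scan() is unchanged.
inventory.append(
    {"source_kind": "table", "source_name": "widgets",
     "object_type_produced": "widget",
     "extraction_rule_ref": "all_rows",
     "owner_resolution_rule_ref": "direct_owner"},
)
after = scan(inventory, extractors, owner_rules)
```

In this sketch the new row reuses existing rules, so the first ownerless `widget` surfaces as an OWNER_GAP without any change to `scan`.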

11.14 Scale failure modes & guards

| Failure | Guard |
| --- | --- |
| fan-out join explodes cardinality | scalar/EXISTS resolution only (doc 05 §L3); RP 160→172 lesson |
| recursive inheritance cycles | bounded-depth, cycle-detection (RP walk depth-3 cycle-free) |
| event flood | aggregate to summary event; throttle (§11.11) |
| silent cap hides gaps | `log()` truncation; exact summary count (§11.6) |
| scanner dies silently | WATCHDOG + `governance_scan_stale` (§11.9) |
| UI scans base tables | UI reads L5 only; Điều 28 NT-D1 (§11.8) |

Cross-refs: doc 05 (the layers this scales), doc 06 (scanner cadence), doc 07 (aggregation/throttle), doc 04 (inheritance).