KB-7A27
11 — Scale Strategy for Governance Coverage (10^8 objects) (2026-06-01)
9 min read Revision 1
Branch K. How One-Roof governance coverage scales from millions to hundreds of millions of objects without full-table scans in the UI and without depending on any agent's memory.
11.1 Source partitioning by object_type / source_model
- The L1 source inventory (doc 05) partitions the universe by `(governed_object_type, source_model)`. Each partition is scanned independently with its own cadence and its own owner-resolution rule.
- Two cost tiers:
  - Policy/authority tier: agencies, laws, DOTs, pivots, routes, policy tables, event types. Cardinality 10²–10⁴. Scanned fully, every cycle. This is where truth/authority lives and where `GOVERNANCE_COVERAGE_PASS` must be exact.
  - Record/substrate tier: registry rows, entities, labels (10⁶–10⁸). Never scanned row-by-row for governance; coverage inherits from the parent registry/collection (doc 04 §4.4). A record is governance-covered iff its registry is covered. So 10⁸ records collapse to a few thousand registry-coverage checks.
- This is the central scale lever: governance coverage is resolved at the owning-container grain, not the leaf-record grain. (The same leaf-vs-meta discipline that kept RP count-integrity correct.)
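The container-grain rule above can be sketched as follows. This is a minimal illustration, not the scanner's actual code: the `Registry` type, its fields, and the "owner present means covered" test are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registry:
    name: str
    owner: Optional[str]  # assumption: None means no owner resolved, so uncovered
    record_count: int     # number of leaf records the registry holds


def coverage_by_container(registries: list[Registry]) -> dict[str, int]:
    """Resolve coverage once per registry; leaf records inherit the verdict."""
    covered = sum(r.record_count for r in registries if r.owner is not None)
    total = sum(r.record_count for r in registries)
    return {
        "checks": len(registries),  # work scales with registries, not records
        "records_total": total,
        "records_covered": covered,
    }


registries = [
    Registry("entities", "agency-a", 40_000_000),
    Registry("labels", None, 60_000_000),
]
summary = coverage_by_container(registries)
print(summary)
```

Two checks settle the coverage of 10⁸ records here; no leaf row is ever read.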
11.2 Incremental scan
- Each source with a `date_updated`/`last_seen_at` (most live tables have one) is scanned with `WHERE changed_since(last_scanned_at)`. Unchanged partitions are skipped; their prior coverage verdict stands.
- A full reconciliation runs on a slow cadence (e.g. weekly) to catch deletes/renames the incremental pass misses. This is the same reason Điều 31 keeps a periodic full audit alongside event-driven checks.
- Event-driven option: subscribe to existing `iu.*`/`piece.*`/`staging.*`/`mother.*` events (Điều 45) so a governed object's birth/retire triggers a targeted re-scan of just that object, i.e. incremental at the object grain.
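The skip-if-unchanged decision can be sketched as below. The partition dictionaries and the `max_changed_at` field (the latest `date_updated`/`last_seen_at` in a partition) are hypothetical names for illustration; a real scanner would compute them in SQL.

```python
from datetime import datetime, timezone
from typing import Optional


def needs_rescan(max_changed_at: datetime, last_scanned_at: Optional[datetime]) -> bool:
    """Re-scan a partition if it was never scanned or changed since the last scan."""
    return last_scanned_at is None or max_changed_at > last_scanned_at


def plan_scan(partitions: list[dict]) -> tuple[list[str], list[str]]:
    """Split partition keys into (rescan, skipped); skipped keep their prior verdict."""
    rescan, skipped = [], []
    for p in partitions:
        target = rescan if needs_rescan(p["max_changed_at"], p["last_scanned_at"]) else skipped
        target.append(p["key"])
    return rescan, skipped


def utc(day: int) -> datetime:
    return datetime(2026, 6, day, tzinfo=timezone.utc)


partitions = [
    {"key": "law/laws", "max_changed_at": utc(3), "last_scanned_at": utc(2)},
    {"key": "route/routes", "max_changed_at": utc(1), "last_scanned_at": utc(2)},
    {"key": "agency/agencies", "max_changed_at": utc(1), "last_scanned_at": None},
]
rescan, skipped = plan_scan(partitions)
print(rescan, skipped)
```

A timestamp comparison cannot see a deleted or renamed row, which is why the slow full reconciliation remains necessary.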
11.3 Scan cadence
| Tier | Cadence | Rationale |
|---|---|---|
| policy/authority full scan | daily (cron) | small set; must be exact for the gate (reuse DOT-GOV-LAW-HEALTH "cron daily" pattern) |
| record/substrate inherited check | per registry change + weekly full | inheritance makes per-record scanning unnecessary |
| WATCHDOG self-audit | per cycle | Điều 31 Principle 6 (Nguyên tắc 6); silence = critical |
| full reconciliation | weekly | catch deletes/renames (reuse DOT-GOV-CONFLICT "cron weekly") |
11.4 Pivot summaries (the render contract)
- Registries-Pivot renders only L5 `v_governance_coverage_summary`, a grouped pivot (pivot_definitions-backed, Điều 26). Grouped result cardinality is bounded (object_type × source × owner × gap_type × severity ≈ low thousands), independent of the 10⁸ population.
- Grand-total / per-group counts are constant-bucket VIEWs, never un-grouped `count(*)` over the base population (RP PIV-500 engine finding: the no-group branch hardcodes `count(*)` and ignores the metric, so a grand total must be a grouped constant bucket).
- L1 SUMMARY + L2 GROUP LIST are HARD (CỨNG: always present, cheap); L3 ENTITY LIST + L4 EXEMPLARS are SOFT (MỀM: loaded on drill-down only) (Điều 26 §II-QUATER).
11.5 Issue aggregation
- One `system_issues` row per `(object_ref, gap_type)` via `coalesce_key`: persistent orphans bump `occurrence_count`/`last_seen_at`, they do not multiply rows.
- High-cardinality gap classes (a source with millions of uncovered records) produce one aggregate issue (`pivot_coverage_unowned` for the source) + top-N exemplars, never one issue per record. (Lesson: `template_gap` already has 181,378 rows. Governance must not replicate that volume; it summarizes at the container grain.)
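The coalescing behaviour can be sketched with an in-memory dictionary standing in for `system_issues`. The key format (`object_ref|gap_type`) and the `first_seen_at` field are assumptions for illustration; the source only states that `coalesce_key` identifies the `(object_ref, gap_type)` pair.

```python
from datetime import datetime, timezone


def upsert_issue(issues: dict, object_ref: str, gap_type: str, seen_at: datetime) -> dict:
    """Insert an issue row on first sighting, otherwise bump the existing row."""
    key = f"{object_ref}|{gap_type}"  # hypothetical coalesce_key format
    row = issues.get(key)
    if row is None:
        row = {
            "coalesce_key": key,
            "occurrence_count": 0,
            "first_seen_at": seen_at,
            "last_seen_at": seen_at,
        }
        issues[key] = row
    row["occurrence_count"] += 1
    row["last_seen_at"] = seen_at
    return row


issues: dict = {}
for day in (1, 2, 3):
    # The same orphan is detected on three consecutive scan cycles.
    upsert_issue(issues, "registry:labels", "OWNER_GAP",
                 datetime(2026, 6, day, tzinfo=timezone.utc))

print(len(issues), issues["registry:labels|OWNER_GAP"]["occurrence_count"])
```

Three detections leave one row with `occurrence_count` of 3, so the table grows with the number of distinct gaps rather than with the number of scan cycles.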
11.6 Threshold for detailed row materialization
- Below a governed threshold `N` (e.g. N=500, matching the live `query_pg` hard LIMIT and a sane UI page), L4 materializes actual orphan rows.
- Above `N`, L4 materializes summary + top-N exemplars and `log()`s the truncation (count dropped). It never silently caps (a silent cap reads as "all covered" when it isn't). The summary count remains exact (from L5 aggregation); only the row enumeration is bounded.
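A minimal sketch of the threshold rule, using Python's standard `logging` module in place of the document's `log()`. The function name, the `TOP_N` value, and the choice to treat a count of exactly `N` as "materialize everything" are assumptions; the source only distinguishes below and above `N`.

```python
import logging

logger = logging.getLogger("governance.l4")

N = 500      # governed threshold from the text (e.g. N=500)
TOP_N = 20   # hypothetical exemplar count


def materialize_l4(orphans: list[str], exact_count: int,
                   n: int = N, top_n: int = TOP_N) -> dict:
    """Return full rows at or below n; otherwise exemplars plus a logged truncation."""
    if exact_count <= n:
        return {"count": exact_count, "rows": list(orphans), "truncated": False}
    exemplars = orphans[:top_n]
    dropped = exact_count - len(exemplars)
    # The truncation is logged, never silent; the exact count is preserved.
    logger.warning("L4 truncated: enumerating %d of %d orphans (%d dropped)",
                   len(exemplars), exact_count, dropped)
    return {"count": exact_count, "rows": exemplars, "truncated": True, "dropped": dropped}


small = materialize_l4(["a", "b", "c"], 3)
big = materialize_l4([f"orphan-{i}" for i in range(1000)], 1000)
print(small["truncated"], big["count"], len(big["rows"]), big["dropped"])
```

The returned `count` is always the exact figure, so a caller cannot mistake the bounded row list for the full population.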
11.7 Top-level summary vs drill-down
- Default view = summary only (L5). No base-population rows loaded.
- Drill-down on demand: summary cell → L4 exemplars (bounded) → L3 single object → its DB substrate. Each step is a separate bounded query (the proven RP recursive drill contract). The UI never holds more than one page of rows.
11.8 Avoiding full table scans in the UI
- The UI issues no aggregate query over base tables. It reads pre-computed L5 (refreshed by the scanner DOT via statement-trigger `refresh_pivot_results`, as RP already does). UI → API → `pivot_results`/summary view; never UI → base table count.
- This also enforces Điều 28 NT-D1 (no business logic / no DB query truth in Nuxt): the UI literally cannot scan.
11.9 Stale scan handling
- Every summary row carries `last_scanned_at`. If a partition's scan is older than `k×cadence` (Điều 45 §15.4: >3× = warning, >10× = critical), the UI shows the data as STALE and the WATCHDOG raises `governance_scan_stale`. Stale ≠ covered: a stale partition cannot satisfy the production gate.
- `silent_gap_is_a_health_violation` (Điều 45 §15.5): a partition that simply stopped being scanned is a named, attributable fault, not invisible.
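The staleness thresholds can be sketched as a small classifier. The function names are hypothetical, and treating both the warning and critical levels as "stale" for gate purposes is an assumption drawn from the "Stale ≠ covered" rule rather than something the source spells out per level.

```python
from datetime import datetime, timedelta, timezone


def scan_freshness(last_scanned_at: datetime, cadence: timedelta, now: datetime) -> str:
    """Classify a partition's scan age against the 3x and 10x cadence thresholds."""
    age = now - last_scanned_at
    if age > 10 * cadence:
        return "critical"
    if age > 3 * cadence:
        return "warning"
    return "fresh"


def can_satisfy_gate(status: str) -> bool:
    """Assumption: any stale status (warning or critical) blocks the production gate."""
    return status == "fresh"


now = datetime(2026, 6, 20, tzinfo=timezone.utc)
daily = timedelta(days=1)

statuses = [
    scan_freshness(now - timedelta(days=2), daily, now),
    scan_freshness(now - timedelta(days=5), daily, now),
    scan_freshness(now - timedelta(days=12), daily, now),
]
print(statuses, [can_satisfy_gate(s) for s in statuses])
```

With a daily cadence, a scan two days old is still fresh, five days old is a warning, and twelve days old is critical.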
11.10 Audit retention
- Coverage scan runs and assignment applies are append-only to `vps_deploy_log`/`governance_audit_log`/`system_issues` (Luật Bảo toàn, the conservation law: retire, don't delete).
- Retention policy: keep per-run summaries indefinitely (small); prune raw per-object exemplar snapshots after a governed window (they are reconstructible from a re-scan). Resolved issues are archived (`issue_archived`), not deleted.
11.11 Notification throttling
- Per Điều 45 §15.4 thresholds and doc 07 §7.4: `info` no notify; `warning` batched digest; `high` per-owner queue; `critical` immediate. A mass-orphan event emits one summary event per scope per cycle (`governance_coverage_degraded`), not per orphan. Notification fan-out goes via `event_subscription` (3 live rows): subscribers, not broadcast.
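The throttling rule can be sketched as a per-cycle reducer. The channel names, the `(scope, severity)` finding shape, and the choice to route a scope by its worst severity are assumptions for illustration; only the four severity levels and the event name `governance_coverage_degraded` come from the text.

```python
SEVERITY_ORDER = ["info", "warning", "high", "critical"]

# Hypothetical channel names for the four routing behaviours described above.
CHANNEL = {
    "info": None,             # no notification
    "warning": "digest",      # batched digest
    "high": "owner_queue",    # per-owner queue
    "critical": "immediate",  # immediate notification
}


def summarize_cycle(findings: list[tuple[str, str]]) -> list[dict]:
    """Collapse (scope, severity) findings into one summary event per scope."""
    by_scope: dict[str, dict] = {}
    for scope, severity in findings:
        entry = by_scope.setdefault(scope, {"count": 0, "severity": "info"})
        entry["count"] += 1
        # Assumption: a scope is routed by the worst severity seen in the cycle.
        if SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(entry["severity"]):
            entry["severity"] = severity
    events = []
    for scope, entry in sorted(by_scope.items()):
        channel = CHANNEL[entry["severity"]]
        if channel is None:
            continue  # info-only scopes notify nobody
        events.append({
            "event": "governance_coverage_degraded",
            "scope": scope,
            "orphans": entry["count"],
            "severity": entry["severity"],
            "channel": channel,
        })
    return events


findings = (
    [("registry:labels", "warning")] * 1000
    + [("registry:labels", "critical")]
    + [("registry:entities", "info")] * 5
)
events = summarize_cycle(findings)
print(events)
```

In this run, 1,006 findings produce a single event: one immediate notification for `registry:labels` carrying the orphan count, and nothing for the info-only scope.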
11.12 How Registries-Pivot reflects coverage without loading millions of rows
- RP loads exactly one L5 summary payload (low-thousands grouped rows, often far fewer) + drill-down pages on demand. The coverage percentage, orphan counts, and gap breakdown are all in the summary. The 10⁸ base population is never transferred to the browser — the same way RP already serves a 1.73MB page from pivot summaries, not raw rows.
11.13 How future object types are auto-covered (the memory-independence guarantee)
- A new object class is covered by adding one L1 source-inventory row (`source_kind`, `object_type_produced`, `extraction_rule_ref`, `owner_resolution_rule_ref`). This is data, not code (Điều 26 §0-AU "thêm dòng = INSERT, không sửa code", i.e. "adding a row = INSERT, no code edit").
- From that row, L2 extracts it, L3 resolves coverage, L4 surfaces gaps, L5 pivots it, L6 routes issues, with no edit to the pipeline, the UI, or any agent's memory. The first instance of a new class with no owner is immediately a visible `OWNER_GAP`.
- This is the concrete mechanism behind the user requirement "future expansion must be automatically covered" and "detect anything outside governance automatically, like orphan detection." The detector does not need to be told a new thing exists; it discovers it from the source inventory and demands its owner.
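The "data, not code" mechanism can be sketched as a scan loop driven entirely by inventory rows. Everything except the four column names and `OWNER_GAP` is hypothetical: the rule registries, the rule names, the in-memory `sources`, and keying sources by `source_kind` are simplifications for the example.

```python
# Hypothetical generic rules, looked up by the refs stored in each inventory row.
EXTRACTION_RULES = {"all_rows": lambda rows: list(rows)}
OWNER_RULES = {"owner_field": lambda obj: obj.get("owner")}


def scan(inventory: list[dict], sources: dict[str, list[dict]]) -> list[tuple]:
    """Walk every inventory row and report objects whose owner cannot be resolved."""
    gaps = []
    for row in inventory:
        extract = EXTRACTION_RULES[row["extraction_rule_ref"]]
        resolve_owner = OWNER_RULES[row["owner_resolution_rule_ref"]]
        for obj in extract(sources[row["source_kind"]]):
            if resolve_owner(obj) is None:
                gaps.append(("OWNER_GAP", row["object_type_produced"], obj["id"]))
    return gaps


sources = {
    "laws_table": [{"id": "law-1", "owner": "agency-a"}],
    "widgets_table": [{"id": "widget-1", "owner": None}],
}
inventory = [
    {"source_kind": "laws_table", "object_type_produced": "law",
     "extraction_rule_ref": "all_rows", "owner_resolution_rule_ref": "owner_field"},
]
before = scan(inventory, sources)

# Covering a new object class is a data change: one more inventory row.
inventory.append(
    {"source_kind": "widgets_table", "object_type_produced": "widget",
     "extraction_rule_ref": "all_rows", "owner_resolution_rule_ref": "owner_field"}
)
after = scan(inventory, sources)
print(before, after)
```

The `scan` function is unchanged between the two calls; only the inventory grew, and the ownerless new object surfaces as a gap on the next pass.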
11.14 Scale failure modes & guards
| Failure | Guard |
|---|---|
| fan-out join explodes cardinality | scalar/EXISTS resolution only (doc 05 §L3); RP 160→172 lesson |
| recursive inheritance cycles | bounded-depth, cycle-detection (RP walk depth-3 cycle-free) |
| event flood | aggregate to summary event; throttle (§11.11) |
| silent cap hides gaps | log() truncation; exact summary count (§11.6) |
| scanner dies silently | WATCHDOG + governance_scan_stale (§11.9) |
| UI scans base tables | UI reads L5 only; Điều 28 NT-D1 (§11.8) |
Cross-refs: doc 05 (the layers this scales), doc 06 (scanner cadence), doc 07 (aggregation/throttle), doc 04 (inheritance).