34 — Incremental Candidate Scan / Dirty-Group Model Design (Branch D, no "checked forever", snapshot+ruleset-keyed, fail-closed prod gate, design-only, read-only zero mutation, 2026-06-01)
Path: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Doc: 34.
Track: Branch D of the Backfill / Handoff / Input-Control addendum: the candidate layer itself. Builds on docs 31 (backfill), 32 (handoff), 33 (input gate), 24/25 (T7/T6), concept canon 01–02, GPT directions (incremental candidate scan + dirty-group invalidation).
Status: DESIGN ONLY. APPLY/BUILD NO-GO. No PG/Directus/Qdrant/Nuxt mutation; no table/view/function/trigger; no DOT/event registration; no emit; no approval/enactment. This doc specifies the candidate-state model and scan, not a running scan.
Owner (proposed): GOV-SIV runs the candidate scan (read/detect/verdict); it never self-applies. The candidate verdict gates whether the T6 coverage detector (doc 25) does owner/profile work on an object.
Evidence base: live read-only re-verified 2026-06-01 (derived_objects_registry=7 with refresh_strategy ∈ {realtime_trigger, on_demand, null}, stale_after, recompute_status, depends_on_collections[]; iu_route_worker_cursor=1; evolution_snapshots=1; system_issues=190,288 with coalesce_key/run_id/evidence_snapshot/business_logic_hash).
0. §0-GOV declaration
§0-GOV Governance Coverage Declaration — Branch D (Candidate Scan / Dirty-Group)
governed_objects: [ candidate_state, candidate_scan_run, group_key registry ] (Class-2 process records)
owner_per_scope: { policy: GOV-COUNCIL, health: GOV-SIV, execution: GOV-DOT,
render: GOV-MOUT(TTL), approval: Điều32, audit: GOV-SIV }
coverage_profile: [ scan-state profile — owner, audit, rollback, heartbeat, issue-event, stale-TTL ]
axes_introduced: [ none — consumes Axis Registry / group_key dimensions ]
detection_path: candidate-state store keyed by (candidate_key, source_snapshot, ruleset_version)
issue_event_types: [ candidate_stale, candidate_unknown, candidate_scan_lag,
group_invalidation_storm ] (register-before-emit, Điều 45 — NOT registered)
exceptions: [ none minted ]
1. The principle (GPT direction, verbatim intent)
"Do not store 'object already checked' as a permanent truth. Store scan results by object/group + source snapshot + ruleset version + time. When the group, source registry, policy, axis, owner, approval, or ruleset changes, invalidate/rescan the affected group."
Branch D is the candidate layer between Registry and Coverage:
Birth → Registry → [ Candidate Layer (this doc): detect possible governance relevance,
keyed by snapshot+ruleset, invalidated by dirty group/source ]
→ Coverage Layer (T6 L3–L6): check owner/approval/audit/rollback/…
→ Production Gate: block if coverage required but missing/stale/unknown
It exists so the T6 scanner stops re-deriving the full 1.04M-object set every pass (doc 25 L1) and instead does expensive owner/profile work only on the dirty and expired candidate set — while never letting a once-clean verdict become a silent stale truth.
2. The candidate-state store — never "checked forever"
The core record (proposed, design-only, table ABSENT — new blocker SB-10). It is modeled on the live derived_objects_registry dirty/stale machinery (the system already does exactly this for derived objects) and on system_issues' fingerprint columns — reuse-first, not a novel scheme:
governance_candidate_state (proposed — design only; no DDL here)
candidate_key text -- canonical_address | (object_type,object_ref) ; idempotency key (doc 31 §6)
group_key text -- the dirtying/rescan unit (see §3)
object_type text
object_ref text
source_snapshot_ref int -- FK → evolution_snapshots (doc 31 §5)
ruleset_version text -- hash over enabled detectors+profiles+axes+scopes (doc 31 §5)
candidate_verdict text -- relevant | not_relevant | class_0 | deferred_birth | retired | needs_input | unknown
input_quality_state text -- FK semantics → doc 33 §4 (only 'accepted...' proceeds to coverage)
scan_time timestamptz -- when this verdict was computed
evidence_fingerprint text -- state hash of the inputs that produced the verdict (reuse business_logic_hash idea)
dirty boolean -- set by an invalidation trigger (§4)
dirty_reason text -- which change dirtied it (handoff kind / ruleset bump / snapshot drift)
dirtied_at timestamptz
stale_after timestamptz -- TTL by risk class (§6) — reuse derived_objects_registry.stale_after
last_run_id text -- reuse system_issues.run_id linkage
audit_ref -- registry_changelog linkage
The anti-"checked-forever" rule: a verdict is always qualified by the triple (source_snapshot_ref, ruleset_version, scan_time). There is no boolean "object is governed/checked". A clean verdict means "under snapshot S, ruleset R, at time T, this was not_relevant" — and it is automatically invalid when S, R, or the TTL changes. This is the precise behavior the GPT council required.
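The rule above can be sketched as a read-time predicate. This is a minimal illustration only, assuming a strict reading in which any snapshot or ruleset mismatch invalidates the verdict; the class name CandidateState, the function verdict_is_current, and all sample values are hypothetical and not part of the proposed store.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CandidateState:
    """Hypothetical subset of the proposed governance_candidate_state record."""
    candidate_key: str
    source_snapshot_ref: int
    ruleset_version: str
    candidate_verdict: str
    scan_time: datetime
    stale_after: datetime
    dirty: bool = False


def verdict_is_current(state, current_snapshot, current_ruleset, now):
    """A verdict holds only under the snapshot, ruleset, and TTL that produced it."""
    return (
        not state.dirty
        and state.source_snapshot_ref == current_snapshot
        and state.ruleset_version == current_ruleset
        and now < state.stale_after
    )


scanned = datetime(2026, 6, 1, tzinfo=timezone.utc)
state = CandidateState(
    candidate_key="collection:example",   # hypothetical key
    source_snapshot_ref=1,
    ruleset_version="r-abc",              # hypothetical ruleset hash
    candidate_verdict="not_relevant",
    scan_time=scanned,
    stale_after=scanned + timedelta(days=7),  # hypothetical TTL
)
```

Note that the record carries no standalone "checked" flag: the only way to ask whether the verdict still applies is to supply the current snapshot, ruleset, and time.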
Reuse evidence — derived_objects_registry already implements this for derived objects (live): depends_on_collections[] + depends_on_edges[] (the dependency graph that says what dirties me), stale_after (TTL), recompute_status, stale_reason, refresh_strategy ∈ {realtime_trigger, on_demand}. Branch D applies the same pattern to governance candidates. No new dirty-tracking engine is invented.
3. group_key design (mission §7)
The group_key is the unit of dirtying and re-evaluation — coarse enough to coalesce (avoid per-row work and per-row issues, M-DEF-7/§10), fine enough to invalidate precisely (don't dirty 1.04M rows when one collection changes). It is a computed tuple, not a code list:
group_key = hash( object_class, source_registry, axis_family, scope, lifecycle_status, owner_scope )
| Dimension | Source (live, no-hardcode) | Why it bounds invalidation |
|---|---|---|
| object_class | meta_catalog.entity_type (169) | a class-level rule/profile change dirties only that class |
| source_registry | birth_registry.collection_name (78 distinct) / meta_catalog.registry_collection | a collection/count/source change dirties only that registry's group |
| axis_family | Axis Registry axis_family (M-DEF-9) / interim pivot_definitions dims | an axis introduction/policy change dirties only that axis family |
| scope | governance_responsibility_scope (6 SB-2 rows: policy/health/execution/render/approval/audit) | a scope-specific policy change dirties only that scope's coverage |
| lifecycle_status | birth_registry.status / per-class status | retirement/supersession dirties only affected lifecycle bucket |
| owner_scope | owner-scope from SB-2 (when live) | an owner/approval/exception change dirties only that owner's groups |
Grain alignment (M-DEF-7). The candidate scan keys at the governance grain (roots + non-inheriting + containers), so adding 10⁶/10⁸ inherited children under one anchored container dirties one group, not millions of rows (Δ work = 0 for inherited leaves). This is the same anti-spam grain T7/T6 use.
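The group_key tuple from this section could be computed along the following lines. This is a sketch under stated assumptions: the doc fixes only the six dimensions, so the choice of SHA-256, the JSON canonicalization, the function name compute_group_key, and the sample dimension values are all hypothetical.

```python
import hashlib
import json

# The six dimensions named in the group_key formula, in a fixed order.
GROUP_KEY_DIMENSIONS = (
    "object_class", "source_registry", "axis_family",
    "scope", "lifecycle_status", "owner_scope",
)


def compute_group_key(dims):
    """Hash the six dimension values into one stable group_key string."""
    missing = [d for d in GROUP_KEY_DIMENSIONS if d not in dims]
    if missing:
        raise ValueError(f"missing group_key dimensions: {missing}")
    # Serialize in the fixed dimension order so dict ordering cannot change the key.
    ordered = [[name, dims[name]] for name in GROUP_KEY_DIMENSIONS]
    payload = json.dumps(ordered, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical dimension values, for illustration only.
example_dims = {
    "object_class": "route",
    "source_registry": "routes",
    "axis_family": "none",
    "scope": "execution",
    "lifecycle_status": "active",
    "owner_scope": "GOV-DOT",
}
example_key = compute_group_key(example_dims)
```

Because the key depends only on dimension values, every inherited child under the same anchored container maps to the same group, which is what keeps the dirty set small.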
4. Dirty-group / dirty-source invalidation (mission §7)
Invalidation is event-driven (a change marks a group dirty) plus a periodic safety net (TTL expiry). Triggers, and the precise scope each dirties:
| Invalidation trigger | Source (doc) | Dirties |
|---|---|---|
| Handoff signal (object born/registered/retired/collection-changed/count-changed/source-changed) | doc 32 §3 | the affected (object_class, source_registry, lifecycle) group |
| Axis introduced / axis policy changed | doc 32 #7 | the axis_family group |
| Policy changed (law/normative/measurement) | doc 32 #8 | the scope group(s) the rule governs |
| Owner / approval / exception changed | doc 32 #9 (SB-2/Đ32) | the owner_scope + scope group |
| Ruleset version bump | doc 31 §5 | all groups in the changed rule's scope (not a blanket rescan — scoped by the rule's coverage_rule) |
| Source snapshot drift | doc 31 §5 (evolution_snapshots delta) | the groups whose fingerprint changed |
| TTL expiry (stale_after) | §6 | the single expired candidate (periodic re-validation) |
| Input correction / late data | doc 33 §5 | the corrected candidate's group |
Dirty propagation, not dirty explosion. A change dirties the smallest group_key it provably affects (the dependency map, reuse of derived_objects_registry.depends_on_*). A storm of changes to one group coalesces to a single dirty mark within the coalesce window (doc 32 §6); if a single tick dirties more than a configured fraction of all groups, that itself is a finding (group_invalidation_storm) and the scan throttles (Branch F).
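The coalescing and storm check described above might look like the sketch below. It is an illustration only: it collapses the coalesce window to a single tick, and the class name DirtyTracker, the storm_fraction parameter, and the sample numbers are hypothetical stand-ins for whatever configuration the build eventually uses.

```python
from dataclasses import dataclass, field


@dataclass
class DirtyTracker:
    """Tracks dirty marks for one tick (simplified stand-in for the coalesce window)."""
    total_groups: int
    storm_fraction: float                      # assumed config value, not a doc constant
    dirty: dict = field(default_factory=dict)  # group_key -> first dirty_reason

    def mark(self, group_key, reason):
        """Return True for a new dirty mark, False if it coalesced into an existing one."""
        if group_key in self.dirty:
            return False
        self.dirty[group_key] = reason
        return True

    def is_storm(self):
        """True when this tick dirtied more than the configured fraction of all groups."""
        return len(self.dirty) > self.storm_fraction * self.total_groups


tracker = DirtyTracker(total_groups=10, storm_fraction=0.2)
first = tracker.mark("g1", "collection-changed")
second = tracker.mark("g1", "count-changed")   # same group: coalesces
calm = tracker.is_storm()
tracker.mark("g2", "ruleset bump")
tracker.mark("g3", "snapshot drift")
storm = tracker.is_storm()
```

In this sketch a storm is only reported; throttling the scan in response is left to Branch F, as the text states.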
5. The three scan modes (mission §7) — mirror derived_objects_registry.refresh_strategy
The live derived_objects_registry already runs three refresh strategies (realtime_trigger, on_demand, plus periodic via stale_after). Branch D reuses the same trichotomy:
| Mode | Analog | What it scans | Cadence |
|---|---|---|---|
| Event-driven scan | realtime_trigger | the group(s) just marked dirty by a handoff | on dirty-mark (debounced by coalesce window) |
| Incremental scan | on_demand | the current dirty set (groups with dirty=true) | frequent (e.g. minutes); clears dirty on clean verdict |
| Periodic full audit | stale_after TTL | every candidate whose stale_after has passed plus a bounded full reconciliation sweep | infrequent (e.g. weekly); the safety net that catches missed invalidations and proves the invariant still closes |
Why all three (the GPT requirement): event-driven keeps it fresh; incremental keeps it cheap; the periodic full audit is the non-negotiable safety net — it guarantees that even if a dirty signal was somehow lost (a missed trigger, an un-tailed change), every candidate is re-validated within its TTL, so a stale verdict cannot live forever. Backfill (doc 31) is the first periodic-full pass (phase='seeding').
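The three modes differ mainly in which candidates they select. The sketch below shows one way to express that selection; the function name select_working_set, the dict-based records, and the sweep_limit parameter (one possible reading of "bounded full reconciliation sweep") are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def select_working_set(candidates, mode, now, just_dirtied=(), sweep_limit=0):
    """Return the candidate_keys that one scan pass should evaluate."""
    if mode == "event_driven":
        # Only groups that a handoff has just marked dirty.
        hot = set(just_dirtied)
        picked = [c for c in candidates if c["dirty"] and c["group_key"] in hot]
    elif mode == "incremental":
        # The whole current dirty set.
        picked = [c for c in candidates if c["dirty"]]
    elif mode == "periodic_full":
        # Everything past its TTL, plus a bounded slice of the rest (assumed bound).
        expired = [c for c in candidates if c["stale_after"] <= now]
        fresh = [c for c in candidates if c["stale_after"] > now]
        picked = expired + fresh[:sweep_limit]
    else:
        raise ValueError(f"unknown scan mode: {mode}")
    return [c["candidate_key"] for c in picked]


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
day = timedelta(days=1)
candidates = [  # hypothetical records
    {"candidate_key": "a", "group_key": "g1", "dirty": True, "stale_after": now + day},
    {"candidate_key": "b", "group_key": "g2", "dirty": True, "stale_after": now + day},
    {"candidate_key": "c", "group_key": "g3", "dirty": False, "stale_after": now - day},
    {"candidate_key": "d", "group_key": "g4", "dirty": False, "stale_after": now + day},
]
```

Candidate "c" illustrates the safety net: it was never marked dirty, so only the periodic pass picks it up once its TTL has lapsed.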
6. Stale-scan expiry + fail-closed production gate (mission §7)
- Stale TTL by risk class: stale_after = scan_time + ttl(object_risk_class). High-risk/write objects get a short TTL (re-validated often); read-only/descriptive objects get a long TTL. TTLs are config (no literal in code), keyed by risk class derived from the coverage profile (M-DEF-2).
- Stale ⇒ verdict downgraded: past stale_after with no re-scan, candidate_verdict → stale/unknown (it is not silently treated as still-clean). This is the "do not store object checked forever" rule enforced at read time, not just write time.
- Fail-closed production gate (concept §11 readiness gate G-PROD): when a high-risk object's candidate status is stale/unknown, the production/execution gate blocks (a true gate must block in deploy). A low-risk object with a stale verdict yields a scheduled re-scan, not a block. This realizes the GPT requirement: "production gate fail-closed when candidate status is stale/unknown for high-risk objects."
- Stale fails closed everywhere: consistent with T7 §6.6 (a coalesced finding with a stale fingerprint stays open) and M-DEF-6 (a stale exception fingerprint lifts suppression). Unknown ≠ safe.
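The TTL, read-time downgrade, and gate rules in this section can be sketched together. This is illustrative only: the function names, the "continue" outcome (meaning the candidate layer does not block and coverage checks still apply), and the TTL values in ttl_config are hypothetical; the doc requires TTLs to come from config rather than code.

```python
from datetime import datetime, timedelta, timezone


def stale_after(scan_time, risk_class, ttl_config):
    """stale_after = scan_time + ttl(object_risk_class), with TTLs supplied as config."""
    return scan_time + ttl_config[risk_class]


def effective_verdict(stored_verdict, expires_at, now):
    """Downgrade at read time: an expired or missing verdict is never read as clean."""
    if stored_verdict is None:
        return "unknown"
    if now >= expires_at:
        return "stale"
    return stored_verdict


def production_gate(verdict, high_risk):
    """Fail closed for high-risk objects; schedule a re-scan for low-risk ones."""
    if verdict in ("stale", "unknown"):
        return "block" if high_risk else "schedule_rescan"
    return "continue"  # candidate layer does not block; coverage checks still apply


# Hypothetical config values, for illustration only.
ttl_config = {"high": timedelta(hours=6), "low": timedelta(days=30)}
scan_time = datetime(2026, 6, 1, tzinfo=timezone.utc)
expires = stale_after(scan_time, "high", ttl_config)
one_day_later = scan_time + timedelta(days=1)
```

The downgrade lives in the read path on purpose: even if no job ever rewrites the stored row, a caller one day later sees "stale" rather than the original not_relevant.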
7. count>1 as candidacy trigger, not mandate (M-DEF-10)
The candidate scan uses count>1 (multiplicity of a class/dimension — e.g. v_registry_counts, a registry count change in doc 32 #4) to nominate a group for evaluation. It then resolves the nominee via the shared-truth predicate + governance grain:
- a dimension → axis (M-DEF-8/9) → axis_unregistered candidacy;
- an independently-authoritative object → governed-object candidacy;
- a child under a container → inherits owner-link only (no separate candidacy).
Multiplicity alone creates no owner requirement and no issue. Two personal prefs → not_relevant/class_0 (0 issue); 10⁶ children → one anchored group (Δ=0); two production routes → relevant → coverage scan. The candidate scan must never mint an owner or an issue purely from a count — it only sets candidate_verdict='relevant' and hands the resolution to the T6 detector + the Đ32 approval spine.
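The nominate-then-resolve flow can be sketched as a pure function that returns only a label. Everything here is an assumption for illustration: the function name resolve_nominee, the kind strings, the boolean reaches_shared_truth standing in for the shared-truth predicate, and the outcome labels "axis_unregistered" and "inherited", which are shorthand for the candidacies listed above rather than members of the candidate_verdict set.

```python
def resolve_nominee(count, kind, reaches_shared_truth):
    """Resolve a count>1 nominee to an outcome label; never mints an owner or an issue."""
    if count <= 1:
        return None                  # multiplicity did not nominate this group
    if not reaches_shared_truth:
        return "class_0"             # private/user/session-scoped: suppressed
    if kind == "dimension":
        return "axis_unregistered"   # axis candidacy (M-DEF-8/9)
    if kind == "independent_object":
        return "relevant"            # handed to the T6 coverage detector
    if kind == "child_of_container":
        return "inherited"           # owner-link inherited; no separate candidacy
    return "unknown"                 # unrecognized kind: do not assume safe


# The three worked cases from the text, with hypothetical kind labels.
two_personal_prefs = resolve_nominee(2, "independent_object", False)
million_children = resolve_nominee(10**6, "child_of_container", True)
two_production_routes = resolve_nominee(2, "independent_object", True)
```

The function has no side effects: owner assignment and issue creation stay with the T6 detector and the approval spine.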
8. Candidate verdict → what happens next
| candidate_verdict | Meaning | Next |
|---|---|---|
| relevant | governance-relevant; input accepted | → T6 coverage detector (doc 25 L3–L6): owner/profile/anarchy/invariant |
| not_relevant | evaluated, not governance-relevant under (S,R) | record verdict; no issue; re-evaluate only if dirtied/expired |
| class_0 | private/user/session-scoped (M-DEF-1) | suppress (unless reaches shared truth → dirty) |
| deferred_birth | unborn/unregistered (M-DEF-4) | yield to Đ19; 0 governance issue |
| retired | retired/superseded/merged-away (doc 31 §7) | counted in invariant retired/ignored |
| needs_input | failed input gate (doc 33) | route input-quality issue; hold |
| unknown/stale | TTL expired or never seeded | fail-closed for high-risk (§6); schedule re-scan / backfill |
Only relevant consumes expensive T6 owner/profile work — which is the whole efficiency point.
9. Relation to the T6 governance coverage scan (doc 25) — the key reconciliation
Branch D changes T6's L1 input contract (full patch plan in doc 35 §3):
- Before (doc 25): T6 L1 enumerates every governed object + axis from registries on every pass; auto-close on "next clean scan."
- After (Branch D): T6 reads its working set from the candidate-state store: the dirty set (incremental) + the stale_after-expired set (periodic) + the periodic full-audit set. L1's full enumeration runs only during the periodic full audit (and the initial backfill), not every pass. T6's L2 birth-precedence is now pre-enforced by the input gate (doc 33) but retained as defense-in-depth. T6 auto-close is re-keyed by (coalesce_key, ruleset_version) so a close under an old ruleset cannot mask a needed re-open under a new one.
This makes T6 incremental and scalable without changing its 6-layer logic, its 7-DOT family, its 20 findings, or its anti-spam — those all stay valid (doc 35 §3).
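The re-keyed auto-close can be illustrated with a small ledger. This is a sketch, not the T6 implementation: the class name ClosureLedger, its methods, and the sample coalesce key and ruleset labels are hypothetical.

```python
class ClosureLedger:
    """Records auto-closes keyed by (coalesce_key, ruleset_version)."""

    def __init__(self):
        self._closed = set()

    def close(self, coalesce_key, ruleset_version):
        """Record that a finding was closed by a clean scan under this ruleset."""
        self._closed.add((coalesce_key, ruleset_version))

    def is_closed(self, coalesce_key, current_ruleset):
        """A close counts only if it was made under the ruleset now in force."""
        return (coalesce_key, current_ruleset) in self._closed


ledger = ClosureLedger()
key = "gov:candidate:stale:g1"          # hypothetical coalesce key
ledger.close(key, "ruleset-v1")         # hypothetical ruleset labels
closed_under_v1 = ledger.is_closed(key, "ruleset-v1")
closed_under_v2 = ledger.is_closed(key, "ruleset-v2")
```

After a ruleset bump the old close no longer matches, so the finding is treated as open until a scan under the new ruleset closes it again.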
10. Candidate issue types (proposed, register-before-emit — NOT registered)
Ride system_issues + governance domain; anti-spam reused from T7. Coalesce key: gov:candidate:{state}:{group_key}.
| issue_type | bucket | severity (base→max) | detection event | auto-close |
|---|---|---|---|---|
| candidate_stale | sai_lệch_dữ_liệu | medium→high (high-risk) | governance.candidate.stale | re-scanned under current (S,R) |
| candidate_unknown | thiếu_quan_hệ | medium→high | governance.candidate.unknown | seeded/evaluated |
| candidate_scan_lag | silent_fail | medium→high | governance.candidate.scan_lag | dirty queue drained under threshold |
| group_invalidation_storm | sai_lệch_dữ_liệu | high | governance.candidate.invalidation_storm | invalidation rate normal |
Plus a governance.candidate.scan_completed heartbeat per run (silence ≠ health, Đ45).
11. No-hardcode / no-island
- No-hardcode: group_key dimensions are computed from meta_catalog/birth_registry/Axis Registry/scope rows; ruleset_version is a hash over data registries; TTLs and trust are config; verdicts are computed. No object/axis/group literal is embedded.
- No-island: the candidate-state store is the only new record kind, modeled on the live derived_objects_registry pattern and keyed into system_issues/event_outbox/registry_changelog/evolution_snapshots/iu_route_worker_cursor. It mints no parallel scanner, bus, or audit; it feeds the one T6 detector.
12. Dependencies, gates, NO-GO
| Capability | Needs | Status |
|---|---|---|
| Candidate-scan logic + verdict (read-only, dry-run) | nothing live | designable now |
| Persist candidate-state (the store) | SB-10 (new — candidate-state store) | build NO-GO |
| Source snapshot + ruleset registry | SB-12 (new; reuse evolution_snapshots/measurement_registry) | build NO-GO |
| Dirty-marks from handoff | SB-11 (governance domain) + doc 32 | NO-GO |
| Owner-relevant candidacy (relevant→coverage owner work) | SB-2 views | gated; degrades pre-SB-2 |
| Fail-closed production gate wiring | readiness-gate build (T11) + concept §11 | NO-GO |
No gate may be satisfied by self-approval.
13. Verdict
Branch D incremental candidate-scan / dirty-group design: COMPLETE. It specifies: the candidate-state store keyed by (candidate_key/group_key, source_snapshot_ref, ruleset_version, scan_time) with no "checked-forever" boolean (§2); the computed group_key dimensions (§3); the full invalidation-trigger set with scoped (not blanket) dirtying (§4); the three scan modes — event-driven + incremental + periodic full audit — mirroring the live derived_objects_registry.refresh_strategy (§5); stale-TTL expiry and the fail-closed production gate for stale/unknown high-risk objects (§6); count>1 as candidacy-not-mandate (§7); the candidate verdict → next-step map (§8); and the exact reconciliation that makes T6 incremental without changing its logic (§9). It reuses derived_objects_registry (pattern), evolution_snapshots, iu_route_worker_cursor, system_issues, event_outbox, registry_changelog — no island, no hardcode. Build/apply NO-GO (SB-10/SB-11/SB-12); nothing registered, emitted, or mutated. Next: doc 35 (T6/T7 compatibility patch plan + scale/resource budget + new-blocker register + readiness map).