KB-33DC

34 — Incremental Candidate Scan / Dirty-Group Model Design (Branch D, no "checked forever", snapshot+ruleset-keyed, fail-closed prod gate, design-only, read-only zero mutation, 2026-06-01)

Revision 1
Tags: one-roof-governance, implementation-index, incremental-candidate-scan, branch-d, dirty-group, no-checked-forever, source-snapshot, ruleset-version, scan-time, group-key, invalidation-triggers, event-driven, incremental, periodic-full-audit, stale-expiry, count-gt-1-candidacy, candidate-verdict, coverage-scan-relation, fail-closed-prod-gate, derived-objects-precedent, no-island, design-only, 2026-06-01

34 — Incremental Candidate Scan / Dirty-Group Model Design (Branch D)

Path: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/

Doc: 34.

Track: Branch D of the Backfill / Handoff / Input-Control addendum: the candidate layer itself.

Builds on: docs 31 (backfill), 32 (handoff), 33 (input gate), 24/25 (T7/T6), concept canon 01–02, GPT directions (incremental candidate scan + dirty-group invalidation).

Status: DESIGN ONLY. APPLY/BUILD NO-GO. No PG/Directus/Qdrant/Nuxt mutation; no table/view/function/trigger; no DOT/event registration; no emit; no approval/enactment. This doc specifies the candidate-state model and scan, not a running scan.

Owner (proposed): GOV-SIV runs the candidate scan (read/detect/verdict); it never self-applies. The candidate verdict gates whether the T6 coverage detector (doc 25) does owner/profile work on an object.

Evidence base: live read-only re-verified 2026-06-01 (derived_objects_registry=7 with refresh_strategy ∈ {realtime_trigger, on_demand, null}, stale_after, recompute_status, depends_on_collections[]; iu_route_worker_cursor=1; evolution_snapshots=1; system_issues=190,288 with coalesce_key/run_id/evidence_snapshot/business_logic_hash).


0. §0-GOV declaration

§0-GOV Governance Coverage Declaration — Branch D (Candidate Scan / Dirty-Group)
  governed_objects:   [ candidate_state, candidate_scan_run, group_key registry ] (Class-2 process records)
  owner_per_scope:    { policy: GOV-COUNCIL, health: GOV-SIV, execution: GOV-DOT,
                        render: GOV-MOUT(TTL), approval: Điều32, audit: GOV-SIV }
  coverage_profile:   [ scan-state profile — owner, audit, rollback, heartbeat, issue-event, stale-TTL ]
  axes_introduced:    [ none — consumes Axis Registry / group_key dimensions ]
  detection_path:     candidate-state store keyed by (candidate_key, source_snapshot, ruleset_version)
  issue_event_types:  [ candidate_stale, candidate_unknown, candidate_scan_lag,
                        group_invalidation_storm ] (register-before-emit, Điều 45 — NOT registered)
  exceptions:         [ none minted ]

1. The principle (GPT direction, verbatim intent)

"Do not store 'object already checked' as a permanent truth. Store scan results by object/group + source snapshot + ruleset version + time. When the group, source registry, policy, axis, owner, approval, or ruleset changes, invalidate/rescan the affected group."

Branch D is the candidate layer between Registry and Coverage:

Birth → Registry → [ Candidate Layer (this doc): detect possible governance relevance,
                     keyed by snapshot+ruleset, invalidated by dirty group/source ]
                 → Coverage Layer (T6 L3–L6): check owner/approval/audit/rollback/…
                 → Production Gate: block if coverage required but missing/stale/unknown

It exists so the T6 scanner stops re-deriving the full 1.04M-object set every pass (doc 25 L1) and instead does expensive owner/profile work only on the dirty and expired candidate set — while never letting a once-clean verdict become a silent stale truth.


2. The candidate-state store — never "checked forever"

The core record is proposed and design-only; its table is ABSENT (new blocker SB-10). It is modeled on the live derived_objects_registry dirty/stale machinery (the system already does exactly this for derived objects) and on the fingerprint columns of system_issues. This is reuse-first, not a novel scheme:

governance_candidate_state   (proposed — design only; no DDL here)
  candidate_key       text  -- canonical_address | (object_type,object_ref) ; idempotency key (doc 31 §6)
  group_key           text  -- the dirtying/rescan unit (see §3)
  object_type         text
  object_ref          text
  source_snapshot_ref int   -- FK → evolution_snapshots (doc 31 §5)
  ruleset_version     text  -- hash over enabled detectors+profiles+axes+scopes (doc 31 §5)
  candidate_verdict   text  -- relevant | not_relevant | class_0 | deferred_birth | retired | needs_input | unknown
  input_quality_state text  -- FK semantics → doc 33 §4 (only 'accepted...' proceeds to coverage)
  scan_time           timestamptz  -- when this verdict was computed
  evidence_fingerprint text -- state hash of the inputs that produced the verdict (reuse business_logic_hash idea)
  dirty               boolean       -- set by an invalidation trigger (§4)
  dirty_reason        text          -- which change dirtied it (handoff kind / ruleset bump / snapshot drift)
  dirtied_at          timestamptz
  stale_after         timestamptz   -- TTL by risk class (§6) — reuse derived_objects_registry.stale_after
  last_run_id         text          -- reuse system_issues.run_id linkage
  audit_ref           -- registry_changelog linkage

The anti-"checked-forever" rule: a verdict is always qualified by the triple (source_snapshot_ref, ruleset_version, scan_time). There is no boolean "object is governed/checked". A clean verdict means "under snapshot S, ruleset R, at time T, this was not_relevant" — and it is automatically invalid when S, R, or the TTL changes. This is the precise behavior the GPT council required.
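The rule above can be sketched as a read-time check. This is a minimal illustration, not the store's actual logic: the `CandidateVerdict` class and `effective_verdict` function are hypothetical names, and the sketch treats any change of snapshot or ruleset as invalidating, without the scoped dirtying described later in the document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CandidateVerdict:
    """A stored verdict, always qualified by (snapshot, ruleset, scan_time)."""
    candidate_key: str
    verdict: str                 # e.g. "not_relevant"
    source_snapshot_ref: int     # snapshot S the verdict was computed under
    ruleset_version: str         # ruleset R the verdict was computed under
    scan_time: datetime          # time T of the scan
    stale_after: datetime        # TTL boundary
    dirty: bool = False


def effective_verdict(rec, current_snapshot, current_ruleset, now):
    """Return the stored verdict only while S, R, and the TTL still hold."""
    if rec.dirty:
        return "unknown"
    if rec.source_snapshot_ref != current_snapshot:
        return "unknown"
    if rec.ruleset_version != current_ruleset:
        return "unknown"
    if now >= rec.stale_after:
        return "unknown"
    return rec.verdict
```

Note that there is no `checked` boolean anywhere in the record: the only way to read a clean verdict is to supply the current snapshot, ruleset, and time.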

Reuse evidence — derived_objects_registry already implements this for derived objects (live): depends_on_collections[] + depends_on_edges[] (the dependency graph that says what dirties me), stale_after (TTL), recompute_status, stale_reason, refresh_strategy ∈ {realtime_trigger, on_demand}. Branch D applies the same pattern to governance candidates. No new dirty-tracking engine is invented.


3. group_key design (mission §7)

The group_key is the unit of dirtying and re-evaluation — coarse enough to coalesce (avoid per-row work and per-row issues, M-DEF-7/§10), fine enough to invalidate precisely (don't dirty 1.04M rows when one collection changes). It is a computed tuple, not a code list:

group_key = hash( object_class, source_registry, axis_family, scope, lifecycle_status, owner_scope )

| Dimension | Source (live, no-hardcode) | Why it bounds invalidation |
|---|---|---|
| object_class | meta_catalog.entity_type (169) | a class-level rule/profile change dirties only that class |
| source_registry | birth_registry.collection_name (78 distinct) / meta_catalog.registry_collection | a collection/count/source change dirties only that registry's group |
| axis_family | Axis Registry axis_family (M-DEF-9) / interim pivot_definitions dims | an axis introduction/policy change dirties only that axis family |
| scope | governance_responsibility_scope (6 SB-2 rows: policy/health/execution/render/approval/audit) | a scope-specific policy change dirties only that scope's coverage |
| lifecycle_status | birth_registry.status / per-class status | retirement/supersession dirties only affected lifecycle bucket |
| owner_scope | owner-scope from SB-2 (when live) | an owner/approval/exception change dirties only that owner's groups |
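
As a rough sketch of the computed tuple, the six dimensions can be serialized in a fixed order and hashed. The document does not name a hash function or serialization, so SHA-256 over compact JSON is an assumption here, and `compute_group_key` is a hypothetical helper name.

```python
import hashlib
import json

# Dimension names as listed in the group_key definition; their values are
# read from the registries at run time, never embedded as literals.
GROUP_KEY_DIMENSIONS = (
    "object_class",
    "source_registry",
    "axis_family",
    "scope",
    "lifecycle_status",
    "owner_scope",
)


def compute_group_key(dims: dict) -> str:
    """Hash the six dimension values in a fixed order (SHA-256 is assumed)."""
    missing = [d for d in GROUP_KEY_DIMENSIONS if d not in dims]
    if missing:
        raise ValueError(f"missing group_key dimensions: {missing}")
    # A fixed dimension order makes the key independent of dict ordering.
    ordered = [[d, dims[d]] for d in GROUP_KEY_DIMENSIONS]
    payload = json.dumps(ordered, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two objects that agree on all six dimensions land in the same group, so a change to one dimension value dirties only the groups that carry that value.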

Grain alignment (M-DEF-7). The candidate scan keys at the governance grain (roots + non-inheriting + containers), so adding 10⁶/10⁸ inherited children under one anchored container dirties one group, not millions of rows (Δ work = 0 for inherited leaves). This is the same anti-spam grain T7/T6 use.


4. Dirty-group / dirty-source invalidation (mission §7)

Invalidation is event-driven (a change marks a group dirty) plus a periodic safety net (TTL expiry). Triggers, and the precise scope each dirties:

| Invalidation trigger | Source (doc) | Dirties |
|---|---|---|
| Handoff signal (object born/registered/retired/collection-changed/count-changed/source-changed) | doc 32 §3 | the affected (object_class, source_registry, lifecycle) group |
| Axis introduced / axis policy changed | doc 32 #7 | the axis_family group |
| Policy changed (law/normative/measurement) | doc 32 #8 | the scope group(s) the rule governs |
| Owner / approval / exception changed | doc 32 #9 (SB-2/Đ32) | the owner_scope + scope group |
| Ruleset version bump | doc 31 §5 | all groups in the changed rule's scope (not a blanket rescan; scoped by the rule's coverage_rule) |
| Source snapshot drift | doc 31 §5 (evolution_snapshots delta) | the groups whose fingerprint changed |
| TTL expiry (stale_after) | §6 | the single expired candidate (periodic re-validation) |
| Input correction / late data | doc 33 §5 | the corrected candidate's group |

Dirty propagation, not dirty explosion. A change dirties the smallest group_key it provably affects (the dependency map, reuse of derived_objects_registry.depends_on_*). A storm of changes to one group coalesces to a single dirty mark within the coalesce window (doc 32 §6); if a single tick dirties more than a configured fraction of all groups, that itself is a finding (group_invalidation_storm) and the scan throttles (Branch F).
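
The coalescing and storm rules in the paragraph above can be illustrated with a small in-memory tracker. This is a sketch under assumptions: `DirtyTracker`, its field names, and the use of float seconds for time are all hypothetical, and the window and fraction values would come from config rather than code.

```python
from dataclasses import dataclass, field


@dataclass
class DirtyTracker:
    total_groups: int          # number of known groups (assumed > 0)
    coalesce_window_s: float   # coalesce window, from config
    storm_fraction: float      # configured fraction that counts as a storm
    _last_mark: dict = field(default_factory=dict)  # group_key -> (time, reason)

    def mark(self, group_key, reason, at):
        """Record a dirty mark; return False if it coalesced into an open one."""
        last = self._last_mark.get(group_key)
        if last is not None and at - last[0] < self.coalesce_window_s:
            return False  # same group, still inside the window: one mark only
        self._last_mark[group_key] = (at, reason)
        return True

    def is_storm(self, tick_start, tick_end):
        """True if one tick dirtied more than the configured fraction of groups."""
        dirtied = sum(
            1 for at, _ in self._last_mark.values() if tick_start <= at < tick_end
        )
        return dirtied / self.total_groups > self.storm_fraction
```

A caller that sees `is_storm` return true would raise the group_invalidation_storm finding and throttle, rather than scan every dirtied group at once.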


5. The three scan modes (mission §7) — mirror derived_objects_registry.refresh_strategy

The live derived_objects_registry already supports three refresh paths: the refresh_strategy values realtime_trigger and on_demand, plus periodic re-validation via stale_after. Branch D reuses the same trichotomy:

| Mode | Analog | What it scans | Cadence |
|---|---|---|---|
| Event-driven scan | realtime_trigger | the group(s) just marked dirty by a handoff | on dirty-mark (debounced by coalesce window) |
| Incremental scan | on_demand | the current dirty set (groups with dirty=true) | frequent (e.g. minutes); clears dirty on clean verdict |
| Periodic full audit | stale_after TTL | every candidate whose stale_after has passed plus a bounded full reconciliation sweep | infrequent (e.g. weekly); the safety net that catches missed invalidations and proves the invariant still closes |

Why all three (the GPT requirement): event-driven keeps it fresh; incremental keeps it cheap; the periodic full audit is the non-negotiable safety net — it guarantees that even if a dirty signal was somehow lost (a missed trigger, an un-tailed change), every candidate is re-validated within its TTL, so a stale verdict cannot live forever. Backfill (doc 31) is the first periodic-full pass (phase='seeding').
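
A minimal sketch of how each mode could pick its working set is shown below. The function name, the dict-shaped candidate rows, the mode strings, and the "oldest verdict first" ordering of the bounded sweep are assumptions made for illustration; the document only fixes what each mode covers.

```python
from datetime import datetime, timedelta, timezone


def select_working_set(candidates, mode, now, just_dirtied=frozenset(), sweep_limit=0):
    """Pick the candidates a single scan pass should re-evaluate.

    Each candidate is a dict with keys: candidate_key, group_key,
    dirty (bool), scan_time (datetime), stale_after (datetime).
    """
    if mode == "event_driven":
        # Only the groups a handoff has just marked dirty.
        return [c for c in candidates if c["dirty"] and c["group_key"] in just_dirtied]
    if mode == "incremental":
        # The whole current dirty set.
        return [c for c in candidates if c["dirty"]]
    if mode == "periodic_full_audit":
        expired = [c for c in candidates if c["stale_after"] <= now]
        fresh = [c for c in candidates if c["stale_after"] > now]
        # Bounded reconciliation sweep over the rest, oldest verdicts first
        # (the ordering is an assumption of this sketch).
        sweep = sorted(fresh, key=lambda c: c["scan_time"])[:sweep_limit]
        return expired + sweep
    raise ValueError(f"unknown scan mode: {mode!r}")
```

Because the periodic mode selects by `stale_after` rather than by the dirty flag, a candidate whose dirty signal was lost is still picked up once its TTL passes.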


6. Stale-scan expiry + fail-closed production gate (mission §7)

  • Stale TTL by risk class: stale_after = scan_time + ttl(object_risk_class). High-risk/write objects get a short TTL (re-validated often); read-only/descriptive objects a long TTL. TTLs are config (no literal in code), keyed by risk class derived from the coverage profile (M-DEF-2).
  • Stale ⇒ verdict downgraded — past stale_after with no re-scan, candidate_verdict → stale/unknown (it is not silently treated as still-clean). This is the "do not store object checked forever" rule enforced at read time, not just write time.
  • Fail-closed production gate (concept §11 readiness gate G-PROD) — when a high-risk object's candidate status is stale/unknown, the production/execution gate blocks (a true gate must block in deploy). A low-risk object with a stale verdict yields a scheduled re-scan, not a block. This realizes the GPT requirement: "production gate fail-closed when candidate status is stale/unknown for high-risk objects."
  • Stale fails closed everywhere — consistent with T7 §6.6 (a coalesced finding with a stale fingerprint stays open) and M-DEF-6 (a stale exception fingerprint lifts suppression). Unknown ≠ safe.
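
The TTL formula and the gate rule in the list above can be sketched together. The function names, the "high"/"low" risk-class labels, and the return strings are hypothetical; the TTL table is passed in as config, in line with the no-literal rule.

```python
from datetime import datetime, timedelta, timezone


def compute_stale_after(scan_time, risk_class, ttl_config):
    """stale_after = scan_time + ttl(object_risk_class); TTLs come from config."""
    return scan_time + ttl_config[risk_class]


def gate_decision(verdict, stale_after, now, risk_class):
    """Decide what the production gate does with a candidate status."""
    # Read-time downgrade: an expired verdict is never treated as still clean.
    if now >= stale_after:
        verdict = "unknown"
    if verdict in ("unknown", "stale"):
        # Fail closed for high-risk objects; low-risk ones are only re-queued.
        return "block" if risk_class == "high" else "schedule_rescan"
    # Coverage checks still apply downstream; this only clears the candidate gate.
    return "proceed_to_coverage_check"
```

For example, with an illustrative config of 6 hours for high-risk objects, a verdict scanned at 00:00 is served until 06:00, after which a high-risk object is blocked until it is re-scanned.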

7. count>1 as candidacy trigger, not mandate (M-DEF-10)

The candidate scan uses count>1 (multiplicity of a class/dimension — e.g. v_registry_counts, a registry count change in doc 32 #4) to nominate a group for evaluation. It then resolves the nominee via the shared-truth predicate + governance grain:

  • a dimension → axis (M-DEF-8/9) → axis_unregistered candidacy;
  • an independently-authoritative object → governed-object candidacy;
  • a child under a container → inherits owner-link only (no separate candidacy).

Multiplicity alone creates no owner requirement and no issue. Two personal prefs → not_relevant/class_0 (0 issue); 10⁶ children → one anchored group (Δ=0); two production routes → relevant → coverage scan. The candidate scan must never mint an owner or an issue purely from a count — it only sets candidate_verdict='relevant' and hands the resolution to the T6 detector + the Đ32 approval spine.
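
The nominate-then-resolve flow can be sketched as a single function. This is illustrative only: `resolve_nominee`, the `kind` labels, and the returned strings are hypothetical stand-ins for the shared-truth predicate and governance-grain checks, which the document defines elsewhere.

```python
def resolve_nominee(count, kind, reaches_shared_truth):
    """Resolve a count>1 nominee; multiplicity alone never mints an owner or issue.

    kind is one of "dimension", "independent_object", "contained_child"
    (labels assumed for this sketch).
    """
    if count <= 1:
        return None  # not nominated by multiplicity at all
    if not reaches_shared_truth:
        return "class_0"  # private/user/session-scoped: no issue
    if kind == "dimension":
        return "axis_unregistered_candidacy"
    if kind == "contained_child":
        return "inherits_owner_link"  # no separate candidacy
    if kind == "independent_object":
        return "relevant"  # handed on to the T6 detector
    return "unknown"
```

The three examples from the text map onto it directly: two personal prefs resolve to `class_0`, a million children under one container resolve to `inherits_owner_link`, and two production routes resolve to `relevant`.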


8. Candidate verdict → what happens next

| candidate_verdict | Meaning | Next |
|---|---|---|
| relevant | governance-relevant; input accepted | T6 coverage detector (doc 25 L3–L6): owner/profile/anarchy/invariant |
| not_relevant | evaluated, not governance-relevant under (S,R) | record verdict; no issue; re-evaluate only if dirtied/expired |
| class_0 | private/user/session-scoped (M-DEF-1) | suppress (unless reaches shared truth → dirty) |
| deferred_birth | unborn/unregistered (M-DEF-4) | yield to Đ19; 0 governance issue |
| retired | retired/superseded/merged-away (doc 31 §7) | counted in invariant retired/ignored |
| needs_input | failed input gate (doc 33) | route input-quality issue; hold |
| unknown/stale | TTL expired or never seeded | fail-closed for high-risk (§6); schedule re-scan / backfill |

Only relevant consumes expensive T6 owner/profile work — which is the whole efficiency point.
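
The verdict-to-next-step map lends itself to a plain lookup. The action names below are hypothetical labels for the "Next" column, and treating an unrecognised verdict like `unknown` is an assumption that follows the document's "unknown ≠ safe" stance.

```python
# Verdict -> next action. Action names are illustrative labels only.
NEXT_STEP = {
    "relevant": "run_t6_coverage",
    "not_relevant": "record_only",
    "class_0": "suppress",
    "deferred_birth": "yield_to_birth",
    "retired": "count_as_retired",
    "needs_input": "route_input_issue_and_hold",
    "unknown": "fail_closed_or_rescan",
    "stale": "fail_closed_or_rescan",
}


def next_step(verdict):
    """Look up the next action; unrecognised verdicts are handled as unknown."""
    return NEXT_STEP.get(verdict, NEXT_STEP["unknown"])


def needs_expensive_work(verdict):
    """Only a relevant verdict triggers T6 owner/profile work."""
    return next_step(verdict) == "run_t6_coverage"
```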


9. Relation to the T6 governance coverage scan (doc 25) — the key reconciliation

Branch D changes T6's L1 input contract (full patch plan in doc 35 §3):

  • Before (doc 25): T6 L1 enumerates every governed object + axis from registries on every pass; auto-close on "next clean scan."
  • After (Branch D): T6 reads its working set from the candidate-state store: the dirty set (incremental) + the stale_after-expired set (periodic) + the periodic full-audit set. L1's full enumeration runs only during the periodic full audit (and the initial backfill), not every pass. T6's L2 birth-precedence is now pre-enforced by the input gate (doc 33) but retained as defense-in-depth. T6 auto-close is re-keyed by (coalesce_key, ruleset_version) so a close under an old ruleset cannot mask a needed re-open under a new one.

This makes T6 incremental and scalable without changing its 6-layer logic, its 7-DOT family, its 20 findings, or its anti-spam — those all stay valid (doc 35 §3).
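
The re-keyed auto-close can be illustrated with a small ledger. `AutoCloseLedger` is a hypothetical in-memory stand-in for however closes are actually persisted; the point of the sketch is only that the close is looked up by the pair (coalesce_key, ruleset_version), not by coalesce_key alone.

```python
class AutoCloseLedger:
    """Tracks issue closes keyed by (coalesce_key, ruleset_version)."""

    def __init__(self):
        self._closed = set()  # {(coalesce_key, ruleset_version)}

    def close(self, coalesce_key, ruleset_version):
        """Record that the issue was closed under this ruleset."""
        self._closed.add((coalesce_key, ruleset_version))

    def is_closed(self, coalesce_key, current_ruleset_version):
        """A close counts only if it was made under the current ruleset."""
        return (coalesce_key, current_ruleset_version) in self._closed
```

After a ruleset bump, `is_closed` returns false for every issue closed under the previous version, so a needed re-open is never masked by an old close.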


10. Candidate issue types (proposed, register-before-emit — NOT registered)

Ride system_issues + governance domain; anti-spam reused from T7. Coalesce key: gov:candidate:{state}:{group_key}.

| issue_type | bucket | severity (base→max) | detection event | auto-close |
|---|---|---|---|---|
| candidate_stale | sai_lệch_dữ_liệu | medium→high (high-risk) | governance.candidate.stale | re-scanned under current (S,R) |
| candidate_unknown | thiếu_quan_hệ | medium→high | governance.candidate.unknown | seeded/evaluated |
| candidate_scan_lag | silent_fail | medium→high | governance.candidate.scan_lag | dirty queue drained under threshold |
| group_invalidation_storm | sai_lệch_dữ_liệu | high | governance.candidate.invalidation_storm | invalidation rate normal |

Plus a governance.candidate.scan_completed heartbeat per run (silence ≠ health, Đ45).
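
As a small illustration of the coalesce key format given above, per-candidate detections can be folded into one entry per (state, group). The helper names are hypothetical, and the sketch assumes `{state}` is the candidate issue state (for example "stale" or "unknown"), which the document does not spell out.

```python
from collections import Counter


def candidate_coalesce_key(state, group_key):
    """Build the key in the documented shape gov:candidate:{state}:{group_key}."""
    return f"gov:candidate:{state}:{group_key}"


def coalesce(detections):
    """Fold (state, group_key) detections into one counted entry per key."""
    return Counter(candidate_coalesce_key(s, g) for s, g in detections)
```

Many stale candidates in the same group therefore produce one issue with an occurrence count, not one issue per row.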


11. No-hardcode / no-island

  • No-hardcode: group_key dimensions are computed from meta_catalog/birth_registry/Axis Registry/scope rows; ruleset_version is a hash over data registries; TTLs and trust are config; verdicts are computed. No object/axis/group literal is embedded.
  • No-island — the candidate-state store is the only new record kind, modeled on the live derived_objects_registry pattern and keyed into system_issues/event_outbox/registry_changelog/evolution_snapshots/iu_route_worker_cursor. It mints no parallel scanner, bus, or audit; it feeds the one T6 detector.
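
A ruleset_version computed as a hash over registry data might look like the sketch below. The inputs follow the schema comment (enabled detectors, profiles, axes, scopes), but the function name, the use of string identifiers, and SHA-256 over sorted JSON are assumptions.

```python
import hashlib
import json


def compute_ruleset_version(detectors, profiles, axes, scopes):
    """Hash the enabled rule inputs; each argument is an iterable of identifiers.

    Sorting makes the result independent of the order rows were read in.
    """
    canonical = {
        "detectors": sorted(detectors),
        "profiles": sorted(profiles),
        "axes": sorted(axes),
        "scopes": sorted(scopes),
    }
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Enabling or disabling any detector, profile, axis, or scope changes the hash, which is what lets a stored verdict detect that it was computed under an older ruleset.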

12. Dependencies, gates, NO-GO

| Capability | Needs | Status |
|---|---|---|
| Candidate-scan logic + verdict (read-only, dry-run) | nothing live | designable now |
| Persist candidate-state (the store) | SB-10 (new: candidate-state store) | build NO-GO |
| Source snapshot + ruleset registry | SB-12 (new: reuse evolution_snapshots/measurement_registry) | build NO-GO |
| Dirty-marks from handoff | SB-11 (governance domain) + doc 32 | NO-GO |
| Owner-relevant candidacy (relevant→coverage owner work) | SB-2 views | gated; degrades pre-SB-2 |
| Fail-closed production gate wiring | readiness-gate build (T11) + concept §11 | NO-GO |

No gate may be satisfied by self-approval.


13. Verdict

Branch D incremental candidate-scan / dirty-group design: COMPLETE. It specifies: the candidate-state store keyed by (candidate_key/group_key, source_snapshot_ref, ruleset_version, scan_time) with no "checked-forever" boolean (§2); the computed group_key dimensions (§3); the full invalidation-trigger set with scoped (not blanket) dirtying (§4); the three scan modes (event-driven, incremental, and periodic full audit) mirroring the live derived_objects_registry.refresh_strategy (§5); stale-TTL expiry and the fail-closed production gate for stale/unknown high-risk objects (§6); count>1 as candidacy-not-mandate (§7); the candidate verdict → next-step map (§8); and the exact reconciliation that makes T6 incremental without changing its logic (§9). It reuses derived_objects_registry (pattern), evolution_snapshots, iu_route_worker_cursor, system_issues, event_outbox, and registry_changelog: no island, no hardcode. Build/apply NO-GO (SB-10/SB-11/SB-12); nothing registered, emitted, or mutated. Next: doc 35 (T6/T7 compatibility patch plan + scale/resource budget + new-blocker register + readiness map).

knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/34-incremental-candidate-scan-dirty-group-design.md