KB-D0EE

38 — SB-12 Source-Snapshot + Ruleset-Version Registry — Detailed Technical Design (GCOS, design-only, read-only zero mutation, 2026-06-01)

Revision 1
one-roof-governance, implementation-index, gcos, sb-12, source-snapshot, ruleset-version, evolution-snapshots, measurement-registry, governance-ruleset, reproducible-verdict, targeted-invalidation, no-law-version-bump, reuse-first, fail-closed, no-hardcode, no-island, design-only, apply-no-go, 2026-06-01

Package: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Track: GCOS substrate (Governance Candidate & Onboarding Substrate). Blocker SB-12.
Status: Detailed technical design ONLY. BUILD NO-GO. No DDL/DML, no table/view/function/trigger creation, no event registration, no DOT registration, no approval. KB document only.
Reads / controls: doc 00 (controlling index) → concept canon → Round-4 law → knowledge/dev/laws/prompt-muc-tieu-mo-for-claude-code.md (operating constitution).
Builds on docs 31 (backfill §5), 34 (dirty-group §2/§4), 35 (Branch E/F + SB-12 register). T6/T7 = docs 25/24 (unchanged).
Date: 2026-06-01 · Mutation footprint: KB document only. Zero PG/Directus/Qdrant/Nuxt mutation. Read-only PG used for live validation (role context_pack_readonly).


38.0 §0-GOV — governed objects this design introduces

| governed_object | class | grain | purpose |
|---|---|---|---|
| source_snapshot | Class-2 process record (reuses evolution_snapshots) | one row per scan/backfill/audit run, per scope | reproducible inventory fingerprint that a candidate verdict is keyed to |
| ruleset_version | Class-2 governed config-version record (new governance_ruleset registry row) | one row per active rule-set hash over time | reproducible rule identity; targeted invalidation when rules change |

Issue/event types introduced (register-before-emit; NOT yet registered; they ride the governance domain that SB-11 owns): snapshot_capture_failed, snapshot_nondeterministic, ruleset_unowned, ruleset_drift, snapshot_ref_dangling. (Severity is computed per T7; routing follows doc 24 §7.)


38.1 Problem statement (what SB-12 must make true)

The T6 coverage detector (doc 25) and the candidate scan (doc 34) produce verdicts ("object X is not_relevant", "group G is covered"). Doc 34's keystone rule forbids storing "checked forever." A verdict is only meaningful when it is qualified by (a) the state of the inputs it saw and (b) the rules it applied. SB-12 is the substrate that makes both reproducible and invalidatable:

  1. Source snapshot = an immutable fingerprint of the inventory/state the scan read, per scope/group, at a point in time.
  2. Ruleset version = an immutable, content-addressed identity of the active rule-set (detectors + coverage profiles + axis registry + responsibility scopes) the scan applied — versioned without bumping any law version.

Together they make every candidate-state row carry the triple (candidate_key, source_snapshot_ref, ruleset_version) (doc 31 §4.3, doc 34 §2). Re-evaluation happens only when the snapshot drifts, the ruleset bumps, or the TTL expires (TTL owned by SB-10). This is what turns "re-scan all 1.04M every pass" into "re-scan only what provably changed."
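
The three re-evaluation triggers above can be sketched as a single per-row decision. This is a minimal illustration, not the SB-10 implementation: the `CandidateState` shape, the function name, the reason strings other than `snapshot_drift`, and the idea of passing in precomputed sets of drifted and rule-affected groups are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CandidateState:
    """Hypothetical shape of one candidate-state row (the real store is SB-10)."""
    candidate_key: str
    group_key: str
    source_snapshot_ref: int   # evolution_snapshots.id the verdict was computed against
    ruleset_version: str       # 'gov-rs-<hash>' the verdict was computed under
    stale_after: datetime      # TTL boundary, owned by SB-10


def reevaluation_reason(
    row: CandidateState,
    drifted_groups: set[str],
    rule_changed_groups: set[str],
    active_ruleset_version: str,
    now: datetime,
) -> Optional[str]:
    """Return why the verdict must be recomputed, or None if it still stands."""
    if row.group_key in drifted_groups:
        return "snapshot_drift"
    # A ruleset bump only counts for groups whose governing rule changed.
    if (row.ruleset_version != active_ruleset_version
            and row.group_key in rule_changed_groups):
        return "ruleset_bump"
    if now >= row.stale_after:
        return "ttl_expired"
    return None
```

A row whose group is neither drifted nor rule-affected, and whose TTL has not passed, returns `None` and is skipped by the scan.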


38.2 Live PG validation (read-only, re-verified 2026-06-01 — live wins over cited values)

| Object | Live finding (this session) | Implication for SB-12 |
|---|---|---|
| evolution_snapshots | 1 row. id int PK, snapshot_at tstz NN now(), scope text NN default 'global', metrics jsonb NN, delta_previous jsonb default '{}', notes text. The single row: id=1, scope='global', snapshot_at=2026-04-04, metrics={dot_count:252, edge_count:2193, domain_count:10, entity_count:40, kg_dot_count:36, config_tables:7, quality_log_entries:0}. | Shape is exactly a snapshot registry: scope + metrics jsonb + delta_previous. Only a global row exists; no per-group / per-scope governance snapshot → SB-12 reuses this table by writing governance-scoped rows. No schema change needed. |
| measurement_registry | 142 rows, 140 enabled. Cols incl. measurement_id text, measurement_name, law_code text NN, method smallint, source_query text NN, target_query text NN, comparison text default 'strict_equals', severity, enabled bool, auto_generated bool, last_run_at/result/evidence. | This is the data-driven rule content source. ruleset_version is a hash over the governance-relevant enabled subset of these rows ⊕ profile/axis/scope registries. Content (source_query/target_query/comparison) is exactly what must be hashed. |
| governance_ruleset | ABSENT (0 columns). | The ruleset-version registry must be created (greenfield), or substituted by an evolution_snapshots row (Option A below). |
| governance_responsibility_scope | ABSENT (SB-2 greenfield). | One of the ruleset hash components; until SB-2 is live, the scope component is empty/derived. The ruleset still hashes deterministically over whatever components exist (fail-closed, see §38.7). |
| Axis Registry (M-DEF-9) | ABSENT (interim via pivot_definitions=37 + law_jurisdiction=43). | Another ruleset component; interim derivation hashed; absence is itself a finding (axis_unregistered), never a silent fallback. |
| registry_changelog | 68,323 rows. (entity_type, entity_code, action, timestamp(non-tz), alert_level NN, resolved NN, changed_by, alert_detail). | The single audit channel for snapshot capture + ruleset activation events. |

No-law-bump anchor: normative_registry/law_catalog/governance_docs are untouched by SB-12. measurement_registry.law_code references laws but versioning the rule-set is an operational act, not a legislative one (§38.5).


38.3 Reuse / Extend / New decision

| Need | Decision | Rationale (discover-first / reuse-first, law §5) |
|---|---|---|
| Source snapshot store | REUSE evolution_snapshots as-is (zero schema change). | The table already is a scoped, metrics-jsonb snapshot ledger with delta_previous. Governance writes rows with scope='governance.<run-kind>' and metrics = per-group fingerprint map. Minting a parallel snapshot table = a second roof (forbidden). |
| Per-group fingerprint | REUSE: store inside evolution_snapshots.metrics jsonb as a {group_key → fingerprint} map. | jsonb holds the per-group map; no per-group row explosion; one snapshot row per run (§38.4). |
| Ruleset-version registry | NEW tiny governance_ruleset registry row (recommended, Option B) or REUSE evolution_snapshots scope='governance.ruleset' (Option A fallback). | Ruleset version is a different grain (aggregate version of the whole active rule-set) and needs governable semantics (owner, activation approval, status) for C-7. A tiny additive ROW registry, modeled on the live measurement_registry/derived_objects_registry row-registry idiom, is the cleanest. Option A avoids any new table at the cost of weaker governance semantics. Council (C-7) chooses; default = B. |
| Hash inputs (rules-as-data) | REUSE measurement_registry + (when live) profile/axis/scope registries. | Rules are already data; ruleset_version is a deterministic content hash, not new rule storage. |
| Audit | REUSE registry_changelog. | One audit channel; no third log minted. |

Net: SB-12 = 0 new tables for snapshots (reuse evolution_snapshots) + at most 1 new tiny registry for rulesets (governance_ruleset, or reuse evolution_snapshots). Minimal additive footprint; no island.


38.4 Source-snapshot identity, capture, and structure

Identity

source_snapshot_ref = evolution_snapshots.id (integer). A candidate-state row (SB-10) stores this integer. A snapshot is immutable once written (append-only; never updated).

Row shape (reusing evolution_snapshots)

evolution_snapshots
  id            int            -- = source_snapshot_ref carried by candidate-state rows
  snapshot_at   timestamptz    -- capture time (run start)
  scope         text           -- 'governance.backfill' | 'governance.audit' | 'governance.scan'
  metrics       jsonb          -- { run_id, ruleset_version, totals:{...},
                               --   groups: { <group_key>: { count, max_born_at, max_id,
                               --                              content_hash, source_registry } } }
  delta_previous jsonb         -- per-group delta vs prior snapshot of same scope (drift map)
  notes         text           -- worker_name, batch range, phase

Capture

  • Captured at run start by the backfill sweep (doc 31, phase='seeding'/'reconciling') and by every periodic full audit (doc 34 §5 safety net). Incremental/event-driven scans (doc 34) reference the most recent full-audit snapshot for their scope and compute only the affected group's fingerprint delta.
  • Per-group fingerprint computed by keyset aggregate over the authoritative inventory at governance grain: count(*), max(born_at), max(id), and a content_hash = md5(string_agg(entity_code, ',' ORDER BY id)) (or a rolling hash) per group_key. For 1.04M rows this is a GROUP BY over the group dimensions, executed in batches that each stay within the 5 s read timeout (Branch F §4 control #1).
  • delta_previous = for each group, {count_delta, new_max_id, fingerprint_changed:bool} vs the prior same-scope snapshot. This is the drift map that drives invalidation (§38.6).
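
The fingerprint and drift-map computation described in the bullets above can be sketched in a few lines. This is an in-memory illustration under stated assumptions: rows arrive as dicts with `group_key`, `id`, `born_at` (ISO-8601 strings, so they sort lexicographically) and `entity_code`; the comma delimiter and the function names are hypothetical; the real capture would run as batched GROUP BY queries and would also record `source_registry` per group.

```python
import hashlib
from collections import defaultdict


def group_fingerprints(rows):
    """Compute {group_key: fingerprint} for an iterable of inventory row dicts."""
    by_group = defaultdict(list)
    for r in rows:
        by_group[r["group_key"]].append(r)
    out = {}
    for group_key, members in by_group.items():
        members.sort(key=lambda r: r["id"])  # canonical order, mirrors ORDER BY id
        joined = ",".join(r["entity_code"] for r in members)
        out[group_key] = {
            "count": len(members),
            "max_born_at": max(r["born_at"] for r in members),
            "max_id": members[-1]["id"],
            "content_hash": hashlib.md5(joined.encode("utf-8")).hexdigest(),
        }
    return out


def delta_previous(prev, curr):
    """Per-group drift map of the current snapshot vs the prior same-scope snapshot."""
    delta = {}
    for group_key in sorted(set(prev) | set(curr)):
        p, c = prev.get(group_key), curr.get(group_key)
        delta[group_key] = {
            "count_delta": (c["count"] if c else 0) - (p["count"] if p else 0),
            "new_max_id": c["max_id"] if c else None,
            "fingerprint_changed": p != c,
        }
    return delta
```

Only content goes into the hash (no capture timestamps), so two captures of an unchanged group yield the same fingerprint and `fingerprint_changed` stays false.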

Why per-group-in-jsonb, not per-group-row

The live evolution_snapshots holds one coarse global row. Per-group rows would re-introduce row explosion (78 registries × N classes × axes). The jsonb groups map keeps it to one row per run (Δrows ≈ number of runs, not number of groups) while remaining queryable (metrics->'groups'->'<group_key>'). This honors the "no per-row explosion" constraint (doc 34 §3, M-DEF-7).


38.5 Ruleset-version identity and versioning without law version-bump

Identity (content-addressed)

ruleset_version = 'gov-rs-' || left(
   sha256( canonical_json([
       -- component 1: enabled governance-relevant detector rows
       measurement_registry rows WHERE enabled ORDER BY measurement_id
         → (measurement_id, law_code, method, source_query, target_query, comparison, severity),
       -- component 2: coverage-profile registry (M-DEF-2)   [when live]
       -- component 3: Axis Registry (M-DEF-9)                [when live; interim: pivot_definitions+law_jurisdiction]
       -- component 4: governance_responsibility_scope rows   [SB-2; when live]
   ]) ), 12)
  • Canonical ordering is mandatory (ORDER BY each component's key) so the hash is deterministic; otherwise row-order noise causes spurious bumps (§38.7 failure mode).
  • Components that are absent (profile/axis/scope pre-SB-2/SB-3) hash as an explicit empty marker {component:'X', state:'absent'} — never silently skipped — so a later activation of that component deterministically bumps the ruleset (and raises axis_unregistered/inventory findings, never a hidden default).
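
A minimal sketch of the content-addressed identity follows, assuming that "canonical_json" can be approximated by `json.dumps` with sorted keys and compact separators. The component labels (`detectors`, `profiles`, `axes`, `scopes`) and the function name are hypothetical; passing `None` for a component stands for "absent", which is hashed as an explicit marker rather than skipped.

```python
import hashlib
import json

# Fields of measurement_registry that feed the hash (from the identity formula above).
DETECTOR_FIELDS = ("measurement_id", "law_code", "method", "source_query",
                   "target_query", "comparison", "severity")


def ruleset_version(measurement_rows, profiles=None, axes=None, scopes=None):
    """Return (ruleset_version, content_hash) for the given rule components."""
    detectors = sorted(
        ({f: r[f] for f in DETECTOR_FIELDS} for r in measurement_rows if r["enabled"]),
        key=lambda r: r["measurement_id"],
    )
    components = [{"component": "detectors", "rows": detectors}]
    for name, rows in (("profiles", profiles), ("axes", axes), ("scopes", scopes)):
        if rows is None:
            # Absent components hash as an explicit marker, never silently skipped.
            components.append({"component": name, "state": "absent"})
        else:
            ordered = sorted(rows, key=lambda r: json.dumps(r, sort_keys=True))
            components.append({"component": name, "rows": ordered})
    canonical = json.dumps(components, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "gov-rs-" + digest[:12], digest
```

Because every component is sorted before serialization, feeding the same rows in a different order yields the same version string, while a component moving from absent to present (even to an empty list) changes it.
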

Registry row shape (Option B)

governance_ruleset                                   -- NEW, additive, greenfield
  ruleset_version   text   PK   -- the 'gov-rs-<hash>' string above
  content_hash      text         -- the raw sha256
  components        jsonb         -- { measurement_ids:[...], profile_ver, axis_ver, scope_ver,
                                  --   law_codes:[...], absent:[...] }  (provenance of what was hashed)
  activated_at      timestamptz
  activated_by      text          -- owner / approver (C-7)
  approval_ref      text          -- APR id when activation is approved (Đ32) — design-only
  status            text          -- 'active' | 'superseded' | 'draft'
  supersedes        text          -- prior ruleset_version (immutable version chain)
  notes             text

Exactly one status='active' ruleset exists per governance scope-family at a time. This is intended as a partial-unique key; it is stated here by convention only and must be confirmed at build by an operator with full privileges, because live PK introspection returned empty for the read-only role.
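
The single-active rule and the immutable `supersedes` chain can be illustrated with a small in-memory model. This is a sketch for one scope-family only; the `RulesetRow` class and `activate` function are hypothetical names, and the real activation would additionally require the approval and audit steps described in this section.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RulesetRow:
    ruleset_version: str
    status: str                      # 'draft' | 'active' | 'superseded'
    supersedes: Optional[str] = None


def activate(registry: list[RulesetRow], version: str) -> list[RulesetRow]:
    """Return a new registry in which `version` is the only active row."""
    if not any(r.ruleset_version == version and r.status == "draft" for r in registry):
        raise ValueError(f"{version} is not a draft ruleset")
    prior = [r for r in registry if r.status == "active"]
    if len(prior) > 1:
        raise ValueError("more than one active ruleset")
    prior_version = prior[0].ruleset_version if prior else None
    out = []
    for r in registry:
        if r.ruleset_version == version:
            out.append(replace(r, status="active", supersedes=prior_version))
        elif r.status == "active":
            out.append(replace(r, status="superseded"))  # flipped, never deleted
        else:
            out.append(r)
    return out
```

Rows are only re-labelled, so the registry never shrinks and every historical `ruleset_version` stays resolvable.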

Versioning is operational, not legislative — the no-law-bump guarantee

  • A ruleset bump is a config-version event, recorded in governance_ruleset + registry_changelog. It does not write normative_registry, law_catalog, or governance_docs; it does not enact, version-bump, or change the status of any law.
  • The ruleset references the laws it encodes via measurement_registry.law_code and components.law_codes[] (traceability), but the law version lifecycle stays entirely inside the law substrate (L-1/L-2). This cleanly separates "the rules changed because a law was amended" (a law event, upstream) from "the active detector set was re-versioned" (an operational SB-12 event, downstream). Acceptance test §38.10 #4 asserts normative_registry is unchanged across a ruleset bump.

Owner of ruleset → C-7 (open)

  • Default proposed: policy ownership = GOV-COUNCIL (a ruleset bump that changes detection scope is a policy act); GOV-SIV proposes the bump (it computes the hash and the diff); activation via APR quorum (Đ32, fn_apr_quorum_check). Auto-activation on pure additive measurement rows MAY be allowed via an allowlist (cf. doc 27 auto-approve hardening) — council decision.
  • Until C-7 rules, governance_ruleset rows may be computed and recorded (draft) but not activated; a candidate verdict referencing a draft ruleset is treated as unknown for high-risk objects (fail-closed). Unowned ruleset → finding ruleset_unowned.

38.6 Relationship to source registry / group / axis / lifecycle, and what invalidates a verdict

group_key (doc 34 §3) = hash(object_class, source_registry, axis_family, scope, lifecycle_status, owner_scope). SB-12 attaches a fingerprint to each group in a snapshot, and a single ruleset hash to the run.

A prior verdict is invalidated when (and only when):

| Change | Detected via | Invalidates |
|---|---|---|
| Source drift: group's row-set changed (count, new max born_at/id, content hash) | delta_previous map in next snapshot; handoff kinds #1–#6 (doc 32 §3) | only the drifted group(s) → dirty=true, dirty_reason='snapshot_drift' on those candidate rows |
| Axis introduced / axis policy changed | handoff #7; Axis Registry / interim pivot_definitions change → ruleset component 3 changes | groups in the affected axis_family (ruleset bump scoped to axis) |
| Policy changed (law amended → measurement_registry row enabled/edited; M-DEF-2 profile changed) | handoff #8; ruleset component 1/2 changes | groups in the changed rule's scope (not blanket) |
| Owner/approval/exception changed | handoff #9 (SB-2/Đ32) | owner_scope + scope groups |
| Ruleset version bump (any hashed component changes) | new governance_ruleset active row; ruleset_version differs from candidate row's | groups in the changed rule's scope; auto-close re-keyed by (coalesce_key, ruleset_version) (doc 34 §9, doc 35 §3.2 patch #8) |
| TTL expiry | SB-10 stale_after | the single expired candidate (time-based; not an SB-12 concern but consumes SB-12 to re-fingerprint on re-scan) |

Targeted, never blanket. Snapshot drift dirties only groups whose fingerprint changed (delta_previous); a ruleset bump dirties only groups whose governing rule changed (the components.law_codes/scope mapping decides which). The whole point is Δother = 0 (acceptance §38.10 #2/#3).
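
The targeting rule can be sketched as a selection over groups. The example assumes a hypothetical mapping from each group to the law codes that govern it (the text says only that the components.law_codes/scope mapping decides this) and a hypothetical `ruleset_bump` reason string; `snapshot_drift` is the reason named in the table above.

```python
def dirty_groups(delta_previous, group_law_codes, changed_law_codes):
    """Return {group_key: dirty_reason} for the groups that must be re-evaluated.

    delta_previous:     {group_key: {"fingerprint_changed": bool, ...}}
    group_law_codes:    {group_key: [law_code, ...]}  (hypothetical mapping)
    changed_law_codes:  law codes whose rules differ between the two rulesets
    """
    changed = set(changed_law_codes)
    dirty = {}
    for group_key, d in delta_previous.items():
        if d["fingerprint_changed"]:
            dirty[group_key] = "snapshot_drift"
    for group_key, law_codes in group_law_codes.items():
        if group_key not in dirty and changed & set(law_codes):
            dirty[group_key] = "ruleset_bump"
    return dirty
```

Groups that appear in neither pass are absent from the result, which is the Δother = 0 property the acceptance tests check.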


38.7 Audit, retention, failure modes

Audit trail

  • Snapshot capture → registry_changelog row (entity_type='governance_snapshot', entity_code=evolution_snapshots.id, action='capture', alert_level='info', changed_by=worker_name); the evolution_snapshots row itself is the durable artifact.
  • Ruleset activation → registry_changelog row (entity_type='governance_ruleset', entity_code=ruleset_version, action='activate', changed_by=activated_by), plus governance_ruleset.approval_ref.
  • No third audit channel. governance_audit_log (relation-scoped, 1 stale row) stays untouched (consistent with SB-7 decision, doc 25).

Retention

  • Snapshots: retain every snapshot referenced by any live candidate-state row (soft-FK on source_snapshot_ref). Beyond that, retain at least the last full-audit snapshot per scope + the last N (config, default 12) full-audit cycles for drift history. Prune older unreferenced snapshots only via a logged summary (snapshot_pruned count); never silent (constitution "no silent caps").
  • Rulesets: retained forever (immutable version chain via supersedes); status flips active→superseded, rows never deleted. This is the reproducibility ledger: "verdict was computed under gov-rs-abc123" must always resolve.
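
The snapshot retention rule can be sketched as a pure function that lists prune candidates. It is a simplification under stated assumptions: every snapshot passed in is treated as a full-audit snapshot, `snapshot_at` is any sortable value, and the function name and `keep_last_n` parameter are hypothetical. The caller would log the length of the result as the `snapshot_pruned` summary count.

```python
def prunable_snapshots(snapshots, referenced_ids, keep_last_n=12):
    """Return sorted ids of snapshots that are safe to prune.

    snapshots:      list of dicts with "id", "scope", "snapshot_at"
    referenced_ids: ids still referenced by live candidate-state rows
    keep_last_n:    most recent snapshots to retain per scope (default 12)
    """
    keep = set(referenced_ids)
    by_scope = {}
    for s in snapshots:
        by_scope.setdefault(s["scope"], []).append(s)
    for rows in by_scope.values():
        newest_first = sorted(rows, key=lambda s: s["snapshot_at"], reverse=True)
        keep.update(s["id"] for s in newest_first[:keep_last_n])
    return sorted(s["id"] for s in snapshots if s["id"] not in keep)
```

A referenced snapshot is never returned, regardless of age, which is what keeps `source_snapshot_ref` from dangling for live candidates.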

Failure modes

| Mode | Behavior (fail-closed) |
|---|---|
| Snapshot capture fails mid-run | Cursor (SB-13) does not advance; partial snapshot row not referenced; retry from watermark; raise snapshot_capture_failed (high). |
| Non-deterministic hash (ordering / clock) | Mandatory canonical ORDER BY + content-only hashing (no timestamps in hash). If two captures of an unchanged group differ → raise snapshot_nondeterministic (high); freeze ruleset bumps until resolved. |
| Missing source registry for a group | Cannot fingerprint → raise backfill_inventory_gap (doc 31), fail closed for that group; group stays unknown (high-risk → G-PROD blocks). Never invents a fingerprint. |
| Dangling source_snapshot_ref (candidate references pruned snapshot) | Treat candidate as stale/unknown (fail-closed); schedule re-scan; raise snapshot_ref_dangling (medium). Retention rule above prevents this for live candidates. |
| Ruleset unowned (C-7 unresolved) | Rulesets computable as draft; verdicts under draft ruleset = unknown for high-risk; raise ruleset_unowned. |
| Component absent then later present (Axis Registry born) | Deterministic ruleset bump (absent-marker → present) dirties affected groups; no hidden default. |
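
Two of the fail-closed behaviors listed above (dangling snapshot reference, draft ruleset) can be sketched as a verdict resolver. The function name, the dict shapes, and the returned tuple are hypothetical; how non-high-risk objects under a draft ruleset are treated is not specified in this design, so keeping their recorded verdict is an assumption of the sketch.

```python
def resolve_verdict(candidate, live_snapshot_ids, ruleset_status, high_risk):
    """Return (verdict, findings, blocks_g_prod) for one candidate row.

    candidate:         dict with "verdict", "source_snapshot_ref", "ruleset_version"
    live_snapshot_ids: ids that still exist in evolution_snapshots
    ruleset_status:    {ruleset_version: 'draft' | 'active' | 'superseded'}
    """
    findings = []
    verdict = candidate["verdict"]
    if candidate["source_snapshot_ref"] not in live_snapshot_ids:
        findings.append("snapshot_ref_dangling")
        verdict = "unknown"          # never a silent "still clean"
    if ruleset_status.get(candidate["ruleset_version"]) == "draft":
        findings.append("ruleset_unowned")
        if high_risk:
            verdict = "unknown"
    blocks_g_prod = high_risk and verdict == "unknown"
    return verdict, findings, blocks_g_prod
```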

38.8 Relation to backfill (doc 31) and dirty-group candidate scan (doc 34)

  • Backfill (doc 31 §5): the sweep captures the seed snapshot at phase='seeding' start and computes the initial ruleset_version; every seeded candidate row carries (source_snapshot_ref, ruleset_version). SB-12 is the registry those refs point at.
  • Dirty-group (doc 34 §2/§4): the candidate-state store keys verdicts on the SB-12 triple; invalidation triggers (§38.6) consume the snapshot delta_previous and the ruleset_version diff to decide which groups go dirty. The periodic full audit (doc 34 §5) re-captures a snapshot and re-hashes the ruleset — the safety net that catches any lost dirty signal.
  • Handoff (doc 32 §3 #8/#10): policy_changed/ruleset_changed handoff signals are exactly the triggers that cause a ruleset_version recompute; SB-12 turns them into a scoped invalidation rather than a blanket rescan.

38.9 No-hardcode / no-island attestation

  • No-hardcode: snapshot scopes, group dimensions, and ruleset components are all discovered from registries (evolution_snapshots.scope is data; group dims from meta_catalog/birth_registry/pivot_definitions/law_jurisdiction/SB-2 scopes; ruleset hash over measurement_registry rows). No axis array, no object-class list, no rule literal in code. Absent components hash as explicit absent-markers, not silent defaults.
  • No-island: snapshots reuse the one evolution_snapshots ledger; audit reuses the one registry_changelog; rules reuse the one measurement_registry. The only additive object is the tiny governance_ruleset registry (Option B) — a config-version registry, not a parallel governance roof. No second snapshot store, no second rule store, no second audit log.

38.10 Acceptance tests (must pass at build; cannot run now — design-only)

  1. Reproducibility: same enabled measurement_registry subset + same component versions ⇒ identical ruleset_version hash across repeated computation (determinism). A recorded verdict resolves to a human-readable statement: "object X was not_relevant under ruleset gov-rs-abc123, snapshot id=512 (captured 2026-06-10)."
  2. Targeted snapshot invalidation: mutate one group's inventory (add rows) ⇒ next snapshot's delta_previous flags only that group; only that group's candidate rows go dirty; Δother_groups = 0.
  3. Targeted ruleset invalidation: enable one measurement_registry row scoped to scope S ⇒ ruleset_version bumps; only groups in S are dirtied; auto-close keyed by (coalesce_key, ruleset_version) does not mask the re-open.
  4. No law version-bump: across a ruleset bump, normative_registry, law_catalog, governance_docs row-sets and versions are byte-identical (the bump is purely operational).
  5. Scale: per-group fingerprint over 1,037,724 born rows / 78 registries completes within the batch budget (each GROUP BY batch < 5 s read timeout).
  6. Fail-closed: a missing source registry / non-deterministic hash / dangling snapshot ref yields unknown/finding and blocks G-PROD for high-risk objects, never a silent "still clean."
  7. Audit completeness: every snapshot capture and ruleset activation has a matching registry_changelog row; pruning emits a summary count.
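
Acceptance test 4 (no law version-bump) could be checked by digesting the law-substrate tables before and after a ruleset bump. The following is a sketch only: the helper names are hypothetical, rows are assumed to be fetched as dicts by a read-only role, and an order-insensitive row-set digest is used as a stand-in for "byte-identical".

```python
import hashlib
import json


def rowset_digest(rows):
    """Order-insensitive sha256 over a list of row dicts."""
    canonical = sorted(json.dumps(r, sort_keys=True, default=str) for r in rows)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


def assert_no_law_bump(before, after):
    """before/after: {table_name: [row dicts]} captured around a ruleset bump."""
    changed = [
        t for t in sorted(before)
        if t not in after or rowset_digest(before[t]) != rowset_digest(after[t])
    ]
    if changed:
        raise AssertionError(f"law substrate changed across ruleset bump: {changed}")
```

In use, `before` and `after` would each hold the rows of normative_registry, law_catalog, and governance_docs.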

38.11 Dependencies, gates, and verdict

  • Designable now: YES (done, this doc). Build now: NO.
  • Build gates: (a) evolution_snapshots reuse needs no approval but writing governance rows is part of the GCOS build (gated with SB-10/SB-13); (b) governance_ruleset table creation = DDL = gated (operator + reversible-by-default, law §5); (c) ruleset ownership + activation policy = C-7 (council); (d) ruleset hash components that depend on SB-2 (governance_responsibility_scope) and SB-3/Axis Registry degrade gracefully until those land (absent-markers), so SB-12 is not hard-blocked by SB-2/SB-3 for design or for a born-only initial ruleset.
  • No COMMIT (os_proposal_approvals=0 ⇒ COMMIT_FORBIDDEN, H-1/H-2/SB-6).

SB-12 design verdict: COMPLETE — GO for build-prep, BUILD NO-GO. Source-snapshot = reuse evolution_snapshots (zero schema change). Ruleset-version = new tiny governance_ruleset registry (Option B, recommended) or reuse evolution_snapshots scope='governance.ruleset' (Option A) — C-7 owns the choice + ownership. Reproducible verdicts and targeted (never blanket) invalidation are achievable on the live substrate with minimal additive footprint and zero law-version coupling.

(Cross-refs: doc 31 §5, doc 34 §2/§4/§9, doc 35 §3.2 patch #8 + Branch F controls #5/#8/#10, doc 39 SB-13, doc 40 SB-10, doc 41 SB-11, doc 42 integration.)

knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/38-sb12-source-snapshot-ruleset-version-detailed-design.md