38 — SB-12 Source-Snapshot + Ruleset-Version Registry — Detailed Technical Design

Package: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Track: GCOS substrate (Governance Candidate & Onboarding Substrate). Blocker SB-12.
Status: Detailed technical design ONLY. BUILD NO-GO. No DDL/DML, no table/view/function/trigger creation, no event registration, no DOT registration, no approval. KB document only.
Reads / controls: doc 00 (controlling index) → concept canon → Round-4 law → knowledge/dev/laws/prompt-muc-tieu-mo-for-claude-code.md (operating constitution).
Builds on docs 31 (backfill §5), 34 (dirty-group §2/§4), 35 (Branch E/F + SB-12 register). T6/T7 = docs 25/24 (unchanged).
Date: 2026-06-01 · Mutation footprint: KB document only. Zero PG/Directus/Qdrant/Nuxt mutation. Read-only PG used for live validation (role context_pack_readonly).

38.0 §0-GOV — governed objects this design introduces

governed_object	class	grain	purpose
`source_snapshot`	Class-2 process record (reuses `evolution_snapshots`)	one row per scan/backfill/audit run, per scope	reproducible inventory fingerprint that a candidate verdict is keyed to
`ruleset_version`	Class-2 governed config-version record (new `governance_ruleset` registry row)	one row per active rule-set hash over time	reproducible rule identity; targeted invalidation when rules change

Issue/event types introduced (register-before-emit; NOT registered by this design; they ride the governance domain that SB-11 owns): snapshot_capture_failed, snapshot_nondeterministic, ruleset_unowned, ruleset_drift, snapshot_ref_dangling. (Severity computed per T7; routes per doc 24 §7.)

38.1 Problem statement (what SB-12 must make true)

The T6 coverage detector (doc 25) and the candidate scan (doc 34) produce verdicts ("object X is not_relevant", "group G is covered"). Doc 34's keystone rule forbids storing "checked forever." A verdict is only meaningful when it is qualified by (a) the state of the inputs it saw and (b) the rules it applied. SB-12 is the substrate that makes both reproducible and invalidatable:

Source snapshot = an immutable fingerprint of the inventory/state the scan read, per scope/group, at a point in time.
Ruleset version = an immutable, content-addressed identity of the active rule-set (detectors + coverage profiles + axis registry + responsibility scopes) the scan applied — versioned without bumping any law version.

Together they make every candidate-state row carry the triple (candidate_key, source_snapshot_ref, ruleset_version) (doc 31 §4.3, doc 34 §2). Re-evaluation happens only when the snapshot drifts, the ruleset bumps, or the TTL expires (TTL owned by SB-10). This is what turns "re-scan all 1.04M every pass" into "re-scan only what provably changed."
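
The re-evaluation rule above can be sketched as a small predicate. This is a minimal illustration under stated assumptions, not the SB-10 implementation: the helper fields `group_fingerprint` and `stale_after` on the row, the reason labels, and the function name `needs_reevaluation` are hypothetical and exist only to show the three triggers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CandidateState:
    # The triple every candidate-state row carries (doc 31 §4.3, doc 34 §2).
    candidate_key: str
    source_snapshot_ref: int
    ruleset_version: str
    # Hypothetical helper fields, assumed for this sketch only.
    group_fingerprint: str   # fingerprint of the group in the referenced snapshot
    stale_after: datetime    # TTL boundary (owned by SB-10)


def needs_reevaluation(row, current_fingerprint, active_ruleset, now):
    """Return the reasons a stored verdict must be recomputed (empty = keep)."""
    reasons = []
    if row.group_fingerprint != current_fingerprint:
        reasons.append("snapshot_drift")
    if row.ruleset_version != active_ruleset:
        reasons.append("ruleset_bump")
    if now >= row.stale_after:
        reasons.append("ttl_expired")
    return reasons


row = CandidateState(
    candidate_key="cand-001",
    source_snapshot_ref=512,
    ruleset_version="gov-rs-aaaaaaaaaaaa",
    group_fingerprint="f1",
    stale_after=datetime(2026, 7, 1, tzinfo=timezone.utc),
)
now = datetime(2026, 6, 10, tzinfo=timezone.utc)
unchanged = needs_reevaluation(row, "f1", "gov-rs-aaaaaaaaaaaa", now)
drifted = needs_reevaluation(row, "f2", "gov-rs-aaaaaaaaaaaa", now)
```

An empty list means the stored verdict is still valid, so the candidate is skipped rather than re-scanned.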

38.2 Live PG validation (read-only, re-verified 2026-06-01 — live wins over cited values)

Object	Live finding (this session)	Implication for SB-12
`evolution_snapshots`	1 row. `id int PK`, `snapshot_at tstz NN now()`, `scope text NN default 'global'`, `metrics jsonb NN`, `delta_previous jsonb default '{}'`, `notes text`. The single row: `id=1, scope='global', snapshot_at=2026-04-04`, `metrics={dot_count:252, edge_count:2193, domain_count:10, entity_count:40, kg_dot_count:36, config_tables:7, quality_log_entries:0}`.	Shape is exactly a snapshot registry: `scope` + `metrics jsonb` + `delta_previous`. Only a global row exists; no per-group / per-scope governance snapshot → SB-12 reuses this table by writing governance-scoped rows. No schema change needed.
`measurement_registry`	142 rows, 140 enabled. Cols incl. `measurement_id text`, `measurement_name`, `law_code text NN`, `method smallint`, `source_query text NN`, `target_query text NN`, `comparison text default 'strict_equals'`, `severity`, `enabled bool`, `auto_generated bool`, `last_run_at/result/evidence`.	This is the data-driven rule content source. `ruleset_version` is a hash over the governance-relevant enabled subset of these rows ⊕ profile/axis/scope registries. Content (`source_query`/`target_query`/`comparison`) is exactly what must be hashed.
`governance_ruleset`	ABSENT (0 columns).	The ruleset-version registry must be created (greenfield) — or substituted by an `evolution_snapshots` row (Option A below).
`governance_responsibility_scope`	ABSENT (SB-2 greenfield).	One of the ruleset hash components; until SB-2 live, the scope component is empty/derived — ruleset still hashes deterministically over whatever components exist (fail-closed, see §38.7).
Axis Registry (M-DEF-9)	ABSENT (interim via `pivot_definitions`=37 + `law_jurisdiction`=43).	Another ruleset component; interim derivation hashed; absence is itself a finding (`axis_unregistered`), never a silent fallback.
`registry_changelog`	68,323 rows. `(entity_type, entity_code, action, timestamp(non-tz), alert_level NN, resolved NN, changed_by, alert_detail)`.	The single audit channel for snapshot capture + ruleset activation events.

No-law-bump anchor: normative_registry/law_catalog/governance_docs are untouched by SB-12. measurement_registry.law_code references laws but versioning the rule-set is an operational act, not a legislative one (§38.5).

38.3 Reuse / Extend / New decision

Need	Decision	Rationale (discover-first / reuse-first, law §5)
Source snapshot store	REUSE `evolution_snapshots` as-is (zero schema change).	The table already is a scoped, metrics-jsonb snapshot ledger with `delta_previous`. Governance writes rows with `scope='governance.<run-kind>'` (for example `governance.backfill`) and `metrics` = per-group fingerprint map. Minting a parallel snapshot table = a second roof (forbidden).
Per-group fingerprint	REUSE — store inside `evolution_snapshots.metrics` jsonb as a `{group_key → fingerprint}` map.	jsonb holds the per-group map; no per-group row explosion; one snapshot row per run (§38.4).
Ruleset-version registry	NEW tiny `governance_ruleset` row registry (one small additive table; recommended, Option B), or REUSE `evolution_snapshots` with `scope='governance.ruleset'` (Option A fallback).	Ruleset version is a different grain (aggregate version of the whole active rule-set) and needs governable semantics (owner, activation approval, status) for C-7. A tiny additive row registry, modeled on the live `measurement_registry`/`derived_objects_registry` row-registry idiom, is the cleanest. Option A avoids any new table at the cost of weaker governance semantics. Council (C-7) chooses; default = B.
Hash inputs (rules-as-data)	REUSE `measurement_registry` + (when live) profile/axis/scope registries.	Rules are already data; `ruleset_version` is a deterministic content hash, not new rule storage.
Audit	REUSE `registry_changelog`.	One audit channel; no third log minted.

Net: SB-12 = 0 new tables for snapshots (reuse evolution_snapshots) + at most 1 new tiny registry for rulesets (governance_ruleset, or reuse evolution_snapshots). Minimal additive footprint; no island.

38.4 Source-snapshot identity, capture, and structure

Identity

source_snapshot_ref = evolution_snapshots.id (integer). A candidate-state row (SB-10) stores this integer. A snapshot is immutable once written (append-only; never updated).

Row shape (reusing `evolution_snapshots`)

evolution_snapshots
  id            int            -- = source_snapshot_ref carried by candidate-state rows
  snapshot_at   timestamptz    -- capture time (run start)
  scope         text           -- 'governance.backfill' | 'governance.audit' | 'governance.scan'
  metrics       jsonb          -- { run_id, ruleset_version, totals:{...},
                               --   groups: { <group_key>: { count, max_born_at, max_id,
                               --                              content_hash, source_registry } } }
  delta_previous jsonb         -- per-group delta vs prior snapshot of same scope (drift map)
  notes         text           -- worker_name, batch range, phase

Capture

Captured at run start by the backfill sweep (doc 31, phase='seeding'/'reconciling') and by every periodic full audit (doc 34 §5 safety net). Incremental/event-driven scans (doc 34) reference the most recent full-audit snapshot for their scope and compute only the affected group's fingerprint delta.
Per-group fingerprint computed by keyset aggregate over the authoritative inventory at governance grain: count(*), max(born_at), max(id), and a content_hash = md5(string_agg(entity_code, ',' ORDER BY id)) (or a rolling hash) per group_key. string_agg requires an explicit delimiter; ',' is shown as a placeholder. For 1.04M rows this is a GROUP BY over the group dimensions, executed in batches that each stay within the 5 s read timeout (Branch F §4 control #1).
delta_previous = for each group, {count_delta, new_max_id, fingerprint_changed:bool} vs the prior same-scope snapshot. This is the drift map that drives invalidation (§38.6).
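
The capture steps above can be sketched in memory. This is a minimal sketch, not the batched keyset aggregate the build would run against PG: the input row layout (`group_key`, `id`, `born_at`, `entity_code`), the comma delimiter, and the handling of groups that disappear are assumptions of the example. `born_at` is assumed to be an ISO-8601 string so that `max` orders it correctly.

```python
import hashlib
from collections import defaultdict


def fingerprint_groups(rows):
    """Compute {group_key: fingerprint} from inventory rows."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["group_key"]].append(r)
    result = {}
    for key, members in grouped.items():
        members.sort(key=lambda r: r["id"])  # canonical order: by id
        joined = ",".join(m["entity_code"] for m in members)
        result[key] = {
            "count": len(members),
            "max_born_at": max(m["born_at"] for m in members),
            "max_id": members[-1]["id"],
            "content_hash": hashlib.md5(joined.encode("utf-8")).hexdigest(),
        }
    return result


def delta_previous(prev, curr):
    """Per-group drift map versus the prior same-scope snapshot."""
    delta = {}
    for key, fp in curr.items():
        old = prev.get(key)
        delta[key] = {
            "count_delta": fp["count"] - (old["count"] if old else 0),
            "new_max_id": fp["max_id"],
            "fingerprint_changed": old != fp,
        }
    for key, old in prev.items():
        if key not in curr:  # assumption: a vanished group counts as drift
            delta[key] = {"count_delta": -old["count"], "new_max_id": None,
                          "fingerprint_changed": True}
    return delta


rows_v1 = [
    {"group_key": "g1", "id": 1, "born_at": "2026-04-01", "entity_code": "E-1"},
    {"group_key": "g1", "id": 2, "born_at": "2026-04-01", "entity_code": "E-2"},
    {"group_key": "g2", "id": 3, "born_at": "2026-04-02", "entity_code": "E-3"},
]
rows_v2 = rows_v1 + [
    {"group_key": "g2", "id": 4, "born_at": "2026-05-01", "entity_code": "E-4"},
]
drift = delta_previous(fingerprint_groups(rows_v1), fingerprint_groups(rows_v2))
```

Only `g2` is flagged in `drift`; `g1` keeps `fingerprint_changed` false because its sorted content is identical in both captures.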

Why per-group-in-jsonb, not per-group-row

The live evolution_snapshots holds one coarse global row. Per-group rows would re-introduce row explosion (78 registries × N classes × axes). The jsonb groups map keeps it to one row per run (Δrows ≈ number of runs, not number of groups) while remaining queryable (metrics->'groups'->'<group_key>'). This honors the "no per-row explosion" constraint (doc 34 §3, M-DEF-7).
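
To make the one-row-per-run shape concrete, the sketch below assembles a `metrics` payload and reads one group back out of it. It is illustrative only: the function names, the group keys, the hash strings, and the `source_registry` values are placeholders, and the contents of `totals` are not specified in this design, so the value passed here is a stand-in.

```python
import json


def build_snapshot_metrics(run_id, ruleset_version, totals, groups):
    """Assemble the metrics payload for one run's snapshot row."""
    return {
        "run_id": run_id,
        "ruleset_version": ruleset_version,
        "totals": totals,    # contents left to the build; opaque here
        "groups": groups,    # {group_key: fingerprint}, one entry per group
    }


def group_fingerprint(metrics, group_key):
    """Look up one group's fingerprint in the groups map; None if absent."""
    return metrics["groups"].get(group_key)


groups = {
    "g1": {"count": 2, "max_born_at": "2026-04-01", "max_id": 2,
           "content_hash": "placeholder-hash-1", "source_registry": "registry_a"},
    "g2": {"count": 1, "max_born_at": "2026-04-02", "max_id": 3,
           "content_hash": "placeholder-hash-2", "source_registry": "registry_b"},
}
metrics = build_snapshot_metrics(
    run_id="run-001",
    ruleset_version="gov-rs-aaaaaaaaaaaa",
    totals={"rows": 3},  # placeholder value
    groups=groups,
)
payload = json.dumps(metrics, sort_keys=True)  # what would be stored as jsonb
```

However many groups the run covers, the result is still a single payload for a single row, which is the property the constraint asks for.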

38.5 Ruleset-version identity and versioning without law version-bump

Identity (content-addressed)

ruleset_version = 'gov-rs-' || left(
   encode( sha256( convert_to( canonical_json([   -- canonical_json(...) is pseudocode for the ordered serialization
       -- component 1: enabled governance-relevant detector rows
       measurement_registry rows WHERE enabled ORDER BY measurement_id
         → (measurement_id, law_code, method, source_query, target_query, comparison, severity),
       -- component 2: coverage-profile registry (M-DEF-2)   [when live]
       -- component 3: Axis Registry (M-DEF-9)                [when live; interim: pivot_definitions+law_jurisdiction]
       -- component 4: governance_responsibility_scope rows   [SB-2; when live]
   ]), 'UTF8' ) ), 'hex' ), 12)   -- sha256 yields bytea: hex-encode it, then keep the first 12 hex chars

Canonical ordering is mandatory (ORDER BY each component's key) so the hash is deterministic; otherwise row-order noise causes spurious bumps (§38.7 failure mode).
Components that are absent (profile/axis/scope pre-SB-2/SB-3) hash as an explicit empty marker {component:'X', state:'absent'} — never silently skipped — so a later activation of that component deterministically bumps the ruleset (and raises axis_unregistered/inventory findings, never a hidden default).
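
The identity rule, including canonical ordering and explicit absent-markers, can be sketched as follows. This is a hedged illustration rather than the build implementation: the component labels (`coverage_profile`, `axis_registry`, `responsibility_scope`), the choice of compact sorted-key JSON as the canonical form, and the sample rows are assumptions. Passing `None` for a component means "absent"; passing an empty list means "present but empty", and the two hash differently.

```python
import hashlib
import json

HASHED_FIELDS = ("measurement_id", "law_code", "method", "source_query",
                 "target_query", "comparison", "severity")


def ruleset_version(measurements, profiles=None, axes=None, scopes=None):
    """Return (ruleset_version, full sha256 hex digest)."""
    enabled = sorted((m for m in measurements if m["enabled"]),
                     key=lambda m: m["measurement_id"])
    components = [{
        "component": "measurement_registry",
        "rows": [[m[f] for f in HASHED_FIELDS] for m in enabled],
    }]
    for name, rows in (("coverage_profile", profiles),
                       ("axis_registry", axes),
                       ("responsibility_scope", scopes)):
        if rows is None:
            # Explicit absent-marker: never silently skipped.
            components.append({"component": name, "state": "absent"})
        else:
            components.append({"component": name, "rows": sorted(rows)})
    canonical = json.dumps(components, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "gov-rs-" + digest[:12], digest


def sample_row(measurement_id, enabled=True):
    return {"measurement_id": measurement_id, "law_code": "L-1", "method": 1,
            "source_query": "select 1", "target_query": "select 1",
            "comparison": "strict_equals", "severity": "high",
            "enabled": enabled}


m1, m2, m3 = sample_row("M-001"), sample_row("M-002"), sample_row("M-003", False)
v_a, _ = ruleset_version([m1, m2, m3])
v_b, _ = ruleset_version([m3, m2, m1])                    # same rows, shuffled
v_c, _ = ruleset_version([m1, m2, m3], axes=["axis-a"])   # axis component appears
```

Here `v_a` equals `v_b` because input order is normalized before hashing, while `v_c` differs because the axis component moved from the absent-marker to a present value.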

Registry row (Option B, recommended)

governance_ruleset                                   -- NEW, additive, greenfield
  ruleset_version   text   PK   -- the 'gov-rs-<hash>' string above
  content_hash      text         -- the raw sha256
  components        jsonb         -- { measurement_ids:[...], profile_ver, axis_ver, scope_ver,
                                  --   law_codes:[...], absent:[...] }  (provenance of what was hashed)
  activated_at      timestamptz
  activated_by      text          -- owner / approver (C-7)
  approval_ref      text          -- APR id when activation is approved (Đ32) — design-only
  status            text          -- 'active' | 'superseded' | 'draft'
  supersedes        text          -- prior ruleset_version (immutable version chain)
  notes             text

Exactly one status='active' ruleset per governance scope-family at a time. This is the intended partial-unique key (unique where status='active'). In this design it holds by convention only and must be confirmed at build by an operator with full privileges, because live PK introspection returned empty for the read-only role.
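
A minimal in-memory sketch of the activation step shows how the one-active invariant and the `supersedes` chain fit together. It rests on assumptions: the row shape above does not list a scope-family column, so the `scope_family` key used here is hypothetical, as are the function name `activate` and the sample identifiers.

```python
def activate(registry, version, activated_by, approval_ref):
    """Activate a draft ruleset and supersede the current active one."""
    draft = registry[version]
    if draft["status"] != "draft":
        raise ValueError(f"{version} is not a draft")
    if not activated_by or not approval_ref:
        raise ValueError("activation needs an owner and an approval reference")
    family = draft["scope_family"]  # hypothetical column, see lead-in
    active = [v for v, row in registry.items()
              if row["status"] == "active" and row["scope_family"] == family]
    if len(active) > 1:
        raise RuntimeError(f"more than one active ruleset for {family}")
    for v in active:
        registry[v]["status"] = "superseded"  # status flips; rows are never deleted
    draft.update(
        status="active",
        supersedes=active[0] if active else None,
        activated_by=activated_by,
        approval_ref=approval_ref,
    )
    return draft


registry = {
    "gov-rs-aaaaaaaaaaaa": {"status": "active", "scope_family": "governance",
                            "supersedes": None},
    "gov-rs-bbbbbbbbbbbb": {"status": "draft", "scope_family": "governance",
                            "supersedes": None},
}
activate(registry, "gov-rs-bbbbbbbbbbbb",
         activated_by="GOV-COUNCIL", approval_ref="APR-EXAMPLE")
```

After the call the older row is `superseded`, the new row is `active` and points back at it, and a second attempt to activate the same version is rejected.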

Versioning is operational, not legislative — the no-law-bump guarantee

A ruleset bump is a config-version event, recorded in governance_ruleset + registry_changelog. It does not write normative_registry, law_catalog, or governance_docs; it does not enact, version-bump, or change the status of any law.
The ruleset references the laws it encodes via measurement_registry.law_code and components.law_codes[] (traceability), but the law version lifecycle stays entirely inside the law substrate (L-1/L-2). This cleanly separates "the rules changed because a law was amended" (a law event, upstream) from "the active detector set was re-versioned" (an operational SB-12 event, downstream). Acceptance test §38.10 #4 asserts normative_registry is unchanged across a ruleset bump.

Owner of ruleset → C-7 (open)

Default proposed: policy ownership = GOV-COUNCIL (a ruleset bump that changes detection scope is a policy act); GOV-SIV proposes the bump (it computes the hash and the diff); activation via APR quorum (Đ32, fn_apr_quorum_check). Auto-activation on pure additive measurement rows MAY be allowed via an allowlist (cf. doc 27 auto-approve hardening) — council decision.
Until C-7 is decided, governance_ruleset rows may be computed and recorded (draft) but not activated; a candidate verdict referencing a draft ruleset is treated as unknown for high-risk objects (fail-closed). Unowned ruleset → finding ruleset_unowned.

38.6 Relationship to source registry / group / axis / lifecycle, and what invalidates a verdict

group_key (doc 34 §3) = hash(object_class, source_registry, axis_family, scope, lifecycle_status, owner_scope). SB-12 attaches a fingerprint to each group in a snapshot, and a single ruleset hash to the run.
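
The group key can be sketched as a hash over the six dimensions in a fixed order. The concrete hash is not specified here, so SHA-256 over a compact JSON array is an assumption of this example, as are the function name and the sample dimension values.

```python
import hashlib
import json

# Dimension names, in the order given by the group_key definition.
GROUP_DIMS = ("object_class", "source_registry", "axis_family",
              "scope", "lifecycle_status", "owner_scope")


def group_key(**dims):
    """Hash the six group dimensions into a stable key."""
    if set(dims) != set(GROUP_DIMS):
        raise ValueError(f"expected exactly these dimensions: {GROUP_DIMS}")
    payload = json.dumps([dims[d] for d in GROUP_DIMS],
                         separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = group_key(object_class="dot", source_registry="registry_a",
                  axis_family="axis-a", scope="s1",
                  lifecycle_status="born", owner_scope="o1")
key_b = group_key(owner_scope="o1", lifecycle_status="born", scope="s1",
                  axis_family="axis-a", source_registry="registry_a",
                  object_class="dot")
key_c = group_key(object_class="dot", source_registry="registry_a",
                  axis_family="axis-a", scope="s1",
                  lifecycle_status="retired", owner_scope="o1")
```

Argument order does not matter (`key_a` equals `key_b`), while a change in any single dimension, such as `lifecycle_status`, yields a different group.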

A prior verdict is invalidated when (and only when):

Change	Detected via	Invalidates
Source drift — group's row-set changed (count, new max born_at/id, content hash)	`delta_previous` map in next snapshot; handoff kinds #1–#6 (doc 32 §3)	only the drifted group(s) — `dirty=true, dirty_reason='snapshot_drift'` on those candidate rows
Axis introduced / axis policy changed	handoff #7; Axis Registry / interim `pivot_definitions` change → ruleset component 3 changes	groups in the affected `axis_family` (ruleset bump scoped to axis)
Policy changed (law amended → `measurement_registry` row enabled/edited; M-DEF-2 profile changed)	handoff #8; ruleset component 1/2 changes	groups in the changed rule's scope (not blanket)
Owner/approval/exception changed	handoff #9 (SB-2/Đ32)	`owner_scope` + `scope` groups
Ruleset version bump (any hashed component changes)	new `governance_ruleset` active row; `ruleset_version` differs from candidate row's	groups in the changed rule's scope; auto-close re-keyed by `(coalesce_key, ruleset_version)` (doc 34 §9, doc 35 §3.2 patch #8)
TTL expiry	SB-10 `stale_after`	the single expired candidate (time-based; not an SB-12 concern but consumes SB-12 to re-fingerprint on re-scan)

Targeted, never blanket. Snapshot drift dirties only groups whose fingerprint changed (delta_previous); a ruleset bump dirties only groups whose governing rule changed (the components.law_codes/scope mapping decides which). The whole point is Δother = 0 (acceptance §38.10 #2/#3).
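
The targeted rule can be sketched as two small functions, one per trigger. This is an illustration under assumptions: candidate rows are plain dicts, the `scope` field and the `affected_scopes` argument stand in for whatever mapping the build derives from `components`, and the reason label `ruleset_bump` is hypothetical (only `snapshot_drift` is named in the table above).

```python
def dirty_on_snapshot_drift(candidates, delta_previous):
    """Mark dirty only the candidates whose group fingerprint changed."""
    drifted = {g for g, d in delta_previous.items() if d["fingerprint_changed"]}
    touched = []
    for c in candidates:
        if c["group_key"] in drifted:
            c["dirty"] = True
            c["dirty_reason"] = "snapshot_drift"
            touched.append(c["candidate_key"])
    return touched


def dirty_on_ruleset_bump(candidates, affected_scopes, new_version):
    """Mark dirty only candidates in the scopes of the changed rule."""
    touched = []
    for c in candidates:
        if c["scope"] in affected_scopes and c["ruleset_version"] != new_version:
            c["dirty"] = True
            c["dirty_reason"] = "ruleset_bump"  # hypothetical label
            touched.append(c["candidate_key"])
    return touched


def make_candidate(key, group, scope):
    return {"candidate_key": key, "group_key": group, "scope": scope,
            "ruleset_version": "gov-rs-aaaaaaaaaaaa",
            "dirty": False, "dirty_reason": None}


candidates = [make_candidate("c1", "g1", "s1"),
              make_candidate("c2", "g2", "s1"),
              make_candidate("c3", "g3", "s2")]
drift_map = {"g1": {"fingerprint_changed": False},
             "g2": {"fingerprint_changed": True}}
by_drift = dirty_on_snapshot_drift(candidates, drift_map)
by_bump = dirty_on_ruleset_bump(candidates, {"s2"}, "gov-rs-bbbbbbbbbbbb")
```

In this run `c2` is dirtied by drift, `c3` by the bump, and `c1` is never touched, which is the Δother = 0 property in miniature.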

38.7 Audit, retention, failure modes

Audit trail

Snapshot capture → registry_changelog row (entity_type='governance_snapshot', entity_code=evolution_snapshots.id, action='capture', alert_level='info', changed_by=worker_name); the evolution_snapshots row itself is the durable artifact.
Ruleset activation → registry_changelog (entity_type='governance_ruleset', entity_code=ruleset_version, action='activate', changed_by=activated_by) + governance_ruleset.approval_ref.
No third audit channel. governance_audit_log (relation-scoped, 1 stale row) stays untouched (consistent with SB-7 decision, doc 25).

Retention

Snapshots: retain every snapshot referenced by any live candidate-state row (soft-FK on source_snapshot_ref). Beyond that, retain at least the last full-audit snapshot per scope + the last N (config, default 12) full-audit cycles for drift history. Prune older unreferenced snapshots only via a logged summary (snapshot_pruned count); never silent (constitution "no silent caps").
Rulesets: retained forever (immutable version chain via supersedes); status flips active→superseded, rows never deleted. This is the reproducibility ledger: "verdict was computed under gov-rs-abc123" must always resolve.
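
The snapshot retention rule can be sketched as a selection function that returns what may be pruned plus the summary that must be logged. Several points are assumptions of this sketch: the `full_audit` flag on each snapshot, the restriction to scopes starting with `governance.` (so the pre-existing global row is never considered), and the reading that unreferenced snapshots outside the last N full audits are prunable.

```python
def select_prunable(snapshots, referenced_ids, keep_last_n=12):
    """Return (prunable snapshot ids, summary to log). Never prunes silently."""
    governed = [s for s in snapshots if s["scope"].startswith("governance.")]
    keep = set(referenced_ids)  # referenced by live candidate-state rows
    by_scope = {}
    for s in governed:
        if s["full_audit"]:
            by_scope.setdefault(s["scope"], []).append(s)
    for rows in by_scope.values():
        rows.sort(key=lambda s: s["snapshot_at"], reverse=True)
        keep.update(s["id"] for s in rows[:keep_last_n])  # last N full audits
    prunable = sorted(s["id"] for s in governed if s["id"] not in keep)
    return prunable, {"snapshot_pruned": len(prunable)}


snapshots = [
    {"id": 1, "scope": "governance.audit", "snapshot_at": "2026-01-01", "full_audit": True},
    {"id": 2, "scope": "governance.audit", "snapshot_at": "2026-02-01", "full_audit": True},
    {"id": 3, "scope": "governance.audit", "snapshot_at": "2026-03-01", "full_audit": True},
    {"id": 4, "scope": "governance.audit", "snapshot_at": "2026-04-01", "full_audit": True},
    {"id": 5, "scope": "governance.audit", "snapshot_at": "2026-05-01", "full_audit": True},
    {"id": 6, "scope": "global", "snapshot_at": "2026-04-04", "full_audit": False},
]
prunable, summary = select_prunable(snapshots, referenced_ids={1}, keep_last_n=2)
```

With `keep_last_n=2`, snapshots 4 and 5 survive as recent audits, snapshot 1 survives because a live candidate references it, and only 2 and 3 are offered for pruning together with a count for the log.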

Failure modes

Mode	Behavior (fail-closed)
Snapshot capture fails mid-run	Cursor (SB-13) does not advance; partial snapshot row not referenced; retry from watermark; raise `snapshot_capture_failed` (high).
Non-deterministic hash (ordering / clock)	Mandatory canonical `ORDER BY` + content-only hashing (no timestamps in hash). If two captures of an unchanged group differ → raise `snapshot_nondeterministic` (high); freeze ruleset bumps until resolved.
Missing source registry for a group	Cannot fingerprint → raise `backfill_inventory_gap` (doc 31), fail closed for that group; group stays `unknown` (high-risk → G-PROD blocks). Never invents a fingerprint.
Dangling `source_snapshot_ref` (candidate references pruned snapshot)	Treat candidate as `stale/unknown` (fail-closed); schedule re-scan; raise `snapshot_ref_dangling` (medium). Retention rule above prevents this for live candidates.
Ruleset unowned (C-7 unresolved)	Rulesets computable as `draft`; verdicts under draft ruleset = `unknown` for high-risk; raise `ruleset_unowned`.
Component absent then later present (Axis Registry born)	Deterministic ruleset bump (absent-marker → present) dirties affected groups; no hidden default.
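
Two of the fail-closed rows above (dangling snapshot ref and draft/unowned ruleset) can be sketched as a verdict resolver. The function name, the dict-based row shapes, and the `high_risk` flag are hypothetical; the sketch treats a draft ruleset with no `activated_by` as unowned, which is one reading of the table.

```python
def resolve_verdict(candidate, snapshot_ids, rulesets, high_risk):
    """Return (effective verdict, findings) under the fail-closed rules."""
    findings = []
    verdict = candidate["verdict"]
    if candidate["source_snapshot_ref"] not in snapshot_ids:
        # Referenced snapshot was pruned: never report "still clean".
        findings.append("snapshot_ref_dangling")
        verdict = "unknown"
    # Rulesets are retained forever, so a missing key is an invariant
    # violation and is allowed to raise KeyError.
    ruleset = rulesets[candidate["ruleset_version"]]
    if ruleset["status"] == "draft":
        if not ruleset.get("activated_by"):
            findings.append("ruleset_unowned")
        if high_risk:
            verdict = "unknown"
    return verdict, findings


rulesets = {
    "gov-rs-aaaaaaaaaaaa": {"status": "active", "activated_by": "GOV-COUNCIL"},
    "gov-rs-bbbbbbbbbbbb": {"status": "draft", "activated_by": None},
}
clean = {"verdict": "not_relevant", "source_snapshot_ref": 512,
         "ruleset_version": "gov-rs-aaaaaaaaaaaa"}
dangling = dict(clean, source_snapshot_ref=7)
drafted = dict(clean, ruleset_version="gov-rs-bbbbbbbbbbbb")

ok = resolve_verdict(clean, {512}, rulesets, high_risk=True)
lost = resolve_verdict(dangling, {512}, rulesets, high_risk=False)
held = resolve_verdict(drafted, {512}, rulesets, high_risk=True)
```

The stored verdict passes through unchanged only when both references resolve to a live snapshot and an activated ruleset; every other path yields `unknown`, a finding, or both.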

38.8 Relation to backfill (doc 31) and dirty-group candidate scan (doc 34)

Backfill (doc 31 §5): the sweep captures the seed snapshot at phase='seeding' start and computes the initial ruleset_version; every seeded candidate row carries (source_snapshot_ref, ruleset_version). SB-12 is the registry those refs point at.
Dirty-group (doc 34 §2/§4): the candidate-state store keys verdicts on the SB-12 triple; invalidation triggers (§38.6) consume the snapshot delta_previous and the ruleset_version diff to decide which groups go dirty. The periodic full audit (doc 34 §5) re-captures a snapshot and re-hashes the ruleset — the safety net that catches any lost dirty signal.
Handoff (doc 32 §3 #8/#10): policy_changed/ruleset_changed handoff signals are exactly the triggers that cause a ruleset_version recompute; SB-12 turns them into a scoped invalidation rather than a blanket rescan.

38.9 No-hardcode / no-island attestation

No-hardcode: snapshot scopes, group dimensions, and ruleset components are all discovered from registries (evolution_snapshots.scope is data; group dims from meta_catalog/birth_registry/pivot_definitions/law_jurisdiction/SB-2 scopes; ruleset hash over measurement_registry rows). No axis array, no object-class list, no rule literal in code. Absent components hash as explicit absent-markers, not silent defaults.
No-island: snapshots reuse the one evolution_snapshots ledger; audit reuses the one registry_changelog; rules reuse the one measurement_registry. The only additive object is the tiny governance_ruleset registry (Option B) — a config-version registry, not a parallel governance roof. No second snapshot store, no second rule store, no second audit log.

38.10 Acceptance tests (must pass at build; cannot run now — design-only)

1. Reproducibility: same enabled measurement_registry subset + same component versions ⇒ identical ruleset_version hash across repeated computation (determinism). A recorded verdict resolves to a human-readable statement: "object X was not_relevant under ruleset gov-rs-abc123, snapshot id=512 (captured 2026-06-10)."
2. Targeted snapshot invalidation: mutate one group's inventory (add rows) ⇒ next snapshot's delta_previous flags only that group; only that group's candidate rows go dirty; Δother_groups = 0.
3. Targeted ruleset invalidation: enable one measurement_registry row scoped to scope S ⇒ ruleset_version bumps; only groups in S are dirtied; auto-close keyed by (coalesce_key, ruleset_version) does not mask the re-open.
4. No law version-bump: across a ruleset bump, normative_registry, law_catalog, governance_docs row-sets and versions are byte-identical (the bump is purely operational).
5. Scale: per-group fingerprint over 1,037,724 born rows / 78 registries completes within the batch budget (each GROUP BY batch < 5 s read timeout).
6. Fail-closed: a missing source registry / non-deterministic hash / dangling snapshot ref yields unknown/finding and blocks G-PROD for high-risk objects, never a silent "still clean."
7. Audit completeness: every snapshot capture and ruleset activation has a matching registry_changelog row; pruning emits a summary count.

38.11 Dependencies, gates, and verdict

Designable now: YES (done, this doc). Build now: NO.
Build gates: (a) evolution_snapshots reuse needs no approval but writing governance rows is part of the GCOS build (gated with SB-10/SB-13); (b) governance_ruleset table creation = DDL = gated (operator + reversible-by-default, law §5); (c) ruleset ownership + activation policy = C-7 (council); (d) ruleset hash components that depend on SB-2 (governance_responsibility_scope) and SB-3/Axis Registry degrade gracefully until those land (absent-markers), so SB-12 is not hard-blocked by SB-2/SB-3 for design or for a born-only initial ruleset.
No COMMIT (os_proposal_approvals=0 ⇒ COMMIT_FORBIDDEN, H-1/H-2/SB-6).

SB-12 design verdict: COMPLETE — GO for build-prep, BUILD NO-GO. Source-snapshot = reuse evolution_snapshots (zero schema change). Ruleset-version = new tiny governance_ruleset registry (Option B, recommended) or reuse evolution_snapshots scope='governance.ruleset' (Option A) — C-7 owns the choice + ownership. Reproducible verdicts and targeted (never blanket) invalidation are achievable on the live substrate with minimal additive footprint and zero law-version coupling.

(Cross-refs: doc 31 §5, doc 34 §2/§4/§9, doc 35 §3.2 patch #8 + Branch F controls #5/#8/#10, doc 39 SB-13, doc 40 SB-10, doc 41 SB-11, doc 42 integration.)