KB-6718

39 — SB-13 Governance Worker-Cursor Family — Detailed Technical Design (GCOS, design-only, read-only zero mutation, 2026-06-01)

Revision 1

Package: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Track: GCOS substrate. Blocker SB-13.
Status: Detailed technical design ONLY. BUILD NO-GO. No DDL/DML, no worker start, no cron enable, no table creation. KB document only.
Reads / controls: doc 00 (controlling) → concept canon → Round-4 law → prompt-muc-tieu-mo-for-claude-code.md.
Builds on: docs 31 (§4 cursor), 32 (§5 intake worker), 34 (§5 scan modes), 35 (Branch F controls). Đ45 Queue Law (enacted 2026-05-26) binds.
Date: 2026-06-01 · Mutation footprint: KB document only. Zero PG/Directus/Qdrant/Nuxt mutation.


39.0 §0-GOV — governed objects

| governed_object | class | grain | purpose |
|---|---|---|---|
| gov_worker_cursor | Class-2 process record | one row per (worker, source) | resumable keyset watermark + counters |
| gov_worker_heartbeat | Class-2 process record (reuses queue_heartbeat) | one row per worker | liveness / lease / silent-gap (Đ45) |

Issue/event types (register-before-emit, NOT registered; ride governance domain, SB-11): cursor_lag, cursor_dlq, cursor_silent_gap, cursor_lease_conflict, cursor_watermark_regression. (These align with doc 32 §8 handoff_lag/handoff_dlq/handoff_silent_gap and doc 34 §10 candidate_scan_lag.)


39.1 Problem statement

The GCOS workers (backfill sweep, handoff intake, input gate, candidate scan, periodic full audit) must be resumable (crash/restart loses nothing, double-commits nothing), bounded (5 s read timeout, 500-row LIMIT, 1.04M rows), observable (lag, DLQ, liveness), and safe under concurrency (one owner per scope). SB-13 is the cursor/checkpoint substrate that provides keyset watermarks, retry/DLQ counters, a lease, and a heartbeat — reusing the live iu_route_worker_cursor + queue_heartbeat + event_pending substrate, not minting a new worker framework.


39.2 Live PG validation (read-only, re-verified 2026-06-01)

| Object | Live finding | Implication for SB-13 |
|---|---|---|
| iu_route_worker_cursor | 1 row (worker_name='iu_outbound_default', event_domain='iu', last_event_id=8c04f5f7-…(uuid), events_seen=68, attempts_written=67, dead_lettered=0, metadata={scope:'event_domain=iu only'}). Cols: worker_name text NN, event_domain text NN default 'iu', last_created_at tstz, last_event_id uuid, last_run_at, last_run_summary jsonb NN, events_seen/attempts_written/dead_lettered bigint NN, metadata jsonb NN, created_at/updated_at. | The cursor shape is the right anchor, BUT last_event_id is uuid. It tails event_outbox (uuid PK). It cannot carry a birth_registry/registry_changelog watermark (those ids are integer). → TYPE-GENERALIZATION REQUIRED (§39.4). The cited "reuse 1:1" (doc 31 §4) is incorrect on the watermark type; live wins. |
| birth_registry | id integer, NOT NULL. born_at tstz nullable but 0 nulls in 1,037,724 rows. canonical_address NULL in ALL 1,037,724 rows. entity_code/collection_name NOT NULL, 0 nulls. | Keyset watermark over birth = (born_at, id) with id (int, monotone, NN) as the stable tie-breaker / primary. canonical_address is unusable as a key (universally null); idempotency must use (collection_name, entity_code) (see SB-10 doc 40, doc 31 §6 correction). |
| registry_changelog.id / .timestamp | id integer NN; timestamp is timestamp WITHOUT time zone NN. | Second tail source; watermark (timestamp, id). Again integer id, non-tz time → reinforces type-generalized watermark. |
| event_outbox.id | uuid. | Third tail source (lifecycle/handoff emits), uuid watermark. The family spans int and uuid watermarks → text-encoded watermark is the only uniform fit. |
| queue_heartbeat | 3 rows (cut_pipeline_operator/external_worker, dieu45_phase3_pilot/external_worker, iu_outbound_default/PG_worker, last_tick_status ∈ {ok, warn}). Cols: executor_name text NN, executor_kind text NN, last_tick_at tstz NN, last_tick_status text NN 'ok', ticks_total bigint NN, current_job_id uuid, lease_owner text, metadata jsonb NN, created_at/updated_at. | Live heartbeat + lease substrate (Đ45 silent-gap). lease_owner + current_job_id give the lease model for free; dieu45_phase3_pilot proves Đ45 is in active pilot. SB-13 reuses queue_heartbeat for liveness + lease; no new heartbeat table. |
| job_queue | 13 rows (cut.*, states queued/succeeded). Cols incl. lease_owner text, lease_until tstz, attempts int NN, max_attempts int NN, idempotency_key text NN, run_id uuid, last_error, priority, scheduled_at. | The live lease/retry/idempotency idiom to mirror (lease_owner+lease_until; attempts/max_attempts; idempotency_key). Boundary (Đ45 event≠job): job_queue is for JOBS; GCOS handoff signals do NOT go here. SB-13 borrows its patterns, not its rows. |
| event_pending | 0 rows, structure present. Cols: event_domain, event_type_hint, entity_table, entity_ref, canonical_address NN, actor_ref NN, capture_payload jsonb, processed_at, error_count int NN 0, last_error text. | Live per-item retry/staging substrate, unused → available. SB-13 reuses it for per-item retry + DLQ staging. |

PK note: information_schema PK introspection returned empty for these tables under the read-only role (privilege artifact — consistent with the known information_schema.triggers emptiness). Intended unique keys are stated explicitly below, to be confirmed at build by an operator with full privileges.


39.3 Reuse / Extend / New decision

| Component | Decision | Rationale |
|---|---|---|
| Cursor table | NEW gov_worker_cursor, modeled 1:1 on iu_route_worker_cursor EXCEPT a type-generalized watermark | Reusing the live iu_route_worker_cursor table directly is blocked by the uuid watermark type (can't tail int-keyed birth/changelog) and would pollute the IU cursor with governance rows. A new table with the same columns + a text watermark is reuse-of-shape without migration risk to the live IU worker. (See §39.4 for why not just ALTER the live table.) |
| Lease / lock | REUSE queue_heartbeat (lease_owner, current_job_id, last_tick_at) | Live lease+heartbeat substrate already exists and is in use; minting a lock table = second roof. |
| Heartbeat / silent-gap | REUSE queue_heartbeat | Đ45 silent-gap heartbeat is exactly what queue_heartbeat.last_tick_at/last_tick_status provides. |
| Per-item retry / DLQ staging | REUSE event_pending (error_count, last_error, processed_at) + gov_worker_cursor.dead_lettered counter | Unused live table with the exact retry shape. |
| Job execution | DO NOT REUSE job_queue for signals (Đ45 event≠job); borrow its lease/idempotency idiom only | Governance remediation is an APR job (T6/T7), not a cursor concern. |

Net: SB-13 = 1 new table (gov_worker_cursor) + reuse of queue_heartbeat (lease/heartbeat) + event_pending (retry). Could be reduced to 0 new tables only if the council accepts ALTER-ing iu_route_worker_cursor to generalize the watermark — not recommended (touches a live production worker; reversibility/risk). The new-but-reuse-shaped table is the safer reuse-first choice.


39.4 Worker identity, cursor key, and the type-generalized watermark (the core fix)

gov_worker_cursor (NEW, additive)

gov_worker_cursor
  worker_name        text  NN     -- 'gov_backfill_sweep' | 'gov_handoff_intake' | 'gov_input_gate'
                                   --   | 'gov_candidate_scan' | 'gov_periodic_full_audit'
  source_name        text  NN     -- 'birth_registry' | 'registry_changelog' | 'event_outbox' | ...
  event_domain       text  NN     -- 'governance' (default)
  -- TYPE-GENERALIZED KEYSET WATERMARK (the SB-13 correction) ----------------
  last_watermark_ts  timestamptz  -- = source's order timestamp (born_at | timestamp | occurred_at)
  last_watermark_id  text         -- = source PK rendered as text (int→text, uuid→text); uniform
  -- ------------------------------------------------------------------------
  last_run_at        timestamptz
  last_run_summary   jsonb NN default '{}'
  events_seen        bigint NN default 0
  attempts_written   bigint NN default 0
  dead_lettered      bigint NN default 0
  phase              text         -- 'seeding' | 'reconciling' | 'incremental'  (doc 31 §4)
  metadata           jsonb NN default '{}'  -- { ruleset_version, source_snapshot_ref, batch_size, coalesce_window, sources[] }
  created_at         timestamptz NN default now()
  updated_at         timestamptz NN default now()
  -- intended UNIQUE: (worker_name, source_name)

Why last_watermark_id is text instead of the live uuid: the governance family must tail three sources whose PKs span two types: birth_registry.id (int), registry_changelog.id (int), event_outbox.id (uuid). A single column can hold all of them only if it is text. The text form is for storage only. Comparing integers as text would misorder them ('10' sorts before '9') unless they were zero-padded, so the keyset predicate casts the stored watermark back to the source's native type and compares against the typed column directly (:last_id_int in the birth_registry query below). In the (ts, id) tuple, ts is the primary order key and id only breaks ties at identical timestamps. This keeps one cursor family for all sources without per-source columns (no-hardcode).
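The split between text storage and typed comparison can be illustrated with a minimal Python sketch. The names SOURCE_ID_TYPES, encode_watermark_id and decode_watermark_id are hypothetical and not part of the design; the source-to-type mapping mirrors the live findings in §39.2, and in the design it would be held as data in gov_worker_cursor.metadata rather than in code.

```python
import uuid

# Hypothetical registry; in the design this lives in cursor metadata, not code.
SOURCE_ID_TYPES = {
    "birth_registry": "int",
    "registry_changelog": "int",
    "event_outbox": "uuid",
}

def encode_watermark_id(source_id):
    """Render a source PK as text for storage in last_watermark_id."""
    return str(source_id)

def decode_watermark_id(source_name, stored):
    """Cast the stored text back to the source's native type for the predicate."""
    id_type = SOURCE_ID_TYPES[source_name]
    if id_type == "int":
        return int(stored)
    if id_type == "uuid":
        return uuid.UUID(stored)
    raise ValueError(f"unregistered id type: {id_type}")

# Text ordering misorders integers ('10' sorts before '9'), hence the typed predicate.
text_order_is_wrong = encode_watermark_id(10) < encode_watermark_id(9)
native_order_is_right = (
    decode_watermark_id("birth_registry", "10") > decode_watermark_id("birth_registry", "9")
)
```

Both flags evaluate to True: the text form is safe to store but not safe to compare for integer sources.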

Keyset (seek) pagination — never OFFSET

-- birth_registry (int id, born_at fully populated but column nullable → guard)
SELECT id, born_at, collection_name, entity_code
FROM birth_registry
WHERE born_at IS NOT NULL
  AND (born_at, id) > (:last_ts, :last_id_int)
ORDER BY born_at, id
LIMIT :batch_size;     -- 2k–5k (Branch F #1)
  • Keyset for birth = (born_at, id): born_at is the leading order key, and id (int, NN, monotone) is the strict tie-breaker that guarantees uniqueness and NOT-NULL safety. The born_at column is nominally nullable, so the worker filters born_at IS NOT NULL. For registry_changelog: (timestamp, id). For event_outbox: (occurred_at, id::text).
  • Guarantees every row visited exactly once, lossless even as new rows arrive (append-only sources). Cursor advances only after the batch is durably processed (candidate-state writes committed).
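The seek loop above can be exercised end to end with a small runnable sketch. It swaps in SQLite (Python's bundled sqlite3) for PostgreSQL purely so the example is self-contained; the table contents, the string timestamps and the keyset_batches helper are illustrative assumptions, not live data or the designed worker.

```python
import sqlite3

def keyset_batches(conn, batch_size):
    """Yield (id, born_at) batches in (born_at, id) order, never using OFFSET."""
    last_ts, last_id = "", 0  # start sentinel: sorts before every row in this toy data
    while True:
        rows = conn.execute(
            "SELECT id, born_at FROM birth_registry "
            "WHERE born_at IS NOT NULL AND (born_at, id) > (?, ?) "
            "ORDER BY born_at, id LIMIT ?",
            (last_ts, last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id, last_ts = rows[-1]  # runs when the caller asks for the next batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birth_registry (id INTEGER PRIMARY KEY, born_at TEXT)")
conn.executemany(
    "INSERT INTO birth_registry (id, born_at) VALUES (?, ?)",
    [
        (1, "2026-01-01T00:00:00Z"),
        (2, "2026-01-01T00:00:00Z"),  # tie on born_at: id breaks it
        (3, "2026-01-01T00:00:00Z"),
        (5, "2026-01-02T00:00:00Z"),
        (4, "2026-01-03T00:00:00Z"),  # id order differs from born_at order
        (6, None),                    # excluded by the IS NOT NULL guard
    ],
)

visited = [row_id for batch in keyset_batches(conn, batch_size=2) for row_id, _ in batch]
```

With a batch size of 2 the three rows sharing one born_at are split across two batches, and the id tie-breaker is what keeps row 3 from being skipped or revisited.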

Cursor key (idempotency at the item grain)

Per processed item, the durable effect (a candidate-state upsert, SB-10) is keyed by (candidate_key, ruleset_version) where candidate_key = COALESCE(canonical_address, collection_name || ':' || entity_code). Because canonical_address is NULL for all 1.04M birth rows (live), the effective key is collection_name || ':' || entity_code. Re-processing the same row is a no-op upsert → resume is safe.
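A minimal sketch of why replay is a no-op, assuming an in-memory dict stands in for the SB-10 candidate-state store. The function names and the sample row values are hypothetical; only the COALESCE rule and the (candidate_key, ruleset_version) key come from the text above.

```python
def candidate_key(canonical_address, collection_name, entity_code):
    """COALESCE(canonical_address, collection_name || ':' || entity_code)."""
    if canonical_address is not None:
        return canonical_address
    return f"{collection_name}:{entity_code}"

def upsert_candidate(store, row, ruleset_version):
    """Idempotent upsert keyed by (candidate_key, ruleset_version)."""
    key = (
        candidate_key(row.get("canonical_address"), row["collection_name"], row["entity_code"]),
        ruleset_version,
    )
    store[key] = {"collection_name": row["collection_name"], "entity_code": row["entity_code"]}
    return key

store = {}
row = {"canonical_address": None, "collection_name": "tasks", "entity_code": "T-001"}  # made-up values
first = upsert_candidate(store, row, ruleset_version="v1")
second = upsert_candidate(store, row, ruleset_version="v1")  # replay after a crash
```

The second call lands on the same key, so the store still holds exactly one row for the item.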


39.5 The five workers (one cursor family, no bespoke framework)

| worker_name | source(s) tailed | watermark | dirties / writes (gated) | doc |
|---|---|---|---|---|
| gov_backfill_sweep | birth_registry | (born_at, id) | seeds candidate-state rows; phase seeding→reconciling→incremental | 31 |
| gov_handoff_intake | birth_registry + registry_changelog (+ event_outbox for lifecycle) | per-source (ts, id) | dirty-marks on candidate-state; capture to event_pending if event type unregistered | 32 |
| gov_input_gate | candidate-state rows pending input_quality_state | candidate scan_time/seq | writes input_quality_state verdict | 33 |
| gov_candidate_scan | candidate-state dirty=true + stale_after-expired set | candidate seq | writes candidate_verdict, clears dirty | 34 |
| gov_periodic_full_audit | full inventory (bounded reconciliation) | full-sweep checkpoint | re-captures snapshot (SB-12), re-validates every candidate past TTL | 34 §5 |

Each is a GOV-SIV Tier-A read/propose worker (doc 35 §3.4 DOTs); the only mutating member is T6's dot_governance_assignment_apply (GOV-DOT, NO-GO). Each is paired with a test DOT (Đ35 A/B).


39.6 Lock / lease, pause/resume, checkpoint payload

  • Lease (reuse queue_heartbeat): before processing, a worker claims queue_heartbeat(executor_name=worker_name, executor_kind='PG_worker'|'external_worker', lease_owner=<instance-id>). One owner per (worker_name, source) — a second instance seeing a live lease_owner with a fresh last_tick_at backs off (cursor_lease_conflict if it persists). Lease expiry = last_tick_at older than a TTL (mirrors job_queue.lease_until idiom). This gives single-writer-per-scope without a new lock table.
  • Heartbeat (Đ45 silent-gap): worker ticks queue_heartbeat.last_tick_at/ticks_total each batch; emits governance.<worker>.heartbeat (SB-11). A missed heartbeat beyond threshold = cursor_silent_gap (high→critical) — "quiet vs dead" distinguishable.
  • Pause/resume: pausing = stop ticking/processing; the committed (last_watermark_ts, last_watermark_id) is the durable resume point. Resume re-claims the lease and re-seeks from the watermark. No state lost, none double-committed (idempotent upserts §39.4).
  • Checkpoint payload: last_run_summary jsonb = {batch_count, rows_seen, dirty_marked, dlq, snapshot_ref, ruleset_version, started_at, ended_at}. metadata carries config (batch_size, coalesce_window, sources). The periodic-full worker additionally checkpoints its sweep position so a multi-hour audit resumes mid-sweep.
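The lease rule in the first bullet can be sketched as a single decision function. This is a single-threaded illustration under stated assumptions: HeartbeatRow models only a subset of queue_heartbeat columns, try_claim and the 60-second TTL are hypothetical, and a real implementation would need the check and the write to happen atomically in one conditional UPDATE.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

@dataclass
class HeartbeatRow:
    """Subset of queue_heartbeat columns used by the lease."""
    executor_name: str
    lease_owner: Optional[str] = None
    last_tick_at: Optional[datetime] = None
    ticks_total: int = 0

def try_claim(row, instance_id, now, lease_ttl):
    """Claim or renew the lease; return False if another live owner holds it."""
    held_by_other = row.lease_owner not in (None, instance_id)
    fresh = row.last_tick_at is not None and now - row.last_tick_at < lease_ttl
    if held_by_other and fresh:
        return False  # back off; a persistent conflict would raise cursor_lease_conflict
    row.lease_owner = instance_id
    row.last_tick_at = now
    row.ticks_total += 1
    return True

ttl = timedelta(seconds=60)  # hypothetical TTL
t0 = datetime(2026, 6, 1, tzinfo=timezone.utc)
hb = HeartbeatRow(executor_name="gov_backfill_sweep")
a_claims = try_claim(hb, "instance-a", t0, ttl)
b_blocked = try_claim(hb, "instance-b", t0 + timedelta(seconds=10), ttl)
b_takes_over = try_claim(hb, "instance-b", t0 + timedelta(seconds=120), ttl)
```

Instance B is refused while A's tick is fresh and takes over only once the tick is older than the TTL, which is the "quiet vs dead" distinction expressed as a lease expiry.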

39.7 Retry counters, DLQ, idempotency, coalesce, resource budget, observability

  • Retry (reuse event_pending): a failing item is staged to event_pending (error_count++, last_error); bounded exponential backoff up to N attempts. The cursor does not advance past an unretired low-water item silently.
  • DLQ: after N attempts → gov_worker_cursor.dead_lettered++ + raise cursor_dlq/handoff_dlq (doc 32 §8); sweep continues. DLQ over threshold → *_dlq_overflow (high→critical). Dead-letter, never drop → worst case is delayed + visible, never missing (doc 32 §6 no-lost-handoff).
  • Idempotency: consume-once keyed by (source, watermark); durable effect keyed by (candidate_key, ruleset_version) (upsert). Replay = reset watermark and re-tail; safe because effects are idempotent.
  • Coalesce: within a poll/coalesce window, multiple changes to the same group_key collapse to one dirty-mark (reuse T7 coalesce_key; doc 32 §6). A dirty-storm (one tick dirties > configured fraction of groups) → group_invalidation_storm + throttle (doc 34 §4, Branch F).
  • Resource budget (Branch F): batch 2k–5k (each read < 5 s); one worker per scope; parallelism only over disjoint group_key ranges; off-peak throttle + server-load guard (pause on lock-wait/replication-lag breach); no long-held read txns; no UI full-table scan (UI reads summary views only, Đ28).
  • Observability: gov_worker_cursor counters (events_seen, attempts_written, dead_lettered) + queue_heartbeat freshness + last_run_summary. Metrics surfaced: cursor lag (now − last_watermark_ts), dirty-queue depth, stale count, DLQ depth, heartbeat freshness, batches done/total. No silent caps — sampling/top-N/DLQ-truncation each emit a summary finding.
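The retry and DLQ bullets can be sketched together as follows. The dict named pending stands in for event_pending rows and counters for the gov_worker_cursor counters; the function names, the backoff base and cap, and the poison item are assumptions made for illustration. The sketch computes the backoff delay but does not sleep.

```python
def backoff_seconds(attempt, base=1.0, cap=60.0):
    """Bounded exponential backoff: base * 2^(attempt-1), capped."""
    return min(cap, base * 2 ** (attempt - 1))

def process_with_retry(item, handler, max_attempts, pending, counters):
    """Try an item up to max_attempts; dead-letter it instead of dropping it."""
    counters["events_seen"] += 1
    for attempt in range(1, max_attempts + 1):
        try:
            handler(item)
            pending.pop(item, None)  # retired: clear any staged retry state
            return "ok"
        except Exception as exc:
            staged = pending.setdefault(item, {"error_count": 0, "last_error": None})
            staged["error_count"] += 1
            staged["last_error"] = str(exc)
            _delay = backoff_seconds(attempt)  # a real worker would wait this long
    counters["dead_lettered"] += 1  # visible on the cursor; the item stays staged
    return "dead_lettered"

def handler(item):
    if item == "poison":
        raise ValueError("cannot parse")

pending, counters = {}, {"events_seen": 0, "dead_lettered": 0}
results = [process_with_retry(i, handler, 3, pending, counters) for i in ["a", "poison", "b"]]
```

The sweep continues past the poison item, and the item remains in pending with its error history, so the worst case is delayed and visible rather than missing.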

39.8 Failure modes

| Mode | Behavior |
|---|---|
| Crash mid-batch | Restart re-claims lease, re-seeks from committed watermark; uncommitted batch re-processed idempotently. |
| Two instances race | Lease conflict detected via queue_heartbeat.lease_owner + fresh last_tick_at; loser backs off; cursor_lease_conflict if stuck. |
| Watermark regression (clock skew / bad write) | Reject watermark < current committed (monotone guard); raise cursor_watermark_regression (high); hold. |
| born_at null row (column nullable) | Filtered by WHERE born_at IS NOT NULL; such rows (none today) routed to an id-only reconciliation pass so they are not silently skipped. |
| Source PK type surprise | last_watermark_id text + per-source typed predicate handles int and uuid uniformly; a new source registers its (ts-col, id-col, id-type) in metadata (no-hardcode). |
| Lost dirty signal | Periodic full audit (TTL net) re-validates within TTL, the guaranteed catch-all. |
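The monotone guard for the watermark-regression mode could look roughly like this. WatermarkRegression and advance_watermark are hypothetical names; the exception stands in for raising the cursor_watermark_regression finding, and watermarks are assumed to be (ts, id) tuples already decoded to natively comparable types.

```python
class WatermarkRegression(Exception):
    """Stands in for raising the cursor_watermark_regression finding."""

def advance_watermark(current, proposed):
    """Return the new committed watermark, rejecting any backwards move."""
    if current is not None and proposed < current:
        raise WatermarkRegression(f"{proposed} < committed {current}")
    return proposed

committed = advance_watermark(None, ("2026-01-01T00:00:00Z", 10))
committed = advance_watermark(committed, ("2026-01-01T00:00:00Z", 11))
try:
    advance_watermark(committed, ("2026-01-01T00:00:00Z", 9))
    regression_rejected = False
except WatermarkRegression:
    regression_rejected = True  # the worker would hold here rather than overwrite
```

Re-committing an equal watermark is allowed in this sketch, since a replayed batch legitimately ends on the same position.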

39.9 No-hardcode / no-island attestation

  • No-hardcode: worker names, source registries, watermark columns, batch sizes, coalesce windows are data (gov_worker_cursor.source_name/metadata), not code arrays. A new tail source = a new cursor row, not a code change. No object-class or axis list anywhere.
  • No-island: lease + heartbeat reuse the one queue_heartbeat; retry reuses the one event_pending; the cursor family mirrors the one live iu_route_worker_cursor shape. The only additive object is gov_worker_cursor (a governance-scoped twin, not a parallel worker framework). Jobs stay in job_queue (Đ45 boundary); signals stay in event_outbox (SB-11). No second queue, no second lock manager, no second heartbeat.
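As an illustration of "a new tail source = a new cursor row", the keyset query can be derived from a source descriptor held as data. The descriptor shape, build_keyset_sql and the identifier whitelist are assumptions for this sketch; the function only builds a query string with named placeholders and does not execute anything.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")
_CASTS = {"int": "integer", "uuid": "uuid"}

def build_keyset_sql(source):
    """Build a keyset SELECT from a descriptor with table, ts_col, id_col, id_type."""
    for field in ("table", "ts_col", "id_col"):
        if not _IDENT.match(source[field]):
            raise ValueError(f"unsafe identifier in {field}: {source[field]!r}")
    cast = _CASTS[source["id_type"]]
    t, ts, pk = source["table"], source["ts_col"], source["id_col"]
    return (
        f'SELECT "{pk}", "{ts}" FROM "{t}" '
        f'WHERE "{ts}" IS NOT NULL '
        f'AND ("{ts}", "{pk}") > (:last_ts, CAST(:last_id AS {cast})) '
        f'ORDER BY "{ts}", "{pk}" LIMIT :batch_size'
    )

birth_sql = build_keyset_sql(
    {"table": "birth_registry", "ts_col": "born_at", "id_col": "id", "id_type": "int"}
)
outbox_sql = build_keyset_sql(
    {"table": "event_outbox", "ts_col": "occurred_at", "id_col": "id", "id_type": "uuid"}
)
```

Identifiers cannot be bound as query parameters, which is why the sketch validates them against a strict pattern before interpolating; the watermark values themselves stay as placeholders.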

39.10 Acceptance tests (build-time; cannot run now)

  1. Resumability: kill a worker mid-sweep → restart processes exactly the unprocessed remainder; zero gap, zero double-commit (verified by candidate-state row count + idempotency-key uniqueness).
  2. Type generality: one gov_worker_cursor family successfully tails birth_registry (int id), registry_changelog (int id, non-tz ts), and event_outbox (uuid id) with correct keyset ordering each.
  3. Lease safety: two concurrent instances of the same worker → exactly one processes; the other backs off; no double-write.
  4. DLQ-not-drop: inject a poison item → after N retries it is dead-lettered + a finding raised; the sweep completes; the item is visible, not lost.
  5. Heartbeat: stop a worker → cursor_silent_gap fires within threshold; restart clears it.
  6. Scale: seed 1,037,724 rows in 2k–5k batches, each read < 5 s; no OFFSET; cursor advances monotonically.
  7. No-UI-scan: UI/ops reads only summary views; the raw cursor/candidate sweep is never exposed to a UI full-table query.
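Test 1 (resumability) can be prototyped before any worker exists with a small simulation. This sketch uses a bare integer id as the watermark and a dict as the candidate-state store to stay short; run_sweep and its crash_after_batches switch are hypothetical test scaffolding, not the designed worker.

```python
def run_sweep(source_rows, state, batch_size, crash_after_batches=None):
    """Process rows after state['watermark']; optionally crash before a cursor commit."""
    batches_done = 0
    while True:
        batch = [r for r in source_rows if r > state["watermark"]][:batch_size]
        if not batch:
            return "done"
        for row_id in batch:
            state["effects"][row_id] = "candidate"  # durable, idempotent effect
        if crash_after_batches is not None and batches_done == crash_after_batches:
            return "crashed"  # effects written, cursor NOT advanced
        state["watermark"] = batch[-1]  # commit the cursor last
        batches_done += 1

rows = list(range(1, 11))
state = {"watermark": 0, "effects": {}}
first_run = run_sweep(rows, state, batch_size=3, crash_after_batches=1)
watermark_at_crash = state["watermark"]
second_run = run_sweep(rows, state, batch_size=3)
```

The crash lands after rows 4 to 6 are written but before the cursor moves, so the restart re-processes that batch; because the effect is keyed by row, the final store still has exactly one entry per source row.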

39.11 Dependencies, gates, verdict

  • Designable now: YES (done). Build now: NO.
  • Build gates: gov_worker_cursor DDL = gated (operator, reversible-by-default); writing queue_heartbeat/event_pending governance rows = part of GCOS build (gated with SB-10/SB-11); running any worker requires the governance event domain active (SB-11) and the candidate-state store (SB-10). C-7 owns the observer-trigger question (doc 32 §4 Option B) — not a cursor concern (default Option A cursor-tail needs no trigger).
  • No COMMIT (os_proposal_approvals=0).

SB-13 design verdict: COMPLETE — GO for build-prep, BUILD NO-GO. Decision = NEW gov_worker_cursor (reuse-shaped, type-generalized watermark) + REUSE queue_heartbeat (lease+heartbeat) + REUSE event_pending (retry/DLQ). The single material correction to the prior GCOS docs: the live iu_route_worker_cursor.last_event_id is uuid, incompatible with the int-keyed birth_registry/registry_changelog; the watermark must be type-generalized (text + typed predicate), and the idempotency key must be (collection_name, entity_code) because canonical_address is universally NULL in birth_registry.

(Cross-refs: doc 31 §4/§6, doc 32 §5/§6, doc 34 §5, doc 35 §3.4 + Branch F #1/#2/#9/#11/#12, doc 38 SB-12, doc 40 SB-10, doc 41 SB-11.)

knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/39-sb13-governance-worker-cursor-family-detailed-design.md