39 — SB-13 Governance Worker-Cursor Family — Detailed Technical Design (GCOS, design-only, read-only zero mutation, 2026-06-01)
Package: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Track: GCOS substrate. Blocker SB-13.
Status: Detailed technical design ONLY. BUILD NO-GO. No DDL/DML, no worker start, no cron enable, no table creation. KB document only.
Reads / controls: doc 00 (controlling) → concept canon → Round-4 law → prompt-muc-tieu-mo-for-claude-code.md.
Builds on docs 31 (§4 cursor), 32 (§5 intake worker), 34 (§5 scan modes), 35 (Branch F controls). Đ45 Queue Law (enacted 2026-05-26) binds.
Date: 2026-06-01 · Mutation footprint: KB document only. Zero PG/Directus/Qdrant/Nuxt mutation.
39.0 §0-GOV — governed objects
| governed_object | class | grain | purpose |
|---|---|---|---|
| gov_worker_cursor | Class-2 process record | one row per (worker, source) | resumable keyset watermark + counters |
| gov_worker_heartbeat | Class-2 process record (reuses queue_heartbeat) | one row per worker | liveness / lease / silent-gap (Đ45) |
Issue/event types (register-before-emit, NOT registered; ride governance domain, SB-11): cursor_lag, cursor_dlq, cursor_silent_gap, cursor_lease_conflict, cursor_watermark_regression. (These align with doc 32 §8 handoff_lag/handoff_dlq/handoff_silent_gap and doc 34 §10 candidate_scan_lag.)
39.1 Problem statement
The GCOS workers (backfill sweep, handoff intake, input gate, candidate scan, periodic full audit) must be resumable (crash/restart loses nothing, double-commits nothing), bounded (5 s read timeout, 500-row LIMIT, 1.04M rows), observable (lag, DLQ, liveness), and safe under concurrency (one owner per scope). SB-13 is the cursor/checkpoint substrate that provides keyset watermarks, retry/DLQ counters, a lease, and a heartbeat — reusing the live iu_route_worker_cursor + queue_heartbeat + event_pending substrate, not minting a new worker framework.
39.2 Live PG validation (read-only, re-verified 2026-06-01)
| Object | Live finding | Implication for SB-13 |
|---|---|---|
| iu_route_worker_cursor | 1 row (worker_name='iu_outbound_default', event_domain='iu', last_event_id=8c04f5f7-…(uuid), events_seen=68, attempts_written=67, dead_lettered=0, metadata={scope:'event_domain=iu only'}). Cols: worker_name text NN, event_domain text NN default 'iu', last_created_at tstz, last_event_id uuid, last_run_at, last_run_summary jsonb NN, events_seen/attempts_written/dead_lettered bigint NN, metadata jsonb NN, created_at/updated_at. | The cursor shape is the right anchor, BUT last_event_id is uuid. It tails event_outbox (uuid PK). It cannot carry a birth_registry/registry_changelog watermark (those ids are integer). → TYPE-GENERALIZATION REQUIRED (§39.4). The cited "reuse 1:1" (doc 31 §4) is incorrect on the watermark type; live wins. |
| birth_registry.id | integer, NOT NULL. born_at tstz nullable but 0 nulls in 1,037,724 rows. canonical_address NULL in ALL 1,037,724 rows. entity_code/collection_name NOT NULL, 0 nulls. | Keyset watermark over birth = (born_at, id) with id (int, monotone, NN) as the stable tie-breaker / primary. canonical_address is unusable as a key (universally null); idempotency must use (collection_name, entity_code) (see SB-10 doc 40, doc 31 §6 correction). |
| registry_changelog.id / .timestamp | id integer NN; timestamp is timestamp WITHOUT time zone NN. | Second tail source; watermark (timestamp, id). Again integer id, non-tz time → reinforces type-generalized watermark. |
| event_outbox.id | uuid. | Third tail source (lifecycle/handoff emits), uuid watermark. The family spans int and uuid watermarks → text-encoded watermark is the only uniform fit. |
| queue_heartbeat | 3 rows (cut_pipeline_operator/external_worker, dieu45_phase3_pilot/external_worker, iu_outbound_default/PG_worker, last_tick_status ∈ {ok, warn}). Cols: executor_name text NN, executor_kind text NN, last_tick_at tstz NN, last_tick_status text NN 'ok', ticks_total bigint NN, current_job_id uuid, lease_owner text, metadata jsonb NN, created_at/updated_at. | Live heartbeat + lease substrate (Đ45 silent-gap). lease_owner + current_job_id give the lease model for free; dieu45_phase3_pilot proves Đ45 is in active pilot. SB-13 reuses queue_heartbeat for liveness + lease; no new heartbeat table. |
| job_queue | 13 rows (cut.*, states queued/succeeded). Cols incl. lease_owner text, lease_until tstz, attempts int NN, max_attempts int NN, idempotency_key text NN, run_id uuid, last_error, priority, scheduled_at. | The live lease/retry/idempotency idiom to mirror (lease_owner+lease_until; attempts/max_attempts; idempotency_key). Boundary (Đ45 event≠job): job_queue is for JOBS; GCOS handoff signals do NOT go here. SB-13 borrows its patterns, not its rows. |
| event_pending | 0 rows, structure present. Cols: event_domain, event_type_hint, entity_table, entity_ref, canonical_address NN, actor_ref NN, capture_payload jsonb, processed_at, error_count int NN 0, last_error text. | Live per-item retry/staging substrate, unused → available. SB-13 reuses it for per-item retry + DLQ staging. |
PK note: information_schema PK introspection returned empty for these tables under the read-only role (privilege artifact — consistent with the known information_schema.triggers emptiness). Intended unique keys are stated explicitly below, to be confirmed at build by an operator with full privileges.
39.3 Reuse / Extend / New decision
| Component | Decision | Rationale |
|---|---|---|
| Cursor table | NEW gov_worker_cursor, modeled 1:1 on iu_route_worker_cursor EXCEPT a type-generalized watermark | Reusing the live iu_route_worker_cursor table directly is blocked by the uuid watermark type (can't tail int-keyed birth/changelog) and would pollute the IU cursor with governance rows. A new table with the same columns + a text watermark is reuse-of-shape without migration risk to the live IU worker. (See §39.4 for why not just ALTER the live table.) |
| Lease / lock | REUSE queue_heartbeat (lease_owner, current_job_id, last_tick_at) | Live lease+heartbeat substrate already exists and is in use; minting a lock table = second roof. |
| Heartbeat / silent-gap | REUSE queue_heartbeat | Đ45 silent-gap heartbeat is exactly what queue_heartbeat.last_tick_at/last_tick_status provides. |
| Per-item retry / DLQ staging | REUSE event_pending (error_count, last_error, processed_at) + gov_worker_cursor.dead_lettered counter | Unused live table with the exact retry shape. |
| Job execution | DO NOT REUSE job_queue for signals (Đ45 event≠job); borrow its lease/idempotency idiom only | Governance remediation is an APR job (T6/T7), not a cursor concern. |
Net: SB-13 = 1 new table (gov_worker_cursor) + reuse of queue_heartbeat (lease/heartbeat) + event_pending (retry). Could be reduced to 0 new tables only if the council accepts ALTER-ing iu_route_worker_cursor to generalize the watermark — not recommended (touches a live production worker; reversibility/risk). The new-but-reuse-shaped table is the safer reuse-first choice.
39.4 Worker identity, cursor key, and the type-generalized watermark (the core fix)
gov_worker_cursor (NEW, additive)
gov_worker_cursor
worker_name text NN -- 'gov_backfill_sweep' | 'gov_handoff_intake' | 'gov_input_gate'
-- | 'gov_candidate_scan' | 'gov_periodic_full_audit'
source_name text NN -- 'birth_registry' | 'registry_changelog' | 'event_outbox' | ...
event_domain text NN -- 'governance' (default)
-- TYPE-GENERALIZED KEYSET WATERMARK (the SB-13 correction) ----------------
last_watermark_ts timestamptz -- = source's order timestamp (born_at | timestamp | occurred_at)
last_watermark_id text -- = source PK rendered as text (int→text, uuid→text); uniform
-- ------------------------------------------------------------------------
last_run_at timestamptz
last_run_summary jsonb NN default '{}'
events_seen bigint NN default 0
attempts_written bigint NN default 0
dead_lettered bigint NN default 0
phase text -- 'seeding' | 'reconciling' | 'incremental' (doc 31 §4)
metadata jsonb NN default '{}' -- { ruleset_version, source_snapshot_ref, batch_size, coalesce_window, sources[] }
created_at timestamptz NN default now()
updated_at timestamptz NN default now()
-- intended UNIQUE: (worker_name, source_name)
Why last_watermark_id is text instead of the live uuid: the governance family must tail three sources with two different PK types: birth_registry.id (int), registry_changelog.id (int), event_outbox.id (uuid). A single column can hold all of them only if it is text. The text form is storage only. The keyset predicate compares the source's native (ts, id) columns against the watermark cast back to the source's type, so integers compare numerically; ts is the primary order key and id only breaks ties at identical timestamps. Comparing integer ids as text would misorder them ('10' sorts before '9') unless they were zero-padded, so the integer sources are never compared as text. This keeps one cursor family for all sources without per-source columns (no-hardcode).
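The store-as-text, compare-natively rule can be illustrated with a minimal Python sketch. The function names and the `ID_DECODERS` mapping are hypothetical and exist only for this illustration; the design itself only says that a source registers its id type in metadata.

```python
import uuid

# Hypothetical mapping from a source's registered id-type to a decoder.
# The design keeps (ts-col, id-col, id-type) in metadata; this layout is assumed.
ID_DECODERS = {"int": int, "uuid": uuid.UUID}


def encode_watermark_id(value):
    """Render a source PK (int or uuid.UUID) as text for last_watermark_id."""
    return str(value)


def decode_watermark_id(text, id_type):
    """Cast the stored text back to the source's native type for the predicate."""
    return ID_DECODERS[id_type](text)


# Text ordering misorders integers, so the predicate must compare natively.
ids = [9, 10, 100]
text_sorted = sorted(encode_watermark_id(i) for i in ids)
native_sorted = sorted(decode_watermark_id(t, "int") for t in text_sorted)

# A uuid watermark survives the same round trip unchanged.
sample_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # made-up value
round_tripped = decode_watermark_id(encode_watermark_id(sample_uuid), "uuid")
```

Sorting the encoded ids as strings puts '10' and '100' before '9', which is the misordering the typed predicate avoids.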
Keyset (seek) pagination — never OFFSET
-- birth_registry (int id, born_at fully populated but column nullable → guard)
SELECT id, born_at, collection_name, entity_code
FROM birth_registry
WHERE born_at IS NOT NULL
AND (born_at, id) > (:last_ts, :last_id_int)
ORDER BY born_at, id
LIMIT :batch_size; -- 2k–5k (Branch F #1)
- Primary watermark = id (int, NN, monotone) for birth; born_at is the order key but id guarantees uniqueness and NOT-NULL safety (the column is nominally nullable; the worker filters born_at IS NOT NULL and relies on id as the strict tie-breaker). For registry_changelog: (timestamp, id). For event_outbox: (occurred_at, id::text).
- Guarantees every row visited exactly once, lossless even as new rows arrive (append-only sources). Cursor advances only after the batch is durably processed (candidate-state writes committed).
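The seek loop described above can be sketched in Python against an in-memory SQLite table standing in for birth_registry. This is an illustration under assumptions, not the worker: SQLite replaces PostgreSQL, timestamps are ISO-8601 text, the sample rows are invented, and the starting watermark `("", -1)` is an assumed "before everything" sentinel (a NULL watermark would make the row comparison match nothing).

```python
import sqlite3


def fetch_batch(conn, last_ts, last_id, batch_size):
    """One keyset (seek) page: rows strictly after the (born_at, id) watermark."""
    return conn.execute(
        "SELECT id, born_at FROM birth_registry "
        "WHERE born_at IS NOT NULL AND (born_at, id) > (?, ?) "
        "ORDER BY born_at, id LIMIT ?",
        (last_ts, last_id, batch_size),
    ).fetchall()


def sweep(conn, batch_size):
    """Visit every non-null row once, advancing the watermark after each batch."""
    last_ts, last_id = "", -1  # assumed sentinel that sorts before all rows
    seen = []
    while True:
        batch = fetch_batch(conn, last_ts, last_id, batch_size)
        if not batch:
            return seen
        seen.extend(row[0] for row in batch)  # stand-in for durable processing
        last_id, last_ts = batch[-1]          # advance to the last row processed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birth_registry (id INTEGER PRIMARY KEY, born_at TEXT)")
conn.executemany(
    "INSERT INTO birth_registry VALUES (?, ?)",
    [
        (1, "2026-01-01T00:00:00"),
        (2, "2026-01-01T00:00:00"),  # same timestamp: id breaks the tie
        (3, "2026-01-01T00:00:00"),
        (4, "2026-01-02T00:00:00"),
        (5, None),                   # filtered by the IS NOT NULL guard
    ],
)
visited = sweep(conn, batch_size=2)
```

With a batch size of 2, the three rows sharing one timestamp straddle a page boundary, and the id tie-breaker is what keeps row 3 from being skipped or repeated.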
Cursor key (idempotency at the item grain)
Per processed item, the durable effect (a candidate-state upsert, SB-10) is keyed by (candidate_key, ruleset_version) where candidate_key = COALESCE(canonical_address, collection_name || ':' || entity_code). Because canonical_address is NULL for all 1.04M birth rows (live), the effective key is collection_name || ':' || entity_code. Re-processing the same row is a no-op upsert → resume is safe.
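As a small illustration of why re-processing is a no-op, the following Python sketch derives the key and performs an upsert into a plain dict. The dict stands in for the candidate-state store (SB-10), and the sample row values and ruleset version string are invented for the example.

```python
def candidate_key(canonical_address, collection_name, entity_code):
    """Mirror COALESCE(canonical_address, collection_name || ':' || entity_code)."""
    if canonical_address is not None:
        return canonical_address
    return f"{collection_name}:{entity_code}"


def upsert_candidate(store, row, ruleset_version):
    """Idempotent write keyed by (candidate_key, ruleset_version)."""
    key = (
        candidate_key(
            row.get("canonical_address"),
            row["collection_name"],
            row["entity_code"],
        ),
        ruleset_version,
    )
    store[key] = {"source_id": row["id"]}  # same key overwrites, never duplicates
    return key


store = {}  # stand-in for the candidate-state store
row = {"id": 7, "canonical_address": None,
       "collection_name": "sample_collection", "entity_code": "E-001"}

first_key = upsert_candidate(store, row, "ruleset-v1")
second_key = upsert_candidate(store, row, "ruleset-v1")  # replay after a crash
```

Replaying the row yields the same key and leaves a single entry, which is the property the resume path relies on.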
39.5 The five workers (one cursor family, no bespoke framework)
| worker_name | source(s) tailed | watermark | dirties / writes (gated) | doc |
|---|---|---|---|---|
| gov_backfill_sweep | birth_registry | (born_at, id) | seeds candidate-state rows; phase seeding→reconciling→incremental | 31 |
| gov_handoff_intake | birth_registry + registry_changelog (+ event_outbox for lifecycle) | per-source (ts, id) | dirty-marks on candidate-state; capture to event_pending if event type unregistered | 32 |
| gov_input_gate | candidate-state rows pending input_quality_state | candidate scan_time/seq | writes input_quality_state verdict | 33 |
| gov_candidate_scan | candidate-state dirty=true + stale_after-expired set | candidate seq | writes candidate_verdict, clears dirty | 34 |
| gov_periodic_full_audit | full inventory (bounded reconciliation) | full-sweep checkpoint | re-captures snapshot (SB-12), re-validates every candidate past TTL | 34 §5 |
Each is a GOV-SIV Tier-A read/propose worker (doc 35 §3.4 DOTs); the only mutating member is T6's dot_governance_assignment_apply (GOV-DOT, NO-GO). Each is paired with a test DOT (Đ35 A/B).
39.6 Lock / lease, pause/resume, checkpoint payload
- Lease (reuse queue_heartbeat): before processing, a worker claims queue_heartbeat(executor_name=worker_name, executor_kind='PG_worker'|'external_worker', lease_owner=<instance-id>). One owner per (worker_name, source): a second instance seeing a live lease_owner with a fresh last_tick_at backs off (cursor_lease_conflict if it persists). Lease expiry = last_tick_at older than a TTL (mirrors job_queue.lease_until idiom). This gives single-writer-per-scope without a new lock table.
- Heartbeat (Đ45 silent-gap): worker ticks queue_heartbeat.last_tick_at/ticks_total each batch; emits governance.<worker>.heartbeat (SB-11). A missed heartbeat beyond threshold = cursor_silent_gap (high→critical), so "quiet vs dead" is distinguishable.
- Pause/resume: pausing = stop ticking/processing; the committed (last_watermark_ts, last_watermark_id) is the durable resume point. Resume re-claims the lease and re-seeks from the watermark. No state lost, none double-committed (idempotent upserts §39.4).
- Checkpoint payload: last_run_summary jsonb = {batch_count, rows_seen, dirty_marked, dlq, snapshot_ref, ruleset_version, started_at, ended_at}. metadata carries config (batch_size, coalesce_window, sources). The periodic-full worker additionally checkpoints its sweep position so a multi-hour audit resumes mid-sweep.
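The lease and heartbeat rules above can be modeled in a few lines of Python. This is an in-memory sketch only: the `Heartbeat` dataclass mirrors a subset of the queue_heartbeat columns, the 60-second TTL and instance ids are assumptions, and a real claim would need to be a single atomic conditional write in the database rather than a read-then-set.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Heartbeat:
    """In-memory stand-in for one queue_heartbeat row (subset of columns)."""
    executor_name: str
    lease_owner: Optional[str] = None
    last_tick_at: Optional[datetime] = None
    ticks_total: int = 0


def try_claim(hb, instance_id, now, ttl):
    """Claim the lease if it is free, already ours, or expired."""
    expired = hb.last_tick_at is None or now - hb.last_tick_at > ttl
    if hb.lease_owner in (None, instance_id) or expired:
        hb.lease_owner = instance_id
        hb.last_tick_at = now
        return True
    return False  # live owner with a fresh tick: back off


def tick(hb, instance_id, now):
    """Per-batch heartbeat; only the current lease owner may tick."""
    if hb.lease_owner != instance_id:
        raise RuntimeError("not the lease owner")
    hb.last_tick_at = now
    hb.ticks_total += 1


t0 = datetime(2026, 6, 1, tzinfo=timezone.utc)
ttl = timedelta(seconds=60)  # assumed TTL, not specified by the design
hb = Heartbeat("gov_backfill_sweep")

first = try_claim(hb, "instance-a", t0, ttl)
second = try_claim(hb, "instance-b", t0 + timedelta(seconds=10), ttl)
tick(hb, "instance-a", t0 + timedelta(seconds=20))
takeover = try_claim(hb, "instance-b", t0 + timedelta(seconds=120), ttl)
```

Instance b is refused while instance a's tick is fresh and takes over only once the last tick is older than the TTL, which is the expiry rule the lease bullet describes.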
39.7 Retry counters, DLQ, idempotency, coalesce, resource budget, observability
- Retry (reuse event_pending): a failing item is staged to event_pending (error_count++, last_error); bounded exponential backoff up to N attempts. The cursor does not advance past an unretired low-water item silently.
- DLQ: after N attempts → increment gov_worker_cursor.dead_lettered and raise cursor_dlq/handoff_dlq (doc 32 §8); sweep continues. DLQ over threshold → *_dlq_overflow (high→critical). Dead-letter, never drop → worst case is delayed + visible, never missing (doc 32 §6 no-lost-handoff).
- Idempotency: consume-once keyed by (source, watermark); durable effect keyed by (candidate_key, ruleset_version) (upsert). Replay = reset watermark and re-tail; safe because effects are idempotent.
- Coalesce: within a poll/coalesce window, multiple changes to the same group_key collapse to one dirty-mark (reuse T7 coalesce_key; doc 32 §6). A dirty-storm (one tick dirties > configured fraction of groups) → group_invalidation_storm + throttle (doc 34 §4, Branch F).
- Resource budget (Branch F): batch 2k–5k (each read < 5 s); one worker per scope; parallelism only over disjoint group_key ranges; off-peak throttle + server-load guard (pause on lock-wait/replication-lag breach); no long-held read txns; no UI full-table scan (UI reads summary views only, Đ28).
- Observability: gov_worker_cursor counters (events_seen, attempts_written, dead_lettered) + queue_heartbeat freshness + last_run_summary. Metrics surfaced: cursor lag (now − last_watermark_ts), dirty-queue depth, stale count, DLQ depth, heartbeat freshness, batches done/total. No silent caps: sampling/top-N/DLQ-truncation each emit a summary finding.
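The retry and dead-letter path above can be sketched as follows. Everything here is illustrative: the base delay, cap, and attempt limit are assumed values, `attempts_written` is treated as a count of successful writes (the design does not define it precisely), and the `dlq` list stands in for staging in event_pending.

```python
def backoff_seconds(attempt, base=1.0, cap=60.0):
    """Bounded exponential backoff: base * 2**(attempt - 1), capped."""
    return min(cap, base * 2 ** (attempt - 1))


def process_with_retry(item, handler, max_attempts, counters, dlq,
                       sleep=lambda seconds: None):
    """Try an item up to max_attempts; dead-letter it instead of dropping it."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            handler(item)
            counters["attempts_written"] += 1
            return True
        except Exception as exc:  # staged like error_count / last_error
            last_error = str(exc)
            if attempt < max_attempts:
                sleep(backoff_seconds(attempt))
    counters["dead_lettered"] += 1
    dlq.append({"item": item, "error_count": max_attempts,
                "last_error": last_error})
    return False


def handler(item):
    """Hypothetical handler that fails on one poison item."""
    if item == "poison":
        raise ValueError("cannot parse")


counters = {"events_seen": 0, "attempts_written": 0, "dead_lettered": 0}
dlq, delays = [], []
for item in ["a", "poison", "b"]:
    counters["events_seen"] += 1
    process_with_retry(item, handler, 3, counters, dlq, sleep=delays.append)
```

The poison item ends up in the dead-letter list with its last error while the items after it are still processed, matching the "sweep continues, item stays visible" behavior.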
39.8 Failure modes
| Mode | Behavior |
|---|---|
| Crash mid-batch | Restart re-claims lease, re-seeks from committed watermark; uncommitted batch re-processed idempotently. |
| Two instances race | Lease conflict detected via queue_heartbeat.lease_owner + fresh last_tick_at; loser backs off; cursor_lease_conflict if stuck. |
| Watermark regression (clock skew / bad write) | Reject watermark < current committed (monotone guard); raise cursor_watermark_regression (high); hold. |
| born_at null row (column nullable) | Filtered by WHERE born_at IS NOT NULL; such rows (none today) routed to an id-only reconciliation pass so they are not silently skipped. |
| Source PK type surprise | last_watermark_id text + per-source typed predicate handles int and uuid uniformly; a new source registers its (ts-col, id-col, id-type) in metadata (no-hardcode). |
| Lost dirty signal | Periodic full audit (TTL net) re-validates within TTL — the guaranteed catch-all. |
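The monotone guard in the watermark-regression row can be sketched as a tuple comparison. The exception name and the `advance` helper are hypothetical; the sketch assumes the watermark has already been decoded to native (ts, id) values and, following the table, rejects only strictly smaller proposals.

```python
from datetime import datetime, timezone


class WatermarkRegression(Exception):
    """Raised when a proposed watermark sorts before the committed one."""


def advance(committed, proposed):
    """Return the new committed (ts, id) watermark, or refuse a regression."""
    if committed is not None and proposed < committed:
        raise WatermarkRegression(f"{proposed} < {committed}")
    return proposed


t1 = datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2026, 6, 1, 12, 5, tzinfo=timezone.utc)

committed = advance(None, (t1, 100))       # first commit
committed = advance(committed, (t1, 101))  # same ts, higher id: allowed
committed = advance(committed, (t2, 5))    # later ts wins even with a lower id

try:
    advance(committed, (t1, 999))          # earlier ts: regression
    regression_caught = False
except WatermarkRegression:
    regression_caught = True
```

Because tuples compare element by element, a later timestamp with a lower id is still an advance, while any proposal with an earlier timestamp is held back for a cursor_watermark_regression finding.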
39.9 No-hardcode / no-island attestation
- No-hardcode: worker names, source registries, watermark columns, batch sizes, coalesce windows are data (gov_worker_cursor.source_name/metadata), not code arrays. A new tail source = a new cursor row, not a code change. No object-class or axis list anywhere.
- No-island: lease + heartbeat reuse the one queue_heartbeat; retry reuses the one event_pending; the cursor family mirrors the one live iu_route_worker_cursor shape. The only additive object is gov_worker_cursor (a governance-scoped twin, not a parallel worker framework). Jobs stay in job_queue (Đ45 boundary); signals stay in event_outbox (SB-11). No second queue, no second lock manager, no second heartbeat.
39.10 Acceptance tests (build-time; cannot run now)
- Resumability: kill a worker mid-sweep → restart processes exactly the unprocessed remainder; zero gap, zero double-commit (verified by candidate-state row count + idempotency-key uniqueness).
- Type generality: one gov_worker_cursor family successfully tails birth_registry (int id), registry_changelog (int id, non-tz ts), and event_outbox (uuid id) with correct keyset ordering each.
- Lease safety: two concurrent instances of the same worker → exactly one processes; the other backs off; no double-write.
- DLQ-not-drop: inject a poison item → after N retries it is dead-lettered + a finding raised; the sweep completes; the item is visible, not lost.
- Heartbeat: stop a worker → cursor_silent_gap fires within threshold; restart clears it.
- Scale: seed 1,037,724 rows in 2k–5k batches, each read < 5 s; no OFFSET; cursor advances monotonically.
- No-UI-scan: UI/ops reads only summary views; the raw cursor/candidate sweep is never exposed to a UI full-table query.
39.11 Dependencies, gates, verdict
- Designable now: YES (done). Build now: NO.
- Build gates: gov_worker_cursor DDL = gated (operator, reversible-by-default); writing queue_heartbeat/event_pending governance rows = part of GCOS build (gated with SB-10/SB-11); running any worker requires the governance event domain active (SB-11) and the candidate-state store (SB-10). C-7 owns the observer-trigger question (doc 32 §4 Option B); it is not a cursor concern (default Option A cursor-tail needs no trigger).
- No COMMIT (os_proposal_approvals=0).
SB-13 design verdict: COMPLETE — GO for build-prep, BUILD NO-GO. Decision = NEW gov_worker_cursor (reuse-shaped, type-generalized watermark) + REUSE queue_heartbeat (lease+heartbeat) + REUSE event_pending (retry/DLQ). The single material correction to the prior GCOS docs: the live iu_route_worker_cursor.last_event_id is uuid, incompatible with the int-keyed birth_registry/registry_changelog; the watermark must be type-generalized (text + typed predicate), and the idempotency key must be (collection_name, entity_code) because canonical_address is universally NULL in birth_registry.
(Cross-refs: doc 31 §4/§6, doc 32 §5/§6, doc 34 §5, doc 35 §3.4 + Branch F #1/#2/#9/#11/#12, doc 38 SB-12, doc 40 SB-10, doc 41 SB-11.)