KB-1B34

Dieu 45 Phase 2 — Heartbeat Activation + Lease/DLQ Governance LIVE APPLY (PASS)

Revision 1
Tags: dieu45, phase2, heartbeat, lease-governance, dlq-governance, live-apply, pass, 2026-05-26

  • Status: DIEU45_PHASE_2_HEARTBEAT_ACTIVATION_LEASE_GOVERNANCE_PASS
  • Date: 2026-05-26
  • Channel: SSH contabo → docker exec postgres psql workflow_admin@directus (PostgreSQL 16.13)
  • Migration: dieu45_phase2_mig_051.sql (518 lines, single transaction COMMITTED)
  • Backups: pre = 82,970,890 B → post = 83,187,232 B (+216,342 B)
  • Parent phase: Phase 1 (mig 050) — …/v0.6-dieu45-phase-1-minimal-job-substrate-live-apply/

Mission scope honored

  • Heartbeat governance activated (queue.heartbeat.enabled=true); silent gap of legacy iu_outbound_default worker now surfaces in v_queue_health and fn_queue_stale_check.
  • Lease governance shipped: dry-run preview + triple-gated apply reaper (the substrate gate plus two lease gates).
  • DLQ governance shipped: dry-run requeue preview + triage update.
  • No worker started. queue.worker.enabled=false, queue.job_substrate.enabled=false.
  • No pg_cron. pg_extension set unchanged: {btree_gist, pgcrypto, plpgsql, postgres_fdw}.
  • No CHECK widening. queue_heartbeat.executor_kind keeps the 7-value MOT-excluded vocab (§11.5).
  • No event_outbox schema mutation. 9 CHECK constraints unchanged.
  • No MARK/CUT alias touch. fn_iu_op_* 5 aliases unchanged.
  • No production_documents. Absent throughout.

What was added (durable)

dot_config keys (+4 new, 1 flipped)

| Key | Value | Purpose |
| --- | --- | --- |
| queue.lease.reaper_enabled | false | Master gate for fn_job_reap_stale_leases_apply |
| queue.lease.reaper_dry_run_only | true | Safety gate; refuses mutation even with reaper_enabled=true |
| queue.dlq.replay_enabled | false | Master gate for any DLQ replay mutator |
| queue.runtime.phase | phase2_governance | Phase marker |
| queue.heartbeat.enabled | false → true (flipped) | Heartbeat protocol armed (no worker started) |

Total queue.* keys: 8 → 12.

queue_heartbeat (+1 passive marker row)

  • executor_name='iu_outbound_default', executor_kind='PG_worker'
  • last_tick_at='2026-05-22 11:31:41.928657+00' (bound to real iu_route_worker_cursor.last_run_at)
  • last_tick_status='warn', ticks_total=0
  • metadata: marker=legacy_silent_passive, registered_by Phase 2 mig 051, explicit not-an-active-claim note

This is a passive marker, not a heartbeat. It exists so fn_queue_stale_check and v_queue_health.executors_stale deterministically surface the §15.5 silent gap.
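
The insert-only behaviour of the passive marker can be pictured with a small in-memory model. This is a Python sketch under assumptions, not the body of fn_queue_heartbeat_register_passive: the `registry` dict, the `register_passive` helper, and the `heartbeat_disabled` refusal text are hypothetical, while `already_registered` is the refusal name listed in the bounded proof.

```python
from datetime import datetime, timezone

def register_passive(registry, config, executor_name, executor_kind, last_observed_at):
    """Insert-only registration: an existing row is never updated."""
    if config.get("queue.heartbeat.enabled") != "true":
        return "refused: heartbeat_disabled"   # hypothetical refusal text
    if executor_name in registry:
        return "refused: already_registered"   # existing row is left untouched
    registry[executor_name] = {
        "executor_kind": executor_kind,
        "last_tick_at": last_observed_at,
        "last_tick_status": "warn",  # explicit non-healthy hint
        "ticks_total": 0,            # a marker, not a live heartbeat
    }
    return "registered"

config = {"queue.heartbeat.enabled": "true"}
registry = {}
observed = datetime(2026, 5, 22, 11, 31, 41, 928657, tzinfo=timezone.utc)

first = register_passive(registry, config, "iu_outbound_default", "PG_worker", observed)
second = register_passive(registry, config, "iu_outbound_default", "PG_worker", observed)
print(first, "/", second)
```

Calling the helper twice leaves exactly one row, which is the property that lets the marker survive reruns without ever counting as a tick.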

Functions (+5 new)

  1. fn_queue_heartbeat_register_passive(executor_name, executor_kind, last_observed_at, actor, note) — SECURITY DEFINER, gated by queue.heartbeat.enabled, idempotent insert only.
  2. fn_job_reap_stale_leases_dry_run(limit) — STABLE INVOKER, lists stale-leased jobs with would_action preview.
  3. fn_job_reap_stale_leases_apply(actor, limit) — SECURITY DEFINER, triple-gated (queue.job_substrate.enabled AND queue.lease.reaper_enabled AND NOT queue.lease.reaper_dry_run_only), FOR UPDATE SKIP LOCKED, resets to retry_waiting with backoff or routes to DLQ at max_attempts.
  4. fn_job_dead_letter_requeue_dry_run(dead_letter_id) — STABLE, surfaces replay_gate_enabled + idempotency_collision + would_action.
  5. fn_job_dead_letter_triage_update(dead_letter_id, triage_status, actor, triage_note) — SECURITY DEFINER, vocab pinned to existing CHECK {pending, acknowledged, manual_replay, escalated, closed}. Does NOT requeue.

Total fn_job_* + fn_queue_* (excluding the trigger fn): 7 → 12.
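
The triple gate on fn_job_reap_stale_leases_apply can be sketched as a fail-closed check over the three dot_config keys. This is an illustrative Python model, not the PL/pgSQL source: the `GATES` list, the `reaper_apply_allowed` helper, and the refusal message format are assumptions (the format is modelled on the DLQ dry-run refusal text).

```python
# Each gate is (dot_config key, value required for mutation to proceed).
GATES = [
    ("queue.job_substrate.enabled", "true"),
    ("queue.lease.reaper_enabled", "true"),
    ("queue.lease.reaper_dry_run_only", "false"),
]

def reaper_apply_allowed(config):
    """Return (allowed, reason). A missing or mismatched key closes the gate."""
    for key, required in GATES:
        actual = config.get(key)
        if actual != required:
            return False, f"refused: {key}={actual}"
    return True, "ok"

# Durable state after mig 051: every gate is closed.
durable = {
    "queue.job_substrate.enabled": "false",
    "queue.lease.reaper_enabled": "false",
    "queue.lease.reaper_dry_run_only": "true",
}
# State assumed inside the rolled-back proof transaction: every gate open.
proof_tx = {
    "queue.job_substrate.enabled": "true",
    "queue.lease.reaper_enabled": "true",
    "queue.lease.reaper_dry_run_only": "false",
}

print(reaper_apply_allowed(durable))
print(reaper_apply_allowed(proof_tx))
```

Checking the gates in a fixed order means the refusal always names the first closed gate, which keeps refusal messages deterministic.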

Silent-gap status (closes the §15.5 risk visibly)

Before:

  • iu_outbound_default.last_run_at = 2026-05-22 11:31:41 (silent ~96h)
  • Not visible in v_queue_health or fn_queue_stale_check because the row did not exist in queue_heartbeat.

After:

  • Same last_run_at, now mirrored as a passive queue_heartbeat row.
  • v_queue_health.executors_stale = 1.
  • fn_queue_stale_check() returns stale_count=1 with age_seconds≈354,193 (~98h), executor_name=iu_outbound_default.
  • last_tick_status='warn' so the row carries an explicit non-healthy hint.
  • The worker is not claimed as active; the marker is explicitly passive.
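
The reported age can be checked by hand. The Python sketch below reproduces the arithmetic only; `STALE_AFTER_SEC` is a hypothetical threshold (the real one is not stated in this summary), and the observation time is reconstructed from the reported age rather than taken from a log.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER_SEC = 300  # hypothetical threshold, for illustration only

def age_seconds(last_tick_at, now):
    """Seconds elapsed since the last recorded tick."""
    return (now - last_tick_at).total_seconds()

last_tick_at = datetime(2026, 5, 22, 11, 31, 41, 928657, tzinfo=timezone.utc)
# Reconstructed observation time: last tick plus the reported 354,193 s.
now = last_tick_at + timedelta(seconds=354_193)

age = age_seconds(last_tick_at, now)
is_stale = age > STALE_AFTER_SEC
print(f"age={age:,.0f}s (~{age / 3600:.1f}h) stale={is_stale}")
# prints: age=354,193s (~98.4h) stale=True
```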

Bounded proof — 11/11 PASS in BEGIN/ROLLBACK

See 05-bounded-proof-results.md. Highlights:

  • Reset path: 1 stale-leased job reset to retry_waiting with backoff_sec=10, attempts 0→1.
  • DLQ path: 1 stale-leased job (attempts=4 pre-loaded) moved to job_dead_letter at attempts=5.
  • Triage update: pending → acknowledged.
  • DLQ requeue dry-run: would_action = "refused: queue.dlq.replay_enabled=false".
  • Ack path: a fresh-lease job acked → succeeded.
  • Heartbeat tick: armed protocol accepts ticks (test executor phase2_proof_executor, ticks_total=1).
  • Stale check: 1 fresh + 1 stale classified correctly.
  • 7 refusals verified: already_registered, invalid_triage_status, actor_required, dry_run_only gate, reaper_enabled gate, not_found, payload denylist {vector} blocked.
  • ROLLBACK restored: job_queue=0, job_dead_letter=0, queue_heartbeat=1 (passive marker survives), gates restored.
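
The reset and DLQ paths above amount to one decision per stale-leased job. The Python sketch below models that decision under stated assumptions: `MAX_ATTEMPTS = 5` is inferred from the DLQ case (attempts 4 → 5), the `would_action` labels are hypothetical, and the exponential backoff schedule is a guess in which only the first step (10 s) is confirmed by the proof.

```python
MAX_ATTEMPTS = 5       # assumption: inferred from the DLQ case (4 -> 5)
BASE_BACKOFF_SEC = 10  # matches the observed first-retry backoff

def reap_decision(attempts, max_attempts=MAX_ATTEMPTS):
    """Decide what the reaper would do with one stale-leased job."""
    next_attempts = attempts + 1
    if next_attempts >= max_attempts:
        return {"would_action": "dead_letter", "attempts": next_attempts}
    # Assumed exponential schedule; only the 10 s first step is confirmed.
    backoff = BASE_BACKOFF_SEC * 2 ** (next_attempts - 1)
    return {
        "would_action": "retry_waiting",
        "attempts": next_attempts,
        "backoff_sec": backoff,
    }

reset_case = reap_decision(0)  # fresh job whose lease went stale
dlq_case = reap_decision(4)    # job pre-loaded with attempts=4
print(reset_case)
print(dlq_case)
```

A dry-run would report these decisions without mutating anything; the apply path would act on them only when all gates are open.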

Regression — PASS

| Surface | Phase 1 post | Phase 2 post | Δ |
| --- | --- | --- | --- |
| tables | 271 | 271 | 0 |
| views | 55 | 55 | 0 |
| fns | 515 | 520 | +5 (Phase 2 deliverables) |
| dot_config keys | 94 | 98 | +4 (Phase 2 deliverables) |
| queue.* keys | 8 | 12 | +4 |
| queue_heartbeat rows | 0 | 1 | +1 (passive marker) |
| job_queue rows | 0 | 0 | 0 |
| job_dead_letter rows | 0 | 0 | 0 |
| event_outbox rows | 133,784 (baseline) | 134,803 | +1,019 (background only; 0 with phase2 actor_ref or canonical_address) |
| iu_route_worker_cursor.last_run_at | 2026-05-22 11:31:41 | 2026-05-22 11:31:41 | unchanged (still silent) |
| iu_vector_sync_point | 152 | 152 | 0 |
| information_unit | 175 | 175 | 0 |
| pg_extensions | btree_gist/pgcrypto/plpgsql/postgres_fdw | unchanged | 0 |
| production_documents | absent | absent | unchanged |
| fn_iu_op_* aliases | 5 | 5 | unchanged |
| event_outbox CHECKs | 9 | 9 | unchanged |
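
A regression check of this shape can be expressed as a diff between two count snapshots. The Python sketch below uses the numeric rows from the table; the `unexpected_deltas` helper and the choice to treat event_outbox growth as background traffic are illustrative assumptions, not the tooling used in this run.

```python
phase1 = {
    "tables": 271, "views": 55, "fns": 515, "dot_config keys": 94,
    "queue.* keys": 8, "queue_heartbeat rows": 0, "job_queue rows": 0,
    "job_dead_letter rows": 0, "event_outbox rows": 133_784,
    "iu_vector_sync_point": 152, "information_unit": 175,
}
phase2 = {
    "tables": 271, "views": 55, "fns": 520, "dot_config keys": 98,
    "queue.* keys": 12, "queue_heartbeat rows": 1, "job_queue rows": 0,
    "job_dead_letter rows": 0, "event_outbox rows": 134_803,
    "iu_vector_sync_point": 152, "information_unit": 175,
}
# Deltas the migration is expected to cause; every other surface should be 0.
expected = {"fns": 5, "dot_config keys": 4, "queue.* keys": 4, "queue_heartbeat rows": 1}
# Surfaces that grow from unrelated background traffic (assumption).
background = {"event_outbox rows"}

def unexpected_deltas(before, after, expected, background):
    """Return {surface: delta} for every delta that was not planned."""
    deltas = {name: after[name] - before[name] for name in before}
    return {
        name: delta
        for name, delta in deltas.items()
        if name not in background and delta != expected.get(name, 0)
    }

outbox_growth = phase2["event_outbox rows"] - phase1["event_outbox rows"]
surprises = unexpected_deltas(phase1, phase2, expected, background)
print(outbox_growth, surprises)
```

An empty result means every change is accounted for by the Phase 2 deliverables; background surfaces still need the separate marker check noted in the table.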

Forbidden list — all 14 honored

  • No pg_cron install — confirmed
  • No worker/daemon/cron started — confirmed
  • No event_outbox row from phase2 — confirmed (0 rows with phase2* markers)
  • No event_outbox schema change — confirmed (9 CHECKs intact)
  • No event_type_registry / iu_sql_event_route CHECK widening — confirmed
  • No event_outbox row mutation — confirmed
  • No Qdrant touch — confirmed (no MCP/SQL Qdrant call)
  • No production_documents touch — confirmed (absent throughout)
  • No MARK/CUT alias touch — confirmed (5 fn_iu_op_* intact)
  • No START-HERE change — confirmed (no file edits this run)
  • No Điều 45 text change — confirmed
  • No customer/email/MOT runtime — confirmed
  • No queue.worker.enabled=true flip — confirmed (kept false)
  • No queue.job_substrate.enabled=true durable flip — confirmed (kept false; only temporarily flipped inside the rolled-back proof TX)

Next phase

See 08-next-phase-recommendation.md. Recommended pack: DIEU45_PHASE_3_ROUTE_PILOT_DRY_RUN_AND_REAL_HEARTBEAT_CALLER — wire an external operator/Hermes ping that calls fn_queue_heartbeat_tick (closing the gap durably), and stand up a route-pilot dry-run for a single low-risk job_kind in dot_iu_route_worker_run's seam.

  • Parent enacted law: [[project-dieu45-v1-0-enacted-2026-05-26]]
  • Phase 1 substrate: [[project-dieu45-phase1-minimal-job-substrate-live-apply-pass-2026-05-26]]
  • Design pack: [[project-dieu45-full-queue-orchestration-design-pack-dp1-to-dp7-pass-2026-05-26]]
  • Silent-gap lesson: [[feedback-dieu45-silent-gap-violation-post-enactment]]
  • Heartbeat pattern lesson: [[feedback-hc-executor-last-run-is-proven-heartbeat-pattern]]
Source: knowledge/dev/laws/dieu44-trien-khai/v0.6-dieu45-phase-2-heartbeat-activation-lease-governance/00-summary.md