KB-7B04

03 — Migration 050 Phase 1 Substrate (authored + applied)

10 min read Revision 1
dieu-45phase-1migration050job-queuejob-dead-letterqueue-heartbeatapplied2026-05-26

03 — Migration 050 Phase 1 Substrate

File: /root/dieu45-phase1/050-dieu45-phase1-minimal-job-substrate.sql (679 lines) Applied: single TX, ON_ERROR_STOP=1, COMMITTED 2026-05-26 11:14 UTC. Live-schema-wins drift adjustments: none required — dot_config shape and CHECK conventions matched pre-survey.

Schema delivered

Table 1: public.job_queue

Column Type Notes
job_id uuid PK DEFAULT gen_random_uuid()
job_kind text NOT NULL CHECK btrim<>''
state text NOT NULL DEFAULT 'queued' 9-state CHECK
source_ref text signal-only
target_ref text signal-only
payload_ref text signal-only
payload_json jsonb NOT NULL DEFAULT '{}' denylist CHECK
priority integer NOT NULL DEFAULT 100 nonneg
scheduled_at timestamptz NOT NULL DEFAULT now() retry/backoff target
picked_at timestamptz set by fn_job_claim
finished_at timestamptz required when terminal
lease_owner text paired with lease_until
lease_until timestamptz paired with lease_owner
attempts integer NOT NULL DEFAULT 0 nonneg
max_attempts integer NOT NULL DEFAULT 5 >=1
idempotency_key text NOT NULL partial-unique on active states
actor text NOT NULL CHECK btrim<>''
run_id uuid optional dot_iu_command_run link
last_error text
created_at timestamptz NOT NULL DEFAULT now()
updated_at timestamptz NOT NULL DEFAULT now() trigger-refreshed

9-state CHECK (job_queue_state_check):

state = ANY (ARRAY[
    'queued','leased','in_progress','succeeded','failed',
    'retry_waiting','dead_letter','cancelled','cleaned'
])

Payload denylist CHECK (job_queue_payload_safe_check) mirrors event_outbox.safe_payload_check verbatim with 10 keys: body, content, raw, vector, embedding, secret, token, password, ssn, personal_data.

Other CHECKs:

  • attempts >= 0
  • max_attempts >= 1
  • priority >= 0
  • lease consistency: both lease_owner and lease_until null OR both set
  • terminal states must have finished_at set

Indexes:

  • job_queue_idempotency_active_uqpartial UNIQUE on idempotency_key WHERE state IN ('queued','leased','in_progress','retry_waiting'). Allows replay after succeeded/dead_letter/cancelled/cleaned.
  • job_queue_state_scheduled_idx(state, scheduled_at, priority) WHERE state IN ('queued','retry_waiting') — used by fn_job_claim.
  • job_queue_lease_until_idx(lease_until) WHERE state = 'leased' — used by Phase 2 lease reaper.
  • job_queue_job_kind_state_idx(job_kind, state) — used by views.
  • job_queue_created_at_idx(created_at DESC) — used by retention.

Table 2: public.job_dead_letter

Separate from hot queue per DP3. No FK to job_queue so DLQ survives Phase-7 retention sweep.

Column Type Notes
dead_letter_id uuid PK
job_id uuid NOT NULL unique row, not FK
job_kind, source_ref, target_ref, payload_ref, payload_json copied from origin
final_error text NOT NULL CHECK btrim<>''
attempts integer NOT NULL nonneg
actor, run_id, idempotency_key copied
moved_at timestamptz NOT NULL DEFAULT now()
moved_by text NOT NULL CHECK btrim<>''
triage_status text NOT NULL DEFAULT 'pending' CHECK enum
triage_note, triaged_at, triaged_by nullable, set on triage

Triage CHECK: pending, acknowledged, manual_replay, escalated, closed.

payload_json CHECK identical 10-key denylist.

Table 3: public.queue_heartbeat

Column Type Notes
executor_name text PK CHECK btrim<>''
executor_kind text NOT NULL CHECK §11.5 7-name set
last_tick_at timestamptz NOT NULL DEFAULT now()
last_tick_status text NOT NULL DEFAULT 'ok' CHECK ok/warn/error
ticks_total bigint NOT NULL DEFAULT 0 UPSERT increments
current_job_id uuid optional
lease_owner text optional
metadata jsonb NOT NULL DEFAULT '{}' denylist CHECK
created_at, updated_at timestamptz NOT NULL trigger-refreshed

executor_kind CHECK (Điều 45 §11.5 verbatim):

DOT, Agent, Hermes, Codex, PG_worker, external_worker, future_Kestra_adapter

metadata CHECK identical 10-key denylist.

Views

  1. v_queue_health — combines per-executor queue_heartbeat rows (with fresh|warning|stale classification using queue.heartbeat.stale_threshold_seconds and 2×) with job_queue state distribution and job_dead_letter triage backlog. 25 columns total including 4 gate-value mirrors.
  2. v_job_queue_backlog(job_kind, state) rollup with row_count, oldest/newest created_at, next scheduled, avg/max attempts.
  3. v_job_dead_letter_summary(job_kind, triage_status) rollup.

Functions (8 new)

Function Security Returns Gate
fn_job_enqueue (11 params) DEFINER jsonb queue.job_substrate.enabled
fn_job_claim (3 params) DEFINER jsonb queue.job_substrate.enabled
fn_job_ack (3 params) DEFINER jsonb queue.job_substrate.enabled
fn_job_move_to_dead_letter (3 params) DEFINER jsonb implicit via callers
fn_job_fail_or_retry (4 params) DEFINER jsonb queue.job_substrate.enabled
fn_queue_heartbeat_tick (6 params) DEFINER jsonb queue.heartbeat.enabled
fn_queue_stale_check (1 param) INVOKER STABLE jsonb none (read-only, passive)
fn_job_queue_updated_at (trigger fn) INVOKER trigger n/a

Triggers

  • tr_job_queue_updated_at BEFORE UPDATE ON job_queuefn_job_queue_updated_at()
  • tr_queue_heartbeat_updated_at BEFORE UPDATE ON queue_heartbeat → same fn

(TRIGGER-GUARD event-trigger emitted advisory WARNINGs at CREATE TRIGGER — non-blocking, expected, mirrors prior project pattern.)

dot_config keys inserted (8)

Key Value Role
queue.job_substrate.enabled false master gate for fn_job_*
queue.worker.enabled false reserved for Phase 2 worker loop
queue.notify.enabled false reserved for Phase 4 NOTIFY bridge
queue.heartbeat.enabled false gate for fn_queue_heartbeat_tick
queue.heartbeat.stale_threshold_seconds 300 tuning
queue.retry.max_attempts_default 5 tuning
queue.retry.backoff_base_sec 10 tuning
queue.lease.duration_sec 300 tuning

ON CONFLICT (key) DO NOTHING keeps the migration idempotent.

Idempotency-key uniqueness — partial-unique design

CREATE UNIQUE INDEX job_queue_idempotency_active_uq
    ON public.job_queue (idempotency_key)
    WHERE state IN ('queued','leased','in_progress','retry_waiting');

Consequence:

  • Concurrent enqueue with same key → exactly one row; second call returns duplicate=true after pre-check (verified in proof step 4).
  • After succeeded/dead_letter/cancelled/cleaned → same key can be re-enqueued, creating a new job_id.

Retry backoff formula

backoff_sec = queue.retry.backoff_base_sec * (1 << LEAST(attempts - 1, 10))
scheduled_at = now() + backoff_sec

With base=10 and cap 2^10:

  • attempt 1 → 10 s
  • attempt 2 → 20 s
  • attempt 3 → 40 s
  • attempt 4 → 80 s
  • attempt 5 → 160 s
  • … attempt 10 → 5120 s
  • attempt 11+ → 10240 s (capped)

Apply log (verbatim tail)

BEGIN
SET
SET
CREATE TABLE              -- job_queue
CREATE INDEX (×5)
CREATE TABLE              -- job_dead_letter
CREATE INDEX (×3)
CREATE TABLE              -- queue_heartbeat
CREATE INDEX (×1)
INSERT 0 8                -- 8 dot_config keys
CREATE FUNCTION (×7)
CREATE VIEW (×3)
CREATE FUNCTION           -- fn_job_queue_updated_at
NOTICE: trigger ... does not exist, skipping
DROP TRIGGER
WARNING:  [TRIGGER-GUARD] DDL detected: CREATE TRIGGER ...
CREATE TRIGGER            -- tr_job_queue_updated_at
NOTICE: trigger ... does not exist, skipping
DROP TRIGGER
WARNING:  [TRIGGER-GUARD] DDL detected: CREATE TRIGGER ...
CREATE TRIGGER            -- tr_queue_heartbeat_updated_at
COMMIT

Read-back delta

Surface Pre Post Δ
Tables 268 271 +3 ✓
Views 52 55 +3 ✓
Functions 507 515 +8 ✓
dot_config 86 94 +8 ✓
pg_dump -Fc 82,775,116 B 82,809,926 B +34,810 B
job_queue rows n/a 0 inert
job_dead_letter rows n/a 0 inert
queue_heartbeat rows n/a 0 inert

DOT catalog

Deferred to Phase 2 carry-forward per the apply-prompt allowance: "If DOT governance requires additional birth/review artifacts, either satisfy them or defer DOT catalog to carry-forward." dot_iu_command_catalog columns (command_name, category, mutating, reversible, target_functions, registered_at) are incompatible with the executor/job_kind shape proposed by DP6 consumer_registry (Phase 4). Inserting Phase-1 DOT entries now would either lock in a shape DP6 must rewrite, or use the dot_iu_* shape and conflate IU-CUT semantics. Both unsafe. Deferred.

Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/v0.6-dieu45-phase-1-minimal-job-substrate-live-apply/03-migration-phase-1-substrate.md