KB-2C50

14 — Implementation Roadmap (Phase 1–7)

Revision 1
Tags: design-pack, dieu-45, roadmap, phase-1, phase-7, implementation-plan, design-only

DESIGN-ONLY. No phase implemented here. Each phase is a future pack with its own Hard Gate + Council ratification. Phase order is chosen for safety, not delivery speed.


§1. Goal

Sequence the work introduced by this design pack into seven phases, each of which is:

  • Independently reversible (rollback story).
  • Independently observable (v_queue_health extension).
  • Independently gateable (dot_config flags).

Each phase opens a small additive surface; later phases consume the substrate laid down by earlier ones.

Phase order follows the §18.4 sub-design pack naming (DP1–DP7) where possible, and adds three integration phases (Phases 5–7).


§2. Phases at a glance

| Phase | Theme | Substrate change | Risk | DP source |
|---|---|---|---|---|
| Phase 1 | Minimal job substrate | job_queue + job_dead_letter + gated fns (no-op when disabled) | Low: additive | DP2 |
| Phase 2 | Retry / lease / DLQ wired | dot_config queue.retry.*, lease fn integration | Low: config-driven | DP3 |
| Phase 3 | Worker / NOTIFY / heartbeat | queue_heartbeat + stale-check + optional NOTIFY bridge | Medium: closes §15.5 silent gap | DP4, DP1 |
| Phase 4 | Trigger-in / Trigger-out generalisation | consumer_registry + widened iu_sql_event_route CHECK | Medium: CHECK change needs ratification | DP5 |
| Phase 5 | MARK/CUT queue pilot | First puller-enabled collection; staging.* event consumers | Medium: first live consumer flow | doc 10 |
| Phase 6 | MOT pilot | job_workflow + fn_mot_graph_emit + first MOT template | High: MOT spec dependency | doc 11 |
| Phase 7 | Customer / email / message pilot | First customer_message_* job kinds + channel adapter | High: PII boundary, external SMTP/IMAP | doc 12 |

§3. Phase 1 — Minimal job substrate

§3.1 Scope

  • Create job_queue, job_dead_letter, job_workflow tables (job_workflow may slip to Phase 6).
  • Create all gated fns (fn_job_*) with disable-flag short-circuit.
  • Create new dot_config keys: queue.job_substrate.enabled=false, queue.retry.* defaults, queue.lease_reaper.enabled=false.
  • Create views: v_job_queue_backlog, v_job_queue_in_progress, v_job_queue_dead_letter_open, v_dead_letter_all.

§3.2 Out-of-scope at Phase 1

  • No consumer_registry (Phase 4).
  • No queue_heartbeat (Phase 3).
  • No job_subscription (Phase 3 or later).
  • No event_outbox change.
  • No first job actually enqueued.

§3.3 Verification gate

  • Functions exist; calling enqueue with queue.job_substrate.enabled=false returns {refused: true, reason: 'job_substrate_disabled'}.
  • v_job_queue_backlog returns 0 rows.
  • D9 KB report shows new tables + fns count.
  • Regression: existing event_outbox queries unchanged.
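
The disable-flag short-circuit in this gate can be illustrated with a minimal in-memory sketch. It is not the PostgreSQL implementation: the name fn_job_enqueue, the row fields, and the dict and list standing in for dot_config and job_queue are assumptions made for illustration. Only the flag key and the refusal shape come from the text.

```python
from typing import Any

# Hypothetical in-memory stand-ins for dot_config and the job_queue table.
dot_config: dict[str, Any] = {"queue.job_substrate.enabled": False}
job_queue: list[dict[str, Any]] = []


def fn_job_enqueue(job_kind: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Enqueue a job, or refuse without side effects while the substrate is disabled."""
    if not dot_config.get("queue.job_substrate.enabled", False):
        # Short-circuit: no row is written while the flag is off.
        return {"refused": True, "reason": "job_substrate_disabled"}
    job = {
        "id": len(job_queue) + 1,
        "job_kind": job_kind,
        "payload": payload,
        "status": "pending",  # assumed status name
        "attempts": 0,
    }
    job_queue.append(job)
    return {"refused": False, "job_id": job["id"]}


def v_job_queue_backlog() -> list[dict[str, Any]]:
    """Rows still waiting to be claimed (stand-in for the view of the same name)."""
    return [job for job in job_queue if job["status"] == "pending"]
```

With the flag at its default of false, a call returns the refusal object and the backlog stays empty, which is the state the Phase 1 gate checks for.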

§3.4 Rollback

  • Drop new tables (no rows yet).
  • Remove new fns.
  • Remove new dot_config keys.

§4. Phase 2 — Retry / lease / DLQ wired

§4.1 Scope

  • Implement DP3 retry/backoff inside fn_job_fail_transient (no behavioural change yet; flag-gated).
  • Implement fn_job_lease_reaper as a callable fn.
  • Implement fn_job_dead_letter_replay.
  • Council ratifies concrete max_attempts numbers per dot_config.queue.retry.max_attempts.<job_kind>.

§4.2 Verification gate

  • Test a synthetic transient failure: lease expires → reaper rescues → row re-claimable.
  • Test permanent failure path: row in job_dead_letter.
  • Test replay: row back in job_queue with attempts=0.
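
The three checks above can be modelled in memory as a rough sketch. The function names fn_job_lease_reaper and fn_job_dead_letter_replay come from §4.1; everything else (the Job fields, the status names, the claim and fail_permanent helpers, the 30-second lease) is a hypothetical stand-in, and the real functions would operate on PostgreSQL rows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    id: int
    job_kind: str
    status: str = "pending"  # pending | leased | dead (assumed state names)
    attempts: int = 0
    lease_expires_at: Optional[float] = None


job_queue: dict[int, Job] = {}
job_dead_letter: dict[int, Job] = {}


def claim(job_id: int, now: float, lease_seconds: float = 30.0) -> bool:
    """Lease a pending job to a worker; returns False if it is not claimable."""
    job = job_queue.get(job_id)
    if job is None or job.status != "pending":
        return False
    job.status = "leased"
    job.attempts += 1
    job.lease_expires_at = now + lease_seconds
    return True


def fn_job_lease_reaper(now: float) -> int:
    """Return expired leases to 'pending' so another worker can claim them."""
    rescued = 0
    for job in job_queue.values():
        expired = job.lease_expires_at is not None and job.lease_expires_at <= now
        if job.status == "leased" and expired:
            job.status = "pending"
            job.lease_expires_at = None
            rescued += 1
    return rescued


def fail_permanent(job_id: int) -> None:
    """Hypothetical helper: move a job from job_queue to job_dead_letter."""
    job = job_queue.pop(job_id)
    job.status = "dead"
    job.lease_expires_at = None
    job_dead_letter[job.id] = job


def fn_job_dead_letter_replay(job_id: int) -> bool:
    """Move a dead-lettered job back to job_queue with attempts reset to 0."""
    job = job_dead_letter.pop(job_id, None)
    if job is None:
        return False
    job.status = "pending"
    job.attempts = 0
    job_queue[job.id] = job
    return True
```

Passing the clock in as `now` keeps the sketch deterministic, so a lease expiry can be simulated without sleeping.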

§5. Phase 3 — Heartbeat + NOTIFY bridge + DP1 cadence

§5.1 Scope

  • Create queue_heartbeat table + register/write/stale-check fns (DP4).
  • Create v_queue_health aggregating cursor + heartbeat + DLQ + backlog.
  • Register the existing iu_outbound_default worker as the first queue_heartbeat row; on every tick the worker writes the new queue_heartbeat.last_beat_at, which takes over from its existing last_run_at.
  • Register system/queue_worker_silent in event_type_registry (Council vocab gate).
  • Optionally: enable NOTIFY bridge (DP1 Layer 2) per queue.notify.bridge_enabled (default false).
  • Document external cadence: one named orchestrator owns ticks (Hermes, host cron, dedicated container).

§5.2 Critical milestone

Close the 4-day silent gap. By the time Phase 3 completes, iu_outbound_default should have a heartbeat fresher than stale_threshold (default 300s).

§5.3 Verification gate

  • v_queue_health row for iu_outbound_default shows status_hint='fresh'.
  • A simulated 10× cadence skip emits system/queue_worker_silent.
  • D31 watchdog / D43 red_zones receive the event.
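
The stale check behind this gate could look roughly like the sketch below. The 300-second default, the 'fresh' hint, the worker name and the event type come from the text. The 'stale' label, the Heartbeat fields other than last_beat_at, and the function names are assumptions, and the real check would read queue_heartbeat rows rather than Python objects.

```python
from dataclasses import dataclass

STALE_THRESHOLD_SECONDS = 300.0  # default stale_threshold from §5.2


@dataclass
class Heartbeat:
    worker: str
    last_beat_at: float  # seconds since epoch


def status_hint(hb: Heartbeat, now: float,
                threshold: float = STALE_THRESHOLD_SECONDS) -> str:
    """'fresh' while the last beat is newer than the threshold, else 'stale' (assumed label)."""
    return "fresh" if now - hb.last_beat_at < threshold else "stale"


def stale_check(heartbeats: list[Heartbeat], now: float) -> list[dict]:
    """Build one system/queue_worker_silent event per stale worker."""
    return [
        {
            "event_type": "system/queue_worker_silent",
            "worker": hb.worker,
            "silent_for_seconds": now - hb.last_beat_at,
        }
        for hb in heartbeats
        if status_hint(hb, now) == "stale"
    ]
```

A worker that last beat four days ago yields exactly one event, while a worker that beat 100 seconds ago yields none.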

§6. Phase 4 — Trigger-in / Trigger-out generalisation

§6.1 Scope

  • Widen iu_sql_event_route.target_event_domain CHECK to §6.1 9-domain set (Council vocab gate).
  • Create consumer_registry + fn_consumer_dispatch (DP5).
  • Migrate fn_iu_auto_instantiate_from_event into a consumer_registry row (callable from dispatch).
  • Create job_subscription table + executor binding (DP6).

§6.2 Verification gate

  • One existing event_type (e.g. iu.template.instance_auto_composed) is consumed via the new dispatch path and produces the same effect as the legacy path.
  • DLQ remains empty.
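
The equivalence this gate asks for can be pictured with a small registry-and-dispatch sketch. The names consumer_registry, fn_consumer_dispatch and fn_iu_auto_instantiate_from_event come from §6.1, but the event shape, the handler body and the registry layout (event_type mapped to a list of handlers) are assumptions; the legacy function's real behaviour is not modelled.

```python
from typing import Callable

Event = dict
Handler = Callable[[Event], dict]


def fn_iu_auto_instantiate_from_event(event: Event) -> dict:
    """Stand-in for the legacy consumer; returns a description of its effect."""
    return {"instantiated_from": event["event_type"], "subject": event.get("subject")}


# consumer_registry: event_type -> handlers registered for it (assumed layout).
consumer_registry: dict[str, list[Handler]] = {
    "iu.template.instance_auto_composed": [fn_iu_auto_instantiate_from_event],
}


def fn_consumer_dispatch(event: Event) -> list[dict]:
    """Run every consumer registered for the event; unknown event types are a no-op."""
    handlers = consumer_registry.get(event["event_type"], [])
    return [handler(event) for handler in handlers]
```

Because the dispatcher only looks up and calls the registered function, routing an event through it gives the same result as calling the legacy function directly, which is the property the gate verifies.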

§7. Phase 5 — MARK/CUT queue pilot

§7.1 Scope

  • Pick one collection (probably a low-volume governance one) as the puller-enabled pilot.
  • Add consumer_registry rows for staging.record_approved → cut, staging.record_consumed → verify_cut, staging.record_cleaned → (terminal).
  • Worker (Agent/Codex/DOT) claims the resulting job_queue rows and calls fn_iu_op_* aliases unchanged.
  • Operator continues to drive MARK and APPROVE; CUT and CLEANUP automate.
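
The routing listed above amounts to a small lookup from staging event to follow-up job kind. The sketch below assumes a plain dict in place of consumer_registry rows and uses None for the terminal event; the helper name next_job_kind is hypothetical.

```python
from typing import Optional

# Routing taken from the consumer_registry rows in §7.1;
# None marks the terminal event, which enqueues nothing.
STAGING_ROUTES: dict[str, Optional[str]] = {
    "staging.record_approved": "cut",
    "staging.record_consumed": "verify_cut",
    "staging.record_cleaned": None,
}


def next_job_kind(event_type: str) -> Optional[str]:
    """Return the job kind to enqueue for a staging event, or None if it is terminal."""
    if event_type not in STAGING_ROUTES:
        raise KeyError(f"no consumer registered for {event_type!r}")
    return STAGING_ROUTES[event_type]
```

Raising on an unregistered event type, rather than silently ignoring it, is a design choice of this sketch and not something the text prescribes.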

§7.2 Verification gate

  • One end-to-end MARK → CUT → CLEANUP cycle in the pilot collection runs without operator touching CUT.
  • All status transitions visible in v_queue_health + v_iu_staging_record.

§8. Phase 6 — MOT pilot

§8.1 Scope

  • Implement job_workflow (if deferred from Phase 1).
  • Implement fn_mot_graph_emit.
  • Ship one MOT template — likely the cutting pipeline expressed as a MOT graph (so MARK/CUT becomes a 6-step MOT workflow).
  • Add fn_job_workflow_refresh as a tick job.

§8.2 Verification gate

  • One MOT-generated workflow visible in v_job_workflow_health advancing through active → succeeded.
  • No MOT value appears in any executor field — enforcement intact.

§9. Phase 7 — Customer / email / message pilot

§9.1 Scope

  • Open the customer message store sub-design pack (separate macro).
  • Implement first channel adapter (likely IMAP→PG read + SMTP send via external_worker).
  • Register customer.* event_type values in registry.
  • Register customer_message_* job_kinds in consumer_registry.
  • Enforce approve-required flag (default on).

§9.2 Verification gate

  • One round-trip customer interaction (inbound → classify → draft → approve → send) processed through queue.
  • PII boundary preserved (no body in safe_payload).
  • DLQ empty for customer_message_send.
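
Two of these checks, the PII boundary and the approve-required flag, can be sketched as pure functions. The text only states that the body must not appear in safe_payload and that approval is required by default. The field names, the approved_by marker, and the helpers to_safe_payload and can_send are all hypothetical.

```python
# Fields kept out of safe_payload. The text names only the body;
# a real list would be set by the customer message store design.
SENSITIVE_FIELDS = frozenset({"body"})


def to_safe_payload(message: dict) -> dict:
    """Copy a message, dropping every field that must not leave the PII boundary."""
    return {key: value for key, value in message.items() if key not in SENSITIVE_FIELDS}


def can_send(draft: dict, approve_required: bool = True) -> bool:
    """Allow sending only once a draft is approved, unless the flag is turned off."""
    if not approve_required:
        return True
    return draft.get("approved_by") is not None
```

For example, a message with `id`, `channel` and `body` fields yields a safe payload with only `id` and `channel`, and an unapproved draft is held back while the flag stays on.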

§10. Cross-phase dependencies

Phase 1 (job_queue)
    │
    ├──→ Phase 2 (retry/lease) ──→ Phase 5 (cut pilot)
    │                                  ▲
    │                                  │
    └──→ Phase 3 (heartbeat) ──────────┴─→ Phase 4 (dispatch) ──→ Phase 6 (MOT) ──→ Phase 7 (customer)
                │                                                                       ▲
                └─→ closes §15.5 silent gap                                            (depends on §13.2)

Critical path order: Phase 1 → Phase 3 (heartbeat fix is high-value because §15.5 is already violated). Phases 2, 4 may parallelise after Phase 1.
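
One way to sanity-check a proposed phase order against the diagram above is a topological sort. The edges below are read directly off the arrows as drawn; the function names are hypothetical. The sketch uses Python's standard graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# phase -> set of phases it depends on, as drawn in the diagram.
DEPENDS_ON: dict[int, set[int]] = {
    1: set(),
    2: {1},
    3: {1},
    4: {3},
    5: {2, 3},
    6: {4},
    7: {6},
}


def phase_order() -> list[int]:
    """Return one valid execution order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(DEPENDS_ON).static_order())


def respects_dependencies(order: list[int]) -> bool:
    """Check that every phase appears after all of its prerequisites."""
    position = {phase: index for index, phase in enumerate(order)}
    return all(
        position[dep] < position[phase]
        for phase, deps in DEPENDS_ON.items()
        for dep in deps
    )
```

Under these edges, `respects_dependencies([1, 3, 2, 4, 5, 6, 7])` holds, which matches the stated critical path of Phase 1 followed by Phase 3.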


§11. Per-phase governance

Each phase requires:

  1. A separate open-goal prompt (prompt-mục-tiêu-mở) macro, per Open Goal Prompt Guide v1.2.
  2. Council Round 1 review (GPT) of the design.
  3. Optional Council Round 2 (Gemini).
  4. User approval.
  5. Migration ratification + dry-run + apply.
  6. Post-apply verification gate.
  7. Memory update + report upload.

No phase combines DDL ratification with DML or with a dot_config mutation in a single migration (Điều 35/44 discipline).


§12. Out-of-scope at every phase

  • pg_cron installation (§5.4 — requires its own amendment).
  • PG 18 upgrade (separate readiness macro).
  • Any change to Điều 45 v1.0 substance (requires amendment process per §18.1).
  • Any change to existing event_outbox, iu_core.iu_staging_* schema.
  • Qdrant / vector tier mutation.

§13. Open questions

| # | Question | Routed to |
|---|---|---|
| RM-Q1 | Phase order: should Phase 3 (heartbeat) come before Phase 1 (job_queue) to fix silent gap faster? | Council |
| RM-Q2 | Combine Phase 1+2 into one migration (smaller blast radius) or separate (cleaner gates)? | Council |
| RM-Q3 | Phase 5 pilot collection: which one? | Council + operator |
| RM-Q4 | Phase 6 MOT template: should it be the cut pipeline, or a new low-stakes workflow? | Council |
| RM-Q5 | Approve the 7-phase cadence (vs collapsing into fewer)? | Council |

Implementation roadmap. No phase executed. Authored 2026-05-26.

knowledge/dev/laws/dieu44-trien-khai/v0.6-dieu45-full-queue-orchestration-design-pack/14-implementation-roadmap.md