03 — Current Cutting Flow ↔ Queue Mapping
Date: 2026-05-26 | Scope: Can the existing MARK→CUT pipeline be represented in the live queue substrate? What is missing for a queue lens?
§1. The current cutting flow (live)
The Information Unit cutting workflow has 6 named operations + cleanup. Each is implemented as an operator alias function wrapping creator primitives:
| Step | Operator alias | Underlying primitive | Effect | Idempotency |
|---|---|---|---|---|
| 1. Pull source | (none; file read) | filesystem | source_kind + source_ref + content_hash recorded later | implicit |
| 2. MARK | fn_iu_op_mark_file | fn_iu_mark_create_manifest | writes 1 row to iu_staging_record (lifecycle pending_review, expires_at = +15d, digest = md5(source_hash‖source_bytes‖pieces::text)) + N rows to iu_staging_payload | idempotency_key + content_hash |
| 3. VERIFY-MARK | fn_iu_op_verify_mark | fn_iu_verify_mark | dry-run / apply both produce verdict=approved with axis checks A/B/C; apply sets approved_at/approved_by/approval_doc_id and moves lifecycle_status → approved | run_id linked |
| 4. APPROVE (decision) | (implicit in verify-mark apply) | n/a | gate: approved_at IS NOT NULL AND approval_doc_id IS NOT NULL | n/a |
| 5. CUT | fn_iu_op_cut | fn_iu_cut_from_manifest (G1–G7 guards, SELECT FOR UPDATE) | materializes pieces into information_unit rows + collection memberships; UPDATEs sort_order/section_type/doc_code/section_code; transitions staging lifecycle_status → consumed | G5 32-hex format check + G6 source_changed (re-computes source_hash) |
| 6. VERIFY-CUT | fn_iu_op_verify_cut | fn_iu_verify_cut_result | three-axis health: A=sort_order, B=section_type, C=parent_or_container_ref; verdict ∈ {verified, problems, no_vector_ok} | run_id linked |
| 7. CLEANUP (15d) | fn_iu_op_cleanup_dry_run (then real) | fn_iu_staging_cleanup | 3-pass: expire → clean@15d → archive@30d; gated by dot_config.iu_core.retention_enabled (false) | n/a |
Each step writes a dot_iu_command_run row (run_id, run_mode ∈ {plan,apply,verify}, run_status, params_digest, gate_snapshot, evidence). Each step emits a staging.* event into event_outbox (event_type_registry rows 23–27: staging.record_created/approved/consumed/rejected/cleaned).
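The step 4 gate and the G5 format guard are both simple predicates, so they can be illustrated outside PG. The following is a minimal sketch, assuming a staging record is modeled as a plain dict; the helper names approval_gate_open and is_32_hex are hypothetical and are not part of the live function set. The real guards run inside fn_iu_cut_from_manifest and may check more than is shown here.

```python
import re
from typing import Any, Mapping

# Hypothetical helper mirroring the step 4 gate:
# approved_at IS NOT NULL AND approval_doc_id IS NOT NULL.
def approval_gate_open(record: Mapping[str, Any]) -> bool:
    return (
        record.get("approved_at") is not None
        and record.get("approval_doc_id") is not None
    )

# Assumption: the digest is lowercase hex, as PostgreSQL md5() produces.
_HEX32 = re.compile(r"[0-9a-f]{32}")

# Hypothetical helper in the spirit of guard G5 (32-hex format check).
def is_32_hex(digest: str) -> bool:
    return _HEX32.fullmatch(digest) is not None
```

A record with approved_at set but approval_doc_id missing keeps the gate closed, which matches the AND in the gate condition.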
§2. Mapping the flow to "queue states"
A queue lens views each step as a transition between states with a next-action queue tail.
| Cutting state | Substrate location | "Queue tail" (what could pull next) | Currently pulled by |
|---|---|---|---|
| source_exists | filesystem | not modeled in PG | n/a |
| staged.pending_review | iu_staging_record.lifecycle_status='pending_review' | "verify the next pending_review manifest" | operator manually (no scheduler) |
| staged.approved | iu_staging_record.lifecycle_status='approved' | "cut the next approved manifest" | operator manually |
| staged.consumed | iu_staging_record.lifecycle_status='consumed', consumed_by_run_id set | "verify-cut the next consumed run" | operator manually |
| cut.verified | dot_iu_command_run.run_status='verified' (for verify-cut) | "schedule cleanup at +15d" | operator manually |
| staged.expired/cleaned | lifecycle_status='expired'/'cleaned' | terminal | retention sweep |
Observation: every transition has a clear PG-native state to read from, but no automated puller. The flow is fully operator-driven. A "queue" reading would add the puller — either pg_cron, an external worker daemon, or both.
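To make the "missing puller" concrete, here is a minimal sketch of one puller tick over the state table. It assumes staging records are available as dicts with id and lifecycle_status keys; the functions next_action and pull_once are hypothetical, and only the op alias names come from §1. The sketch plans calls and does not execute them.

```python
from typing import Optional

# lifecycle_status -> op alias a puller would call next (aliases from §1).
# Cleanup is omitted because it is time-based rather than state-based.
NEXT_ACTION = {
    "pending_review": "fn_iu_op_verify_mark",
    "approved": "fn_iu_op_cut",
    "consumed": "fn_iu_op_verify_cut",
}

def next_action(lifecycle_status: str) -> Optional[str]:
    """Return the op alias to call next, or None for terminal states."""
    return NEXT_ACTION.get(lifecycle_status)

def pull_once(records: list[dict]) -> list[tuple[int, str]]:
    """One hypothetical puller tick: plan (record id, op alias) pairs in id order."""
    plan = []
    for rec in sorted(records, key=lambda r: r["id"]):
        action = next_action(rec["lifecycle_status"])
        if action is not None:
            plan.append((rec["id"], action))
    return plan
```

Whether such a tick is driven by pg_cron or by an external daemon does not change the mapping; only the invocation path differs.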
§3. Event-side mapping
The 5 staging.* event types in event_type_registry are all delivery_lane=delayed, meaning they are intended for batched / debounced delivery via worker:
| Event type | stream | delivery_lane |
|---|---|---|
| staging.record_created | birth | delayed |
| staging.record_approved | update | delayed |
| staging.record_consumed | update | delayed |
| staging.record_rejected | update | delayed |
| staging.record_cleaned | update | delayed |
The live event_outbox shows 0 staging.* events to date (the top types are all iu/* and system/issue_opened). Either the hook from staging-mutator to event emission was never wired in live, or it has not fired because the staging volume in this period comes only from the test/proof macros.
In the live iu_outbound_default worker cursor, event_domain='iu' (not staging). So even if staging events were emitted, the current worker is not configured to drain them.
Implication: the cutting flow can already write events into the universal substrate, but the read-side (worker → consumer) is currently scoped to iu events, not staging or piece. A queue design needs to widen worker domain scope or add a staging-domain worker.
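The scoping effect can be shown with a small in-memory model. This is a sketch under assumptions: WorkerCursor, its last_event_id high-water mark, and the drain function are hypothetical stand-ins, and the real iu_route_worker_cursor columns may differ. Only the worker names and event_domain values come from this survey.

```python
from dataclasses import dataclass

@dataclass
class WorkerCursor:
    worker_name: str
    event_domain: str
    last_event_id: int = 0  # hypothetical high-water mark

def drain(cursor: WorkerCursor, outbox: list[dict]) -> list[dict]:
    """Return the events this worker would pick up and advance its cursor.

    Events outside cursor.event_domain are never returned, which is why a
    cursor scoped to 'iu' leaves staging.* events undrained.
    """
    picked = [
        ev
        for ev in sorted(outbox, key=lambda e: e["id"])
        if ev["id"] > cursor.last_event_id
        and ev["event_domain"] == cursor.event_domain
    ]
    if picked:
        cursor.last_event_id = picked[-1]["id"]
    return picked
```

In this model, adding a second cursor with event_domain='staging' is enough for staging events to be picked up, without touching the iu-scoped worker.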
§4. Can current substrate represent the queue?
| Queue capability | Cutting flow needs | Substrate fit |
|---|---|---|
| Persisting "to do" items | iu_staging_record already does this (lifecycle_status is the work-queue selector) | ✅ |
| Idempotency on enqueue | idempotency_key + content_hash on staging; params_digest on runs | ✅ |
| Retry semantics | dot_iu_command_run.run_status='failed' + re-attempt by re-calling op alias | ⚠️ no auto-retry, no max_attempts |
| Dead-letter | iu_route_dead_letter exists but the cutting flow does not currently target it | ⚠️ would need to widen worker scope |
| Observability cursor | iu_route_worker_cursor schema is reusable | ⚠️ no cursor row exists for cutting domains |
| Scheduling / cadence | none; operator-driven | ❌ no scheduler |
| Lease serialization | dot_iu_runtime_lease exists | ✅ |
| Lifecycle audit | iu_lifecycle_log (for IU lifecycle) + dot_iu_command_run (for run ledger) | ✅ |
| No-vector for transient payload | iu_staging_record.vector_excluded=true + 4-layer NVSZ guarantee | ✅ |
Verdict: the cutting flow can be represented inside the live substrate with minimal additions:
- A worker cursor row for (worker_name='iu_cutting_default', event_domain='staging') so that staging.* events get drained.
- A consumer function (analogous to fn_iu_auto_instantiate_from_event) that watches iu_staging_record.lifecycle_status transitions and emits/routes events.
- An invocation path (external trigger or pg_cron) to actually drive the worker tick.
No new tables needed. The current substrate is necessary and (almost) sufficient for the cutting flow once given a puller.
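The retry and dead-letter gaps flagged in the table could be closed in the puller itself. Below is a minimal sketch of bounded retry, assuming an op alias call is wrapped as a Python callable. The max_attempts limit, the run_with_retry function, and the in-memory dead_letter list (standing in for iu_route_dead_letter) are all hypothetical; nothing like them exists in the live substrate today.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # assumption: the live substrate defines no such limit

def run_with_retry(
    op: Callable[[], None],
    dead_letter: list[str],
    label: str,
    max_attempts: int = MAX_ATTEMPTS,
) -> int:
    """Call op until it succeeds or max_attempts is reached.

    Returns the number of attempts used. On exhaustion, label is appended
    to dead_letter, standing in for a row in iu_route_dead_letter.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            op()
            return attempt
        except Exception:
            if attempt == max_attempts:
                dead_letter.append(label)
    return max_attempts
```

Because the op aliases are already idempotent (idempotency_key, content_hash, params_digest), re-calling them in a loop like this should be safe in principle, though that would need to be confirmed per alias.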
§5. If we widen to "system-wide queue" (not just cutting)
The cutting flow is a single use case. The mission lists 11 others (tasks, messages, trigger-in/out, worker jobs, Agent/Hermes jobs, DOT jobs, review/approval, vector sync, staging cleanup, MOT-generated workflows, IU two-way trigger automation).
Mapping each:
| Use case | Best fit on current substrate | Gap |
|---|---|---|
| Tasks (per-actor todos) | event_outbox stream='task' | no task lifecycle (claim/start/done); subscription-only |
| Messages (board-style) | event_outbox stream='comment' / 'review' | exists for IU; not generalized to other actors |
| Trigger-IN (PG ⇒ event) | iu_sql_event_route | only 1 row, dry-run; vocab target_event_domain limited to {iu, iu_sql}, needs widening |
| Trigger-OUT (event ⇒ action) | fn_iu_auto_instantiate_from_event is the prototype | bespoke per consumer; no generic dispatch table |
| Worker jobs (long-running) | none | event_pending is for events, not arbitrary worker payloads |
| Agent/Hermes jobs | none in PG today; runs are tracked externally | needs a job_outbox or piggyback domain |
| DOT jobs | dot_iu_command_run is a ledger, not a queue | DOT runs are pulled by external operator/agent |
| Review/approval jobs | event_outbox stream='review' | exists for IU drafts; not generalized |
| Vector sync jobs | LISTEN/NOTIFY kb_vector_sync + external daemon | independent pipeline; not unified |
| Staging cleanup jobs | fn_iu_staging_cleanup + retention_enabled flag | manual / external invocation; no scheduler |
| MOT-generated workflows | none | depends on MOT spec; out of scope here |
| IU two-way trigger automation | iu_sql_event_route + consumer fns | substrate ready, very few routes registered |
Pattern: for event-shaped work, the substrate covers it well. For long-running job-shaped work (compute, build, ingest, scheduled sweep), there is no PG-native substrate today — those flow through external workers (Hermes, Codex, Directus, cron on the host). A future queue law must decide whether to:
- (a) widen event_outbox semantics to carry job-style payloads with a job-style domain (event_domain='job'?), or
- (b) introduce a parallel job_outbox table with claim/start/done lifecycle, or
- (c) stay event-only and require all external workers to register at least an event trail.
Each has tradeoffs. This survey makes no recommendation; the choice belongs in the design pack.
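For readers unfamiliar with the claim/start/done lifecycle named in option (b), the following sketch shows the shape of it as a forward-only state machine. It is illustrative only and not a proposal: the JobState names (including the initial queued state) and the advance function are assumptions, and no job_outbox table exists today.

```python
from enum import Enum

class JobState(Enum):
    QUEUED = "queued"    # assumed initial state before any worker claims the job
    CLAIMED = "claimed"
    STARTED = "started"
    DONE = "done"

# Allowed forward transitions for a row in the hypothetical job_outbox.
_ALLOWED = {
    JobState.QUEUED: JobState.CLAIMED,
    JobState.CLAIMED: JobState.STARTED,
    JobState.STARTED: JobState.DONE,
}

def advance(state: JobState) -> JobState:
    """Move a job one step along claim -> start -> done; DONE is terminal."""
    if state not in _ALLOWED:
        raise ValueError(f"{state.value} is terminal")
    return _ALLOWED[state]
```

This is the lifecycle that event_outbox lacks under option (c): an event is emitted and drained, but it is never claimed or held by a worker.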
§6. Summary
- Cutting flow fits the existing substrate with minor additive changes (worker cursor + consumer + puller).
- System-wide queue ambitions (especially long-running jobs, Agent jobs, scheduled sweeps) require either a parallel job substrate or a deliberate decision to keep all work event-shaped.
- The "queue" terminology in the mission needs disambiguation: the current system has a governance-event store with worker drain, not a generic job queue. These are similar but not identical patterns.