KB-9553 rev 2

IU Core 500x — 03 Operator-runtime executor + durable proof

dieu44 · iu-core · mvp · 500x · operator-runtime · operator-runtime-py · durable-proof · v0.6 · 2026-05-22

03 — operator_runtime.py: the governed runtime executor

1. What it is

cutter_agent/iu_core/operator_runtime.py: OperatorRuntime turns the explain-only DOT one-command surface into a governed plan / apply / verify executor. It drives a dot_commands.DotCommandPlan against a live IU Core database, subject to the gates, and audits every invocation.

Mirrors route_worker.py / pg_structure_store.py: no IO at import. The runtime executes SQL through an INJECTED SqlExecutor callable, so it is fully unit-testable with a fake; the live path wires the psql_executor factory. No host / container / DB name is hardcoded — psql_executor takes all four as arguments.
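The injection pattern described above can be sketched as follows. This is a minimal illustration, not the module's code: the SqlExecutor signature (one SQL string in, a list of rows out), FakeExecutor, and MiniRuntime are all assumed names and shapes, and calling fn_iu_collection_healthcheck with no arguments is likewise an assumption.

```python
from typing import Callable, List, Sequence

# Assumed shape: an executor takes one SQL string and returns result rows.
SqlExecutor = Callable[[str], List[Sequence[str]]]


class FakeExecutor:
    """Test double: records the SQL it receives and returns canned rows."""

    def __init__(self, rows: List[Sequence[str]]) -> None:
        self.rows = rows
        self.seen: List[str] = []

    def __call__(self, sql: str) -> List[Sequence[str]]:
        self.seen.append(sql)
        return self.rows


class MiniRuntime:
    """Hypothetical stand-in for OperatorRuntime: it only talks to the injected callable."""

    def __init__(self, execute: SqlExecutor) -> None:
        self._execute = execute

    def run(self, statements: List[str]) -> int:
        # Total number of result rows across all statements.
        return sum(len(self._execute(s)) for s in statements)


fake = FakeExecutor(rows=[("ok",)])
runtime = MiniRuntime(fake)
row_count = runtime.run(["SELECT public.fn_iu_collection_healthcheck();"])
```

Because nothing touches a database at import or construction time, a unit test can assert on fake.seen to check exactly which statements the runtime tried to run.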

2. Cannot bypass governed SQL — structurally

Two guarantees, not conventions:

  1. The runtime only ever executes a DotCommandPlan's statements, and every statement dot_commands emits is a call to a governed public.fn_*.
  2. _assert_governed re-checks before any apply: each statement must be a SELECT / DO that references public.fn_ — a raw DML/DDL statement is REFUSED as unsafe (OperatorRuntimeError). Gate #9 is enforced.
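A check in the spirit of the second guarantee might look like the sketch below. The regular expressions are assumptions about what "a SELECT / DO that references public.fn_" means in practice; the real _assert_governed may be stricter. The error message carries only the statement index, so no SQL text leaks into it.

```python
import re
from typing import Iterable


class OperatorRuntimeError(RuntimeError):
    """Raised when a plan is refused (name taken from the text above)."""


# Assumption: a governed statement starts with SELECT or DO and mentions public.fn_<name>.
_LEADING_VERB = re.compile(r"^\s*(select|do)\b", re.IGNORECASE)
_GOVERNED_CALL = re.compile(r"\bpublic\.fn_\w+", re.IGNORECASE)


def assert_governed(statements: Iterable[str]) -> None:
    """Refuse the whole plan if any single statement is not a governed call."""
    for index, stmt in enumerate(statements):
        if not _LEADING_VERB.match(stmt) or not _GOVERNED_CALL.search(stmt):
            raise OperatorRuntimeError(f"statement {index} is not a governed call")


# A governed call passes silently; raw DML is refused.
assert_governed(["SELECT public.fn_iu_collection_healthcheck();"])
try:
    assert_governed(["DELETE FROM some_table;"])
    refused = False
except OperatorRuntimeError:
    refused = True
```

This is a pattern check, not a SQL parser: it would not notice a second statement appended after a governed call in the same string, so it only makes sense as a re-check on top of a plan builder that emits governed calls in the first place.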

3. Gate model — registry-derived, fail-closed

required_gates(command, mode): plan / verify need none; apply needs iu_core.operator_runtime_enabled plus every gate its target functions require. GATE_BY_FUNCTION maps each governed function to its dot_config gate (composer functions → composer_enabled, structure-op functions → structure_ops_enabled) — the required set is DERIVED from the plan's targets, never a hardcoded per-command list. A shut required gate REFUSES before any plan SQL runs, and the refusal is itself audited.
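The derivation can be illustrated with a small sketch. It simplifies the real signature (it takes the plan's target functions directly instead of a command), and the function names in GATE_BY_FUNCTION are hypothetical; only the gate names come from the text above. Mapping a read-only function to None and raising on an unmapped function are assumptions about how "fail-closed" might be realised.

```python
from typing import Dict, FrozenSet, Iterable, List, Mapping, Optional

RUNTIME_GATE = "iu_core.operator_runtime_enabled"

# Hypothetical entries: function names are illustrative only.
GATE_BY_FUNCTION: Dict[str, Optional[str]] = {
    "fn_example_compose": "composer_enabled",
    "fn_example_add_piece": "structure_ops_enabled",
    "fn_iu_collection_healthcheck": None,  # assumed: needs no gate of its own
}


def required_gates(target_functions: Iterable[str], mode: str) -> FrozenSet[str]:
    """Derive the gate set from the plan's targets, not from a per-command list."""
    if mode in ("plan", "verify"):
        return frozenset()
    if mode != "apply":
        raise ValueError(f"unknown mode: {mode}")
    gates = {RUNTIME_GATE}
    for fn in target_functions:
        gate = GATE_BY_FUNCTION[fn]  # unmapped function -> KeyError, never a silent pass
        if gate is not None:
            gates.add(gate)
    return frozenset(gates)


def shut_gates(required: Iterable[str], snapshot: Mapping[str, bool]) -> List[str]:
    """A gate missing from the live snapshot counts as shut."""
    return sorted(g for g in required if not snapshot.get(g, False))


needed = required_gates(["fn_example_add_piece"], "apply")
closed = shut_gates(needed, {RUNTIME_GATE: True})
```

Here closed is non-empty, so an apply would be refused before any plan SQL runs.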

4. The three modes

  • plan — resolve + live gate snapshot + a planned ledger row. NEVER executes the plan SQL. The explain surface with an audit trail.
  • apply — fail-closed: unknown command / unsafe plan / shut gate REFUSE before execution; an executor fault is captured as failed. A success is applied. Every outcome logs a ledger row.
  • verify — a read-only verification: fn_iu_collection_validate when a collection_id is supplied, else fn_iu_collection_healthcheck.
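The apply outcomes in the list above can be sketched as a single function that always writes exactly one ledger row. Everything here is illustrative: apply_plan, its parameters, and the ledger row shape are assumed, the unsafe-plan check is omitted for brevity, and the ledger is an in-memory list rather than a table.

```python
from typing import Callable, Dict, List, Optional


def apply_plan(
    statements: Optional[List[str]],   # None stands for an unknown command
    shut: List[str],                   # required gates that are currently shut
    execute: Callable[[str], list],    # injected SQL executor
    ledger: List[Dict[str, str]],      # stand-in for the ledger table
) -> str:
    """Return the outcome status; every path appends one ledger row."""
    if statements is None:
        status, detail = "refused", "unknown_command"
    elif shut:
        status, detail = "refused", "gate_shut"
    else:
        try:
            for stmt in statements:
                execute(stmt)
            status, detail = "applied", ""
        except Exception as exc:
            # Only the error class is recorded, not the message.
            status, detail = "failed", type(exc).__name__
    ledger.append({"mode": "apply", "status": status, "detail": detail})
    return status


ledger: List[Dict[str, str]] = []
calls: List[str] = []


def ok(sql: str) -> list:
    calls.append(sql)
    return []


def boom(sql: str) -> list:
    raise RuntimeError("executor fault")


plan = ["SELECT public.fn_iu_collection_healthcheck();"]
outcomes = [
    apply_plan(plan, [], ok, ledger),
    apply_plan(plan, ["iu_core.operator_runtime_enabled"], ok, ledger),
    apply_plan(plan, [], boom, ledger),
    apply_plan(None, [], ok, ledger),
]
```

The refusals return before the executor is ever called, which is the fail-closed property: only the first call in the usage lines reaches ok.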

reversal(command, …) surfaces the rollback note and the inverse one-command (e.g. dot_iu_delete_piece_soft → dot_iu_restore_piece); it never auto-runs.

5. No secret / value logging

params_digest is an md5 over the command + sorted params; the raw values never enter the ledger SQL. evidence carries only structural facts (mode, statement count, result-row count, closed gates, error class). Proven by test_ledger_never_logs_raw_param_values.
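A digest of this kind could be computed as in the sketch below. The canonical form (JSON with sorted keys) is an assumption; the text only says the digest is an md5 over the command plus sorted params.

```python
import hashlib
import json
from typing import Any, Mapping


def params_digest(command: str, params: Mapping[str, Any]) -> str:
    """Fingerprint a command invocation without storing its raw values."""
    # sort_keys makes the digest independent of parameter order;
    # default=str covers values json cannot serialise natively (an assumption).
    canonical = json.dumps(
        {"command": command, "params": dict(params)},
        sort_keys=True,
        default=str,
    )
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


a = params_digest("dot_iu_add_piece", {"collection_id": 7, "title": "secret title"})
b = params_digest("dot_iu_add_piece", {"title": "secret title", "collection_id": 7})
```

Both calls produce the same 32-character hex string, and only that string would be written to the ledger.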

6. The psql_executor ssh word-split fix

The first cut glued the tab field separator onto the flags as -tAF\t. Passed through ssh, the remote shell word-split on the tab, so psql's -F lost its value and consumed the next argument, -v, as the separator. Fix: pass the remote psql command to ssh as ONE string (so the quoted multi-char separator ~|~ survives the remote shell) and send the SQL on stdin (binary-safe). Lesson: anything handed to ssh as argv is joined and re-split by the remote shell, so quote separators into a single string.
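The fix can be sketched as follows. This is not the real psql_executor: the docker exec wrapper, the user argument, and the function names are assumptions made for illustration. The psql flags used (-U, -d, -v, -t, -A, -F) and shlex.quote are standard.

```python
import shlex
import subprocess
from typing import List

FIELD_SEP = "~|~"  # multi-character separator, as described above


def build_ssh_argv(host: str, container: str, db: str, user: str) -> List[str]:
    """Build an ssh argv whose remote command is ONE pre-quoted string."""
    remote = " ".join([
        "docker", "exec", "-i", shlex.quote(container),  # assumed container runtime
        "psql", "-U", shlex.quote(user), "-d", shlex.quote(db),
        "-v", "ON_ERROR_STOP=1",
        "-tA", "-F", shlex.quote(FIELD_SEP),  # the quoting survives the remote shell
    ])
    return ["ssh", host, remote]


def run_sql(argv: List[str], sql: str) -> List[List[str]]:
    """Send the SQL on stdin and split each output line on the separator."""
    done = subprocess.run(argv, input=sql.encode("utf-8"),
                          capture_output=True, check=True)
    lines = done.stdout.decode("utf-8").splitlines()
    return [line.split(FIELD_SEP) for line in lines if line]


argv = build_ssh_argv("db-host", "pg-container", "iu_core", "operator")
remote_tokens = shlex.split(argv[2])  # what the remote shell would hand to docker
```

Re-splitting the remote string with shlex.split approximates what the remote shell does, and shows the separator arriving as a single token directly after -F.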

7. Durable bounded operator-runtime proof

Run against production through the real OperatorRuntime + psql_executor:

| step | call | result | ledger |
|---|---|---|---|
| P1 | plan(dot_iu_healthcheck) | planned (no execution) | 1 row |
| P2 | apply(dot_iu_healthcheck) (gate open) | applied | 1 row |
| P3 | verify(dot_iu_healthcheck) | verified | 1 row |
| P4 | apply(dot_iu_add_piece) (gates shut) | refused | 1 row |

The operator_runtime_enabled gate was opened only for P1–P3, then closed; its end state is false (inert). Durable footprint: 4 ledger rows, reversible by DELETE … WHERE actor='runtime_540x_proof', with zero IU / collection / event / route mutation. The views afterwards:

  • v_dot_iu_command_registry: dot_iu_healthcheck 3 runs / 1 applied / last_status=verified; dot_iu_add_piece 1 run / refused.
  • v_dot_iu_command_run_health: 4 total / 2 apply / 1 applied / 1 refused / 0 failed / 2 distinct / catalog 17.

A first buggy run (the ssh word-split) left 8 ledger rows; they were deleted and the clean proof re-run, so the durable ledger holds exactly the 4 clean rows.

8. Tests

test_iu_core_540x_operator_runtime.py — 53 tests: migration 018 shape + rollback, runtime/280 catalog ↔ DOT_COMMANDS lock, sandbox 130/140 shape, runtime/110 at 113, the OperatorRuntime plan/apply/verify/refuse paths, gate derivation, the unsafe-plan / unknown-command refusals, the no-value-logging contract, the autocut bridge, text-as-code roundtrip.

knowledge/dev/laws/dieu44-trien-khai/v0.6-iu-core-500x-integrated-autocut-operator-runtime-open-goal/03-operator-runtime-executor.md