IU Core 500x — 03 Operator-runtime executor + durable proof
03 — operator_runtime.py: the governed runtime executor
1. What it is
cutter_agent/iu_core/operator_runtime.py — OperatorRuntime turns the
explain-only DOT one-command surface into a governed plan / apply /
verify executor. It drives a dot_commands.DotCommandPlan against a live
IU Core database under the gates, auditing every invocation.
Mirrors route_worker.py / pg_structure_store.py: no IO at import. The
runtime executes SQL through an INJECTED SqlExecutor callable, so it is
fully unit-testable with a fake; the live path wires the psql_executor
factory. No host / container / DB name is hardcoded — psql_executor takes
all four as arguments.
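The injection pattern described above can be sketched as follows. This is a minimal illustration, not the module's code: the `SqlExecutor` type alias, the `FakeExecutor` class, and the `MiniRuntime` stand-in are hypothetical, and calling `fn_iu_collection_healthcheck` with no arguments is an assumption.

```python
from typing import Callable, List, Sequence

# Assumed shape: an executor takes one SQL string and returns rows of columns.
SqlExecutor = Callable[[str], List[Sequence[str]]]


class FakeExecutor:
    """Records every statement instead of touching a database."""

    def __init__(self, rows=None):
        self.calls: List[str] = []
        self.rows = rows if rows is not None else [("ok",)]

    def __call__(self, sql: str) -> List[Sequence[str]]:
        self.calls.append(sql)
        return list(self.rows)


class MiniRuntime:
    """Hypothetical stand-in for OperatorRuntime: it only knows the callable."""

    def __init__(self, execute: SqlExecutor):
        self._execute = execute

    def run(self, statements: List[str]) -> List[List[Sequence[str]]]:
        return [self._execute(sql) for sql in statements]


fake = FakeExecutor()
runtime = MiniRuntime(fake)
results = runtime.run(["SELECT public.fn_iu_collection_healthcheck()"])
```

Because the runtime never constructs its own connection, a unit test can assert on `fake.calls` without any host, container, or database being present.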
2. Cannot bypass governed SQL — structurally
Two guarantees, not conventions:
- The runtime only ever executes a DotCommandPlan's statements, and every statement dot_commands emits is a call to a governed public.fn_*.
- _assert_governed re-checks before any apply: each statement must be a SELECT / DO that references public.fn_; a raw DML/DDL statement is REFUSED as unsafe (OperatorRuntimeError). Gate #9 is enforced.
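A rough sketch of the kind of re-check described above, assuming the rule is "starts with SELECT or DO, and mentions public.fn_". The regular expression and the error message are assumptions, and this naive form would not catch a second statement smuggled in after a semicolon, so the real check may well be stricter.

```python
import re
from typing import Iterable


class OperatorRuntimeError(Exception):
    """Raised when a plan contains a statement the runtime must not run."""


# Assumption: only these two leading verbs are allowed.
_ALLOWED_VERB = re.compile(r"^\s*(SELECT|DO)\b", re.IGNORECASE)
_GOVERNED_CALL = "public.fn_"


def assert_governed(statements: Iterable[str]) -> None:
    """Refuse the whole plan if any statement is not a governed call."""
    for index, sql in enumerate(statements):
        if not _ALLOWED_VERB.match(sql) or _GOVERNED_CALL not in sql:
            # Report the position only, never the statement text.
            raise OperatorRuntimeError(
                f"statement {index} is not a governed public.fn_ call"
            )


# Passes silently: a SELECT that calls a governed function.
assert_governed(["SELECT public.fn_iu_collection_healthcheck()"])
```

A raw statement such as `DELETE FROM some_table`, or a `SELECT 1` that calls no governed function, raises `OperatorRuntimeError` in this sketch.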
3. Gate model — registry-derived, fail-closed
required_gates(command, mode): plan / verify need none; apply needs
iu_core.operator_runtime_enabled plus every gate its target functions
require. GATE_BY_FUNCTION maps each governed function to its dot_config
gate (composer functions → composer_enabled, structure-op functions →
structure_ops_enabled) — the required set is DERIVED from the plan's
targets, never a hardcoded per-command list. A shut required gate REFUSES
before any plan SQL runs, and the refusal is itself audited.
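The derivation above might look roughly like this. It is a sketch under stated assumptions: the function names in `GATE_BY_FUNCTION` are hypothetical placeholders, the helper takes the plan's target functions directly rather than a command name, and `closed_gates` is an invented helper that treats a missing gate as shut.

```python
from typing import Dict, FrozenSet, Iterable, List

RUNTIME_GATE = "iu_core.operator_runtime_enabled"

# Hypothetical function names; only the gate names come from the text above.
GATE_BY_FUNCTION: Dict[str, str] = {
    "fn_example_compose": "composer_enabled",
    "fn_example_structure_op": "structure_ops_enabled",
}


def required_gates(target_functions: Iterable[str], mode: str) -> FrozenSet[str]:
    """Derive the gate set from the plan's targets; plan/verify need none."""
    if mode in ("plan", "verify"):
        return frozenset()
    gates = {RUNTIME_GATE}
    for fn in target_functions:
        gate = GATE_BY_FUNCTION.get(fn)
        if gate is not None:
            gates.add(gate)
    return frozenset(gates)


def closed_gates(required: Iterable[str], snapshot: Dict[str, bool]) -> List[str]:
    """Fail-closed: a gate that is absent from the snapshot counts as shut."""
    return sorted(g for g in required if snapshot.get(g) is not True)


needed = required_gates(["fn_example_structure_op"], "apply")
shut = closed_gates(needed, {RUNTIME_GATE: True})  # ["structure_ops_enabled"]
```

Any non-empty `shut` list would mean a refusal before plan SQL runs, with the closed gates recorded in the audit row.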
4. The three modes
- plan: resolve + live gate snapshot + a planned ledger row. NEVER executes the plan SQL. The explain surface with an audit trail.
- apply: fail-closed. Unknown command / unsafe plan / shut gate REFUSE before execution; an executor fault is captured as failed. A success is applied. Every outcome logs a ledger row.
- verify: a read-only verification, using fn_iu_collection_validate when a collection_id is supplied, else fn_iu_collection_healthcheck.
reversal(command, …) surfaces the rollback note + the inverse one-command
(e.g. dot_iu_delete_piece_soft → dot_iu_restore_piece) — never auto-runs.
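The fail-closed ordering of the apply mode can be illustrated with a small sketch. The function name `apply_plan`, its parameters, the ledger row shape, and the error-class strings are all assumptions for illustration; only the statuses (refused, failed, applied) and the one-row-per-outcome rule come from the description above.

```python
from typing import Callable, Dict, List


def apply_plan(
    command: str,
    catalog: Dict[str, List[str]],      # command -> plan statements (assumed shape)
    closed: List[str],                  # required gates that are currently shut
    is_governed: Callable[[str], bool],
    execute: Callable[[str], list],
    ledger: List[dict],
) -> str:
    """Refuse before execution, capture executor faults, always log one row."""
    status, error_class = "applied", None
    statements = catalog.get(command)
    if statements is None:
        status, error_class = "refused", "unknown_command"
    elif not all(is_governed(sql) for sql in statements):
        status, error_class = "refused", "unsafe_plan"
    elif closed:
        status, error_class = "refused", "gate_closed"
    else:
        try:
            for sql in statements:
                execute(sql)
        except Exception as exc:  # an executor fault becomes a "failed" row
            status, error_class = "failed", type(exc).__name__
    # Structural facts only; no parameter values are recorded.
    ledger.append({"command": command, "mode": "apply",
                   "status": status, "error_class": error_class})
    return status
```

The single `ledger.append` at the end is what makes every outcome, including a refusal, leave exactly one audit row in this sketch.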
5. No secret / value logging
params_digest is an md5 over the command + sorted params; the raw values
never enter the ledger SQL. evidence carries only structural facts (mode,
statement count, result-row count, closed gates, error class). Proven by
test_ledger_never_logs_raw_param_values.
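One way such a digest could be computed is sketched below. The canonical form (a JSON list of the command plus key-sorted parameter pairs) is an assumption; the source only states that the digest is an md5 over the command and sorted params.

```python
import hashlib
import json
from typing import Any, Dict


def params_digest(command: str, params: Dict[str, Any]) -> str:
    """Return an md5 hex digest of the command plus key-sorted parameters."""
    # Assumed canonical form: compact JSON, non-JSON values stringified.
    canonical = json.dumps(
        [command, sorted(params.items())],
        default=str,
        separators=(",", ":"),
    )
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


digest_a = params_digest("dot_iu_add_piece", {"title": "draft", "position": 3})
digest_b = params_digest("dot_iu_add_piece", {"position": 3, "title": "draft"})
# digest_a == digest_b: key order does not affect the digest.
```

Only the 32-character hex digest would reach the ledger row, so two invocations can be compared for identical parameters without the values themselves being stored.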
6. The psql_executor ssh word-split fix
The first cut passed the field separator glued to the flags as -tAF\t. Passed
through ssh, the remote shell word-split on the tab, so psql's -F consumed the
next argument, -v, as the separator. Fix: pass the remote psql command to ssh
as ONE string (so the quoted multi-char separator ~|~ survives the remote
shell) and send the SQL on stdin (binary-safe). Lesson: anything handed to ssh
as argv is joined and re-split by the remote shell, so quote separators into a
single string.
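The fix can be sketched as below. This is an illustration under assumptions, not the actual psql_executor: the four parameters (host, container, db, user), the use of `docker exec`, and the `ON_ERROR_STOP` variable are guesses; the single remote string, the ~|~ separator, and SQL on stdin come from the text above.

```python
import shlex
import subprocess
from typing import List

FIELD_SEP = "~|~"


def build_ssh_argv(host: str, container: str, db: str, user: str) -> List[str]:
    """Build ssh argv where the whole remote command is ONE quoted string."""
    remote = " ".join([
        "docker", "exec", "-i", shlex.quote(container),  # assumed container runtime
        "psql", "-U", shlex.quote(user), "-d", shlex.quote(db),
        "-v", "ON_ERROR_STOP=1",
        "-tA", "-F", shlex.quote(FIELD_SEP),  # quoted so the remote shell keeps it whole
    ])
    return ["ssh", host, remote]


def parse_rows(stdout: str) -> List[List[str]]:
    """Split psql's unaligned, tuples-only output on the multi-char separator."""
    return [line.split(FIELD_SEP) for line in stdout.splitlines() if line]


def run_sql(argv: List[str], sql: str) -> List[List[str]]:
    """Send the SQL on stdin so no SQL text passes through shell quoting."""
    proc = subprocess.run(argv, input=sql.encode("utf-8"),
                          capture_output=True, check=True)
    return parse_rows(proc.stdout.decode("utf-8"))


argv = build_ssh_argv("db-host", "pg", "iu", "operator")
```

Splitting `argv[2]` with `shlex.split` the way a POSIX shell would shows `-F` followed by the intact `~|~` token, which is the property the first cut lost.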
7. Durable bounded operator-runtime proof
Run against production through the real OperatorRuntime + psql_executor:
| step | call | result | ledger |
|---|---|---|---|
| P1 | plan(dot_iu_healthcheck) | planned (no execution) | 1 row |
| P2 | apply(dot_iu_healthcheck) (gate open) | applied | 1 row |
| P3 | verify(dot_iu_healthcheck) | verified | 1 row |
| P4 | apply(dot_iu_add_piece) (gates shut) | refused | 1 row |
The operator_runtime_enabled gate was opened only for P1–P3, then closed —
end-state false (inert). Durable footprint: 4 ledger rows, reversible
by DELETE … WHERE actor='runtime_540x_proof'; zero IU / collection /
event / route mutation. v_dot_iu_command_registry afterwards:
dot_iu_healthcheck 3 runs / 1 applied / last_status=verified,
dot_iu_add_piece 1 run / refused. v_dot_iu_command_run_health:
4 total / 2 apply / 1 applied / 1 refused / 0 failed / 2 distinct /
catalog 17.
A first buggy run (the ssh word-split) left 8 ledger rows; they were deleted and the clean proof re-run, so the durable ledger holds exactly the 4 clean rows.
8. Tests
test_iu_core_540x_operator_runtime.py — 53 tests: migration 018 shape +
rollback, runtime/280 catalog ↔ DOT_COMMANDS lock, sandbox 130/140 shape,
runtime/110 at 113, the OperatorRuntime plan/apply/verify/refuse paths,
gate derivation, the unsafe-plan / unknown-command refusals, the
no-value-logging contract, the autocut bridge, text-as-code roundtrip.