Automation Orchestrator Design · 02 End-to-End State Machine
doc 2 of 7 · 2026-05-20 · design-only macro
phase : G2 (state machine + persistence + resume)
outcome : G2 PASS: 14 states, 3 sovereign-gate transitions, single-file JSON sidecar persistence
production_mutation : NONE
1. State enumeration
states:
- pending # created; nothing run
- source_pinned # manifest digest + region sha captured
- marked # MARK phase done; region rehash matched
- cutplan_ok # cut-plan rebuild reproduces writer_digest
- pre_write_backup_taken # GPG backup persisted + sidecar sha pinned
- grants_probed # GRANT/REVOKE state surveyed; no apply
- awaiting_cut_authorization # SOVEREIGN GATE 1 (paused)
- cut_leg_a_committed # IU + UV + anchor rows written
- structural_verified # 11-bool probe PASS, cardinality matches
- leg_b_recorded # cutter_governance rows persisted
- write_verified # verify_result + verifier signature persisted
- awaiting_lifecycle_authorization # SOVEREIGN GATE 2 (paused)
- lifecycle_enacted # fn_iu_enact called for every IU
- closeout_reported # final KB report uploaded
terminal_states:
- failed_<gate> # any internal gate raised → terminal
- stopped_<reason> # sovereign STOP route → resumable
- voided_<reason> # sovereign-voided run (idempotency cleared)
2. Transition table
Each non-terminal state has exactly one forward transition. Naming:
internal_gate(IG_*) auto-passes (with KB receipt); sovereign_gate(SG_*)
pauses and waits.
transitions:
pending → source_pinned : IG_source_pin
source_pinned → marked : IG_mark
marked → cutplan_ok : IG_cutplan
cutplan_ok → pre_write_backup_taken : IG_backup
pre_write_backup_taken → grants_probed : IG_grant_probe
grants_probed → awaiting_cut_authorization : SG_cut_authz_request
awaiting_cut_authorization → cut_leg_a_committed : SG_cut_authz_received
+ IG_cut_leg_a_execute
cut_leg_a_committed → structural_verified : IG_structural_verify
structural_verified → leg_b_recorded : IG_leg_b_record
leg_b_recorded → write_verified : IG_write_verify
write_verified → awaiting_lifecycle_authorization : SG_lifecycle_authz_request
awaiting_lifecycle_authorization → lifecycle_enacted : SG_lifecycle_authz_received
+ IG_lifecycle_enact_execute
lifecycle_enacted → closeout_reported : IG_closeout
closeout_reported → (terminal: success) : —
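The forward transitions above can be held as a plain lookup table. The following is a minimal sketch, not the orchestrator's actual code: the `State` enum, the `FORWARD` table, and the `walk` helper are hypothetical names introduced for illustration, and only the state and gate names come from this document.

```python
from enum import Enum


class State(str, Enum):
    PENDING = "pending"
    SOURCE_PINNED = "source_pinned"
    MARKED = "marked"
    CUTPLAN_OK = "cutplan_ok"
    PRE_WRITE_BACKUP_TAKEN = "pre_write_backup_taken"
    GRANTS_PROBED = "grants_probed"
    AWAITING_CUT_AUTHORIZATION = "awaiting_cut_authorization"
    CUT_LEG_A_COMMITTED = "cut_leg_a_committed"
    STRUCTURAL_VERIFIED = "structural_verified"
    LEG_B_RECORDED = "leg_b_recorded"
    WRITE_VERIFIED = "write_verified"
    AWAITING_LIFECYCLE_AUTHORIZATION = "awaiting_lifecycle_authorization"
    LIFECYCLE_ENACTED = "lifecycle_enacted"
    CLOSEOUT_REPORTED = "closeout_reported"


S = State  # short alias to keep the table readable

# state -> (next state, gates that must pass on the way); one entry per state
# that has a forward transition. closeout_reported has none (terminal: success).
FORWARD = {
    S.PENDING: (S.SOURCE_PINNED, ("IG_source_pin",)),
    S.SOURCE_PINNED: (S.MARKED, ("IG_mark",)),
    S.MARKED: (S.CUTPLAN_OK, ("IG_cutplan",)),
    S.CUTPLAN_OK: (S.PRE_WRITE_BACKUP_TAKEN, ("IG_backup",)),
    S.PRE_WRITE_BACKUP_TAKEN: (S.GRANTS_PROBED, ("IG_grant_probe",)),
    S.GRANTS_PROBED: (S.AWAITING_CUT_AUTHORIZATION, ("SG_cut_authz_request",)),
    S.AWAITING_CUT_AUTHORIZATION: (
        S.CUT_LEG_A_COMMITTED,
        ("SG_cut_authz_received", "IG_cut_leg_a_execute"),
    ),
    S.CUT_LEG_A_COMMITTED: (S.STRUCTURAL_VERIFIED, ("IG_structural_verify",)),
    S.STRUCTURAL_VERIFIED: (S.LEG_B_RECORDED, ("IG_leg_b_record",)),
    S.LEG_B_RECORDED: (S.WRITE_VERIFIED, ("IG_write_verify",)),
    S.WRITE_VERIFIED: (
        S.AWAITING_LIFECYCLE_AUTHORIZATION,
        ("SG_lifecycle_authz_request",),
    ),
    S.AWAITING_LIFECYCLE_AUTHORIZATION: (
        S.LIFECYCLE_ENACTED,
        ("SG_lifecycle_authz_received", "IG_lifecycle_enact_execute"),
    ),
    S.LIFECYCLE_ENACTED: (S.CLOSEOUT_REPORTED, ("IG_closeout",)),
}

# States where the run pauses until a sovereign approval arrives.
SOVEREIGN_PAUSED = {S.AWAITING_CUT_AUTHORIZATION, S.AWAITING_LIFECYCLE_AUTHORIZATION}


def walk(start, approved):
    """Follow forward transitions from `start`, stopping at the first paused
    sovereign gate that is not in `approved`. Gates are not executed here."""
    path = [start]
    cur = start
    while cur in FORWARD:
        if cur in SOVEREIGN_PAUSED and cur not in approved:
            break
        cur = FORWARD[cur][0]
        path.append(cur)
    return path
```

With no approvals, `walk(State.PENDING, set())` stops at `awaiting_cut_authorization`; with both paused states approved it reaches `closeout_reported`. Gate execution and the failure transitions are deliberately left out of the sketch.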
failure_transitions:
<any-state> + invariant_violation → failed_<gate> : (terminal)
<any-state> + sovereign_stop → stopped_<reason> : (resumable iff sovereign re-approves)
<any-state> + drift_detected → stopped_drift : (resumable iff drift sovereign-acknowledged)
3. Sovereign gates — full enumeration
SG_1_cut_authz:
state_in : grants_probed
state_out : cut_leg_a_committed
request : orchestrator uploads "cut-authorization-request" KB doc
with: doc_id, source_version_id, writer_digest, manifest_digest,
region_sha, candidate_count, backup_sha, grants_probe_snapshot,
run_id, sidecar_path
response : sovereign uploads a sibling KB doc with explicit allowance,
returns its KB id; user resumes via
cutter orchestrate resume --run-id <id> --approval-kb-id <path>
refusal : orchestrator refuses to continue without a valid approval doc id
(cryptographic signing is optional in v0.6 — see DQ_2)
SG_2_lifecycle_authz:
state_in : write_verified
state_out : lifecycle_enacted
request : orchestrator uploads "lifecycle-enact-authorization-request" doc
+ creates a *new* row in cutter_governance.review_decision
(review_decision_id captured into run sidecar)
response : sovereign signs the request KB doc; orchestrator validates the
UUID is present, type='lifecycle_enactment', actor matches
refusal : refuses if review_decision_id is reused from a prior run
SG_3_failure_escalation:
state_in : any non-terminal
state_out : failed_<gate> | stopped_<reason>
trigger : invariant violation OR drift OR uncaught exception
produces : KB "stop-route-report" with full sidecar dump + restart command
policy : NEVER silent-retry. NEVER skip a gate. NEVER pull authority
from a prior run's approval.
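The SG_2 refusal rules can be read as a small fail-closed validator. This is a sketch under stated assumptions: `ReviewDecision`, `SovereignGateRefusal`, and `validate_lifecycle_approval` are hypothetical names, and the set of decision ids consumed by prior runs is assumed to be supplied by the caller (this document does not say where it is stored).

```python
from dataclasses import dataclass
from typing import Optional, Set


class SovereignGateRefusal(Exception):
    """Raised when an approval cannot be accepted (hypothetical name)."""


@dataclass(frozen=True)
class ReviewDecision:
    # Hypothetical projection of a cutter_governance.review_decision row.
    review_decision_id: str
    type: str
    actor: str


def validate_lifecycle_approval(
    decision: Optional[ReviewDecision],
    expected_actor: str,
    used_decision_ids: Set[str],
) -> None:
    """Accept the decision only if every SG_2 condition holds; raise otherwise."""
    if decision is None or not decision.review_decision_id:
        raise SovereignGateRefusal("review_decision_id missing")
    if decision.type != "lifecycle_enactment":
        raise SovereignGateRefusal(f"unexpected type {decision.type!r}")
    if decision.actor != expected_actor:
        raise SovereignGateRefusal("actor mismatch")
    if decision.review_decision_id in used_decision_ids:
        # Policy: never pull authority from a prior run's approval.
        raise SovereignGateRefusal("review_decision_id reused from a prior run")
```

The function returns nothing on success, so a caller cannot mistake a falsy return for approval; every refusal path raises.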
4. Internal gates — invariant checklist per gate
Each internal gate raises OrchestratorGateFail(<gate>, <reason>) on
any of these conditions (refusal is fail-closed):
IG_source_pin:
- source_document row exists
- latest source_version row exists
- manifest digest recomputed matches source_version.manifest_digest if pinned;
if not pinned, records the freshly computed digest into RunContext
IG_mark:
- region_sha rebuilt matches recorded region_sha
- mark rowset size > 0 and ≤ MAX_CANDIDATES_PER_DOC (deployment limit, e.g. 1000)
IG_cutplan:
- writer_digest stable across two independent rebuilds (replay determinism)
- exactly-N rows produced where N = mark rowset size
- publication_type / unit_kind / section_type vocab map exists for every row
- idempotency_key_set is distinct (cardinality == N)
IG_backup:
- backup target reachable
- GPG public-key fingerprint matches deployment-pinned fpr
- backup .gpg sha256 recorded
- sidecar JSON written with backup_sha + size_bytes + timestamp
IG_grant_probe:
- cutter_exec has EXECUTE on fn_iu_create + fn_iu_apply_edit_draft + fn_iu_enact
- cutter_verify has SELECT + INSERT on cutter_governance.verify_result
- directus has SELECT on cutter_governance.review_decision (for SECDEF probe)
- NO accidental PUBLIC EXECUTE on any cutter_governance writer
IG_cut_leg_a_execute:
- approval-kb-id resolves to a sovereign-signed doc
- approval doc explicitly authorizes (doc_id, writer_digest, change_set_id-pending)
- exactly-N successful fn_iu_create invocations
- transaction COMMITTED (no silent autocommit reset)
- 0 IUs with lifecycle_status != 'draft' after write (per v0.5 lesson)
IG_structural_verify:
- 11-bool probe matches OD-W2..OD-W9 cardinality assertions
- section_type_cardinality matches the cutplan output
- dieu_44_intrusion (or per-doc forbidden-id intrusion) == 0
IG_leg_b_record:
- change_set_id present and unique
- manifest_envelope_id + executor_signature_id present
- G-LEG-B-ONCE: count(rows for change_set_id) == expected, no duplicates
- lane_overlap_invariants assertion PASS
IG_write_verify:
- VerifyRecorder runs with cutter_verify principal
- verify_result row inserted; verifier dot_pair_signature row inserted
- G-VERIFY-ONCE: only one verify_result per change_set_id
IG_lifecycle_enact_execute:
- review_decision_id matches the SG_2 approval doc
- fn_iu_enact called exactly N times in a single txn (per Phase 7 doctrine)
- all N return status='enacted'; 0 partial
- iu_lifecycle_log row count == N
- trigger trg_iu_enacted_immut + trg_uv_enacted_immut both enabled 'O'
IG_closeout:
- all KB docs for this run uploaded
- sidecar finalized (status: success)
- run_id appended to a global runs index (one line per run, KB-tracked)
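One way to keep every internal gate fail-closed is to evaluate its checklist into a name-to-boolean map and raise on the first entry that is not exactly `True`. The sketch below applies that pattern to IG_cutplan. `check_gate` and `ig_cutplan` are hypothetical helpers, the exception constructor follows the `OrchestratorGateFail(<gate>, <reason>)` shape given above, and the vocab-map invariant is omitted for brevity.

```python
class OrchestratorGateFail(Exception):
    def __init__(self, gate, reason):
        super().__init__(f"{gate}: {reason}")
        self.gate = gate
        self.reason = reason


def check_gate(gate, invariants):
    """Raise on the first invariant that is not exactly True (fail-closed).

    Returns a copy of the map so it can be stored as gate_invariants
    in the phase record."""
    for name, ok in invariants.items():
        if ok is not True:
            raise OrchestratorGateFail(gate, name)
    return dict(invariants)


def ig_cutplan(digest_a, digest_b, rows, mark_count):
    """Check three IG_cutplan invariants.

    Assumes each row is a dict carrying an 'idempotency_key' field
    (a hypothetical row shape)."""
    keys = [row["idempotency_key"] for row in rows]
    return check_gate(
        "IG_cutplan",
        {
            "writer_digest_stable": digest_a == digest_b,
            "row_count_matches_mark": len(rows) == mark_count,
            "idempotency_keys_distinct": len(set(keys)) == mark_count,
        },
    )
```

Comparing with `is not True` rather than truthiness means a `None` from a probe that could not run counts as a failure, not a pass.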
5. Persistence — JSON sidecar shape
Chosen per DQ_10 = KB + filesystem JSON sidecar (no new DB table; revisit in v0.7 only if backlog grows).
sidecar_root : ${CUTTER_RUNS_DIR:-/var/lib/cutter/runs}
sidecar_path : ${sidecar_root}/<run_id>/state.json
sidecar_lock : ${sidecar_root}/<run_id>/state.lock # POSIX fcntl
backup_blob_path : ${sidecar_root}/<run_id>/pre_write.gpg # encrypted
phase_report_paths : ${sidecar_root}/<run_id>/phase-<NN>.md # mirrored to KB
run_id_format : "ictr-<YYYYMMDDTHHMMSSZ>-<uuid7>" # human-sortable
Sidecar JSON schema (top-level):
run_id : string (== filename parent)
created_utc : ISO8601
created_by : string (actor)
document_id : string (e.g. "ICX-LAW-2026-001")
source_document_id : uuid
source_version_id : uuid
mode : "dryrun" | "live"
state : <one of §1>
phases : map<phase_name, PhaseRecord>
sovereign_approvals : list<ApprovalRecord>
idempotency_keys : map<phase_name, opaque_string>
context_pins : map<key, value> # manifest_digest, region_sha, writer_digest, etc.
last_error : optional<ErrorRecord>
schema_version : 1
PhaseRecord:
phase : enum
started_utc : ISO8601
finished_utc : optional<ISO8601>
result : "passed" | "failed" | "running"
gate_invariants : map<key, boolean>
kb_doc_id : optional<string> # KB path uploaded for this phase
artifacts : list<{path, sha256}>
ApprovalRecord:
gate : "SG_cut_authz" | "SG_lifecycle_authz"
requested_utc : ISO8601
approval_kb_id : string # the KB doc id sovereign uploaded
review_decision_id : optional<uuid> # for SG_2
validated_utc : ISO8601
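A minimal sketch of creating and persisting a sidecar with the top-level shape above. The write-to-temp-then-`os.replace` step is an assumption made here so that a crash cannot leave a half-written `state.json`; this document does not prescribe a write strategy. `new_sidecar` and `write_sidecar` are hypothetical helper names.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def new_sidecar(run_id, created_by, document_id, source_document_id,
                source_version_id, mode="dryrun"):
    """Build the initial top-level sidecar document for a fresh run."""
    if mode not in ("dryrun", "live"):
        raise ValueError(f"unknown mode {mode!r}")
    return {
        "run_id": run_id,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "created_by": created_by,
        "document_id": document_id,
        "source_document_id": source_document_id,
        "source_version_id": source_version_id,
        "mode": mode,
        "state": "pending",
        "phases": {},
        "sovereign_approvals": [],
        "idempotency_keys": {},
        "context_pins": {},
        "last_error": None,
        "schema_version": 1,
    }


def write_sidecar(path: Path, doc: dict) -> None:
    """Write the sidecar via a temp file in the same directory, then rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(doc, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A caller would pass `<sidecar_root>/<run_id>/state.json` as `path` and hold the run's lock while writing.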
6. Phase soft-caps + run hard-cap
Per the v0.5 lesson "macro tasks default 45–60 minutes", the orchestrator enforces:
phase_soft_cap_minutes:
source_pin : 2
mark : 5
cutplan : 10
backup : 5
grant_probe : 1
cut_leg_a : 10
structural_verify : 2
leg_b_record : 5
write_verify : 5
lifecycle_enact : 15
closeout : 5
run_hard_cap_minutes : 60
over_cap_action : STOP_AND_ESCALATE (sovereign sees the partial run sidecar)
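The caps can be checked with a simple lookup. In this sketch the function name `over_cap` and its return labels are hypothetical; only the numbers come from the table above. The caller is assumed to apply STOP_AND_ESCALATE whenever the result is not `None`.

```python
PHASE_SOFT_CAP_MINUTES = {
    "source_pin": 2,
    "mark": 5,
    "cutplan": 10,
    "backup": 5,
    "grant_probe": 1,
    "cut_leg_a": 10,
    "structural_verify": 2,
    "leg_b_record": 5,
    "write_verify": 5,
    "lifecycle_enact": 15,
    "closeout": 5,
}
RUN_HARD_CAP_MINUTES = 60


def over_cap(phase, phase_elapsed_min, run_elapsed_min):
    """Return which cap was exceeded, or None if both are respected."""
    if run_elapsed_min > RUN_HARD_CAP_MINUTES:
        return "run_hard_cap"
    if phase_elapsed_min > PHASE_SOFT_CAP_MINUTES[phase]:
        return "phase_soft_cap"
    return None


# The soft caps sum to 65 minutes, 5 more than the hard cap, so a run that
# uses every phase's full allowance still trips the run-level cap.
SOFT_CAP_TOTAL = sum(PHASE_SOFT_CAP_MINUTES.values())
```

An unknown phase name raises `KeyError` instead of being treated as uncapped, which keeps the check fail-closed.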
7. Resume — algorithm
1. Read sidecar JSON (or fail STOP_RUN_NOT_FOUND).
2. Acquire fcntl lock on state.lock (or fail STOP_RUN_BUSY).
3. Inspect `state`:
- if terminal success → exit 0 (idempotent).
- if terminal failure → exit non-zero (operator must void or amend).
- if sovereign-pending and --approval-kb-id provided → validate, advance.
- if mid-phase (a PhaseRecord has result == "running" on disk) → drift-revalidate the
last completed gate's invariants against the live DB. If they still hold,
re-enter the next phase. If not, STOP_DRIFT.
4. Continue forward execution from the next state.
5. Release lock on exit.
Resume is idempotent: re-running with the same --run-id and
unchanged live state is a no-op.
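The decision part of the resume algorithm can be sketched as below. Several things here are assumptions: the lock is taken with `fcntl.lockf` in non-blocking mode (POSIX only), the sidecar is read after the lock is held instead of before it, every `failed_*`, `stopped_*`, and `voided_*` state is treated as "exit non-zero", and all return labels other than STOP_RUN_NOT_FOUND and STOP_RUN_BUSY are hypothetical. Approval validation and drift revalidation themselves are left to the caller.

```python
import fcntl
import json
from pathlib import Path

TERMINAL_PREFIXES = ("failed_", "stopped_", "voided_")
SOVEREIGN_PENDING = {"awaiting_cut_authorization", "awaiting_lifecycle_authorization"}


def resume_decision(run_dir: Path, approval_kb_id=None) -> str:
    """Decide what a resume should do next for the run stored in run_dir."""
    state_path = run_dir / "state.json"
    if not state_path.exists():
        return "STOP_RUN_NOT_FOUND"
    with open(run_dir / "state.lock", "w") as lock_fh:
        try:
            fcntl.lockf(lock_fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return "STOP_RUN_BUSY"  # another process holds the lock
        doc = json.loads(state_path.read_text(encoding="utf-8"))
        state = doc["state"]
        if state == "closeout_reported":
            return "EXIT_OK"  # terminal success: resuming is a no-op
        if state.startswith(TERMINAL_PREFIXES):
            return "EXIT_NONZERO"  # operator must void or amend
        if state in SOVEREIGN_PENDING:
            return "VALIDATE_APPROVAL" if approval_kb_id else "REFUSE_NO_APPROVAL"
        phases = doc.get("phases", {})
        if any(p.get("result") == "running" for p in phases.values()):
            return "DRIFT_REVALIDATE"  # interrupted mid-phase
        return "CONTINUE"
    # The lock is released when lock_fh is closed on leaving the with-block.
```

A real orchestrator would keep the lock for the whole continued execution, not just for the decision; the sketch releases it early only to stay short.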
8. Drift detection (cross-cutting)
Before every internal gate, the orchestrator re-surveys:
- live row count for canonical_address LIKE <prefix>%
- live md5(prosrc) of fn_iu_create / fn_iu_enact / fn_iu_apply_edit_draft
- live cardinality of cutter_governance.review_decision rows for the run's review_decision_id
and compares each value against the corresponding context_pin in the sidecar. Any mismatch →
STOP_DRIFT_<dim> with a diff payload.
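The pin-versus-live comparison reduces to a dictionary diff. The sketch below assumes the live survey has already been collected into a dict keyed like `context_pins`; the dimension names in the usage note and the helper names `detect_drift` and `stop_codes` are hypothetical.

```python
def detect_drift(pins, live):
    """Return {dimension: {"pinned": ..., "live": ...}} for every mismatch.

    A dimension missing from the live survey counts as drift (fail-closed)."""
    diff = {}
    for dim, pinned in pins.items():
        observed = live.get(dim)
        if observed != pinned:
            diff[dim] = {"pinned": pinned, "live": observed}
    return diff


def stop_codes(diff):
    """Map a diff payload to STOP_DRIFT_<dim> codes, one per drifted dimension."""
    return [f"STOP_DRIFT_{dim}" for dim in sorted(diff)]
```

For example, with pins `{"row_count": 120, "fn_iu_create_md5": "abc"}` and a live survey that reports a different md5 for `fn_iu_create`, `detect_drift` returns a single-entry diff and `stop_codes` yields `["STOP_DRIFT_fn_iu_create_md5"]`.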
This is the same "drift policy" the lifecycle DDL fingerprints.yaml header declares:
"If live md5 differs from these pins, STOP and route to sovereign — do not silently patch the repo to match."
9. Verdict
g2_outcome : PASS
states_total : 14 (the §1 states list) + 3 terminal categories (failed_*, stopped_*, voided_*)
sovereign_gates : 3 (SG_1, SG_2, SG_3)
internal_gates : 11 (IG_*)
sidecar_persistence : JSON + lock file (no new DB table for v0.6)
resume_safety : invariant re-validation before every continuation
drift_policy : STOP_DRIFT before any write
phase_soft_cap_total : 65 min → 60 min hard-cap enforced