dot-iu-cutter v0.4 PG-backed Dry-run — Verification Plan (2026-05-17) [r3]
Date: 2026-05-17 · Planning only — defines how a future authorized run would be judged. Nothing executed.
Revision 3 (2026-05-17). GPT reviewed the agent's pre-execution HALT and ruled: agent halt = PASS / CORRECT; execution remains BLOCKED_PRE_EXECUTION; root cause = verification-plan r2 spec defect (adapter defect = false; code change required = false; commit 56d3732 remains the accepted, unchanged HEAD); r2 execution forbidden; r3 required before any execution.

r3 corrects the §2 matrix to the accepted-code reality at 56d3732: the canonical happy path is MARK → SWEEP → REVIEW → CUT → VERIFY (sweep() is the only producer of the marked → review_pending transition that review() hard-requires, per cli.py:13 / _run_pipeline, state_machine.ALLOWED_TRANSITIONS[S_MARKED], and the accepted 92/92 suite test_one_transaction_per_phase asserting 5 committed txns: "mark + sweep + review = 3"). r3 adds SWEEP as an explicit happy-path step, adds an "After SWEEP" matrix column, and sets the exact final baseline to 15 rows (decision_backlog_history = 5, decision_backlog_sweep_log = 1). Scope = verification/design clarification only. Gate catalogue G-14…G-25, false-negative defences, verdict rule, and all negative/idempotency Δ=0 assertions are unchanged. Sibling command-review and report do not mention "total=13" and do not state a happy-path flow that omits SWEEP (they defer scenario detail to design-master §5), so per GPT's instruction they are left unchanged.

Boundaries (unchanged): no execution, no provisioning, no code change, no commit, no production DB connection, no secret read, no .env edit, no dry-run, no deploy, no CUT/VERIFY, no self-advance.
1. Gate catalogue (G-14 … G-25, continuing command-review)
| Gate | Check | Method (false-negative-safe) | Pass criterion |
|---|---|---|---|
| G-14 | 12 base tables + 12 observe views present in cutter_governance | pg_class/pg_namespace catalog count, not parsed \dt text | base==12, views==12 |
| G-15 | Per-scenario row-count matrix (see §2) | SELECT count(*) per table, captured as integers (not grep -c) | matches the exact §2 matrix cell-for-cell |
| G-16 | Happy-path lineage intact | FK-join MARK→…→VERIFY rows by entry_id; structural | one coherent chain, statuses terminal verified_complete |
| G-17 | Privilege matrix unchanged after run | aclexplode(relacl) + information_schema.role_table_grants set-equality vs frozen 33 table-priv + 3 col-UPDATE; grant_option=0 | exact set match; no extra/missing |
| G-18 | cutter_exec cannot write verify_result; cutter_verify cannot write review_decision/manifest_* | controlled raw attempt → server 42501 | 42501 observed; no row written |
| G-19 | No forbidden SQL emitted | parse harness SQL trace; classify every statement verb | only SET/SELECT/INSERT/UPDATE; zero DELETE/TRUNCATE/DROP/ALTER/GRANT/CREATE/COPY |
| G-20 | Atomicity (S11) | row counts before == after the induced-failure phase | delta == 0 for that phase |
| G-21 | No production touch | PROD system_identifier re-read == C-03 value; prod container StartedAt unchanged; zero statements ever sent to prod (harness env points only at DR DNS) | identical; no prod session |
| G-22 | DR ≠ PROD identity | DR system_identifier ≠ 7611578671664259111 | distinct |
| G-23 | No secret leakage | scan $WD artefacts/logs/KB draft for the dry-run password byte-strings; assert only <redacted-secret> / key-names; dr.env perms 0600, not git-tracked | zero secret hits |
| G-24 | Structured-log redaction | each JSON log line carries only {sqlstate,error_class,phase,table,entry_id,key_name}; password regex absent | redaction holds |
| G-25 | Protected prior dry-run envs untouched | docker inspect Id + State.StartedAt of the 3 prior containers == C-01 snapshot | byte-identical |
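The set-equality method named for G-17 can be sketched as below. This is a minimal illustration, assuming the grants have already been read from the catalog into (grantee, table, privilege) tuples; the tuple shape, the helper name grant_diff, and the sample rows are hypothetical and are not the frozen 33 + 3 privilege set.

```python
def grant_diff(frozen, observed):
    """Return (missing, extra) between two collections of grant tuples.

    Comparison is by set membership, so row order and duplicates
    in the catalog output cannot cause a false mismatch.
    """
    frozen_set, observed_set = set(frozen), set(observed)
    missing = sorted(frozen_set - observed_set)
    extra = sorted(observed_set - frozen_set)
    return missing, extra


# Hypothetical sample rows, for illustration only.
frozen = [
    ("cutter_exec", "cut_change_set", "INSERT"),
    ("cutter_verify", "verify_result", "INSERT"),
]
observed = [
    ("cutter_verify", "verify_result", "INSERT"),
    ("cutter_exec", "cut_change_set", "INSERT"),
    ("cutter_exec", "verify_result", "INSERT"),  # an unexpected extra grant
]

missing, extra = grant_diff(frozen, observed)
print(missing)  # []
print(extra)    # [('cutter_exec', 'verify_result', 'INSERT')]
```

Under this sketch G-17 would pass only when both lists are empty.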
2. Expected row-count matrix — EXACT (execution-gating)
2.1 Canonical happy-path flow & fixture determinism
Canonical flow (accepted code @56d3732): MARK → SWEEP → REVIEW → CUT → VERIFY. SWEEP is a mandatory, named step, not an optional cycle: review() hard-requires decision_backlog_entry.status == review_pending; mark() only yields marked; the only marked → review_pending edge is the sweep-promote transition (state_machine.ALLOWED_TRANSITIONS[S_MARKED]), performed only by CutterRuntime.sweep() (cli.py:13 docstring "MARK->sweep->REVIEW(approve)->CUT->VERIFY"; cli.py _run_pipeline; tests/test_phase_contracts.py _approved_entry = mark();sweep();review() and test_one_transaction_per_phase asserting 5 committed txns).
The dry-run drives a single pre-seeded canonical fixture: proposed_cut_spec = {"units":[{"k":1}]} → StubCanonicalization.resolve yields exactly one UnitBlockPlan (canonicalization.py). Consequences that make every cell exact:
- decision_backlog_dependency = 0 throughout: standalone fixture, no dependency edges authored.
- decision_backlog_sweep_log = 1 final: sweep() calls led.append_sweep_log(...) exactly once per sweep call; the canonical flow performs exactly one SWEEP. Value is 0 after MARK (before sweep) and 1 from After-SWEEP onward. Verification asserts the exact value per column; any other value is a FAIL.
- canonical_address_alias = 0 throughout: alias writes fully deferred in v0.4 (OD-2; canonicalization.py has no alias writer; no alias write path in MARK/SWEEP/REVIEW/CUT/VERIFY). Verification asserts count(*) == 0; non-zero = FAIL.
- decision_backlog_history = 5 final: exactly one append-only history row per ledger transition. MARK (append_history BIRTH→marked) = 1; SWEEP (transition_status marked→review_pending) = 1; REVIEW (review_pending→reviewed_approved) = 1; CUT (reviewed_approved→cut_applied) = 1; VERIFY (cut_applied→verified_complete) = 1 ⇒ 5.
- dot_pair_signature = 2 final: CUT writes the exec-DOT signature (DOT-991, verifier_signature_id NULL) → 1; VERIFY writes the verify-DOT signature (DOT-992) → 2.
- manifest_unit_block = 1: single-unit canonical fixture (consistent with cut_change_set = 1 / cut_change_set_affected_row = 1).
- VERIFY runs the pass branch: no compensating cut_change_set, no escalation decision_backlog_entry (those occur only on force_fail).
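The decision_backlog_history = 5 derivation can be replayed as a short walk over the status chain. This is only a sketch: the status names come from the list above, while the HAPPY_PATH structure and the function name are hypothetical and do not reproduce state_machine.ALLOWED_TRANSITIONS.

```python
# (step, source status, target status) along the canonical happy path.
# "BIRTH" is the pre-MARK state named in the plan.
HAPPY_PATH = [
    ("MARK", "BIRTH", "marked"),
    ("SWEEP", "marked", "review_pending"),
    ("REVIEW", "review_pending", "reviewed_approved"),
    ("CUT", "reviewed_approved", "cut_applied"),
    ("VERIFY", "cut_applied", "verified_complete"),
]


def history_rows_after_each_step(path):
    """Return {step: cumulative history rows}, one row per transition.

    Raises ValueError if a step does not start from the status
    the previous step ended in (for example, if SWEEP is omitted).
    """
    counts = {}
    status = path[0][1]
    for n, (step, src, dst) in enumerate(path, start=1):
        if src != status:
            raise ValueError(f"{step}: expected source {status!r}, got {src!r}")
        status = dst
        counts[step] = n
    return counts


counts = history_rows_after_each_step(HAPPY_PATH)
print(counts)  # {'MARK': 1, 'SWEEP': 2, 'REVIEW': 3, 'CUT': 4, 'VERIFY': 5}
```

Dropping the SWEEP entry makes REVIEW start from marked instead of review_pending and the walk raises, which mirrors why a flow without SWEEP cannot reach REVIEW.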
2.2 The exact 12-table matrix
All 12 cutter_governance base tables named explicitly; every cell is an exact non-negative integer (no ≥, no …, no placeholder). Counts are cumulative committed row totals in the isolated DR env after each happy-path step commits. The After SWEEP column is the mandatory promotion step between MARK and REVIEW.
| # | Table | After S4 MARK | After SWEEP | After S5 REVIEW | After S6 CUT | After S7 VERIFY (pass) |
|---|---|---|---|---|---|---|
| 1 | decision_backlog_entry | 1 | 1 | 1 | 1 | 1 |
| 2 | decision_backlog_history | 1 | 2 | 3 | 4 | 5 |
| 3 | decision_backlog_dependency | 0 | 0 | 0 | 0 | 0 |
| 4 | decision_backlog_sweep_log | 0 | 1 | 1 | 1 | 1 |
| 5 | manifest_envelope | 0 | 0 | 1 | 1 | 1 |
| 6 | manifest_unit_block | 0 | 0 | 1 | 1 | 1 |
| 7 | review_decision | 0 | 0 | 1 | 1 | 1 |
| 8 | dot_pair_signature | 0 | 0 | 0 | 1 | 2 |
| 9 | cut_change_set | 0 | 0 | 0 | 1 | 1 |
| 10 | cut_change_set_affected_row | 0 | 0 | 0 | 1 | 1 |
| 11 | verify_result | 0 | 0 | 0 | 0 | 1 |
| 12 | canonical_address_alias | 0 | 0 | 0 | 0 | 0 |
| running total | 2 | 4 | 8 | 12 | 15 |
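The matrix and its running totals can be encoded and cross-checked as below. A minimal sketch: the cell values are copied from the table above, while STEPS, EXPECTED, running_totals and mismatches are hypothetical names, and the observed counts are assumed to arrive as a plain dict of table name to int.

```python
STEPS = ["MARK", "SWEEP", "REVIEW", "CUT", "VERIFY"]

# Cumulative committed rows per table, one value per step (section 2.2).
EXPECTED = {
    "decision_backlog_entry":      [1, 1, 1, 1, 1],
    "decision_backlog_history":    [1, 2, 3, 4, 5],
    "decision_backlog_dependency": [0, 0, 0, 0, 0],
    "decision_backlog_sweep_log":  [0, 1, 1, 1, 1],
    "manifest_envelope":           [0, 0, 1, 1, 1],
    "manifest_unit_block":         [0, 0, 1, 1, 1],
    "review_decision":             [0, 0, 1, 1, 1],
    "dot_pair_signature":          [0, 0, 0, 1, 2],
    "cut_change_set":              [0, 0, 0, 1, 1],
    "cut_change_set_affected_row": [0, 0, 0, 1, 1],
    "verify_result":               [0, 0, 0, 0, 1],
    "canonical_address_alias":     [0, 0, 0, 0, 0],
}


def running_totals(matrix):
    """Sum each step column across all tables."""
    return [sum(column) for column in zip(*matrix.values())]


def mismatches(observed, step):
    """Return {table: (got, want)} for every cell that is not an exact int match."""
    i = STEPS.index(step)
    bad = {}
    for table, row in EXPECTED.items():
        got = observed.get(table)
        # A missing table or a non-int value (e.g. the string "1") is a mismatch.
        if type(got) is not int or got != row[i]:
            bad[table] = (got, row[i])
    return bad


print(running_totals(EXPECTED))  # [2, 4, 8, 12, 15]
```

The type check reflects the requirement that counts are captured as integers; the sketch inspects only the 12 named tables and ignores any other key in the observed dict.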
2.3 Final happy-path baseline (exact — the execution gate)
After S7 VERIFY(pass) the isolated DR env must hold exactly these totals (cell-for-cell PASS gate for G-15):
| Table | Exact final count |
|---|---|
| decision_backlog_entry | 1 |
| decision_backlog_history | 5 |
| decision_backlog_dependency | 0 |
| decision_backlog_sweep_log | 1 |
| manifest_envelope | 1 |
| manifest_unit_block | 1 |
| review_decision | 1 |
| dot_pair_signature | 2 |
| cut_change_set | 1 |
| cut_change_set_affected_row | 1 |
| verify_result | 1 |
| canonical_address_alias | 0 |
Total committed rows across all 12 tables = 15. Any deviation in any single cell ⇒ G-15 FAIL ⇒ STOP + teardown + honest failure report.
2.4 Negative & idempotency scenarios — exact delta 0 (unchanged)
Each non-happy-path scenario must produce delta = 0 on every one of the 12 tables versus the committed state immediately preceding that scenario (fail-closed / in-txn ROLLBACK; no partial writes):
| Scenario | Phase / nature | Assertion |
|---|---|---|
| S2 | Missing-key fail-closed (no socket) | Δ = 0 on all 12 tables |
| S3 (negative) | current_user mismatch → ConfigMismatch, no write | Δ = 0 on all 12 tables |
| S8 | Principal-capability refusal before SQL | Δ = 0 on all 12 tables |
| S9 | 42501 server-deny → PhaseStop + rollback | Δ = 0 on all 12 tables |
| S10 | Idempotency replay of MARK (find()/key dedup) | Δ = 0 on all 12 tables (no increment; decision_backlog_entry stays 1) |
| S11 | Induced mid-phase failure → whole-phase ROLLBACK | Δ = 0 on all 12 tables (atomicity; cross-checked by G-20) |
S1 (config load, no connect) and S12 (append-only/no-DDL surface assertion) perform no row writes ⇒ Δ = 0 on all 12 tables by construction. Any non-zero delta on any negative/idempotency scenario ⇒ FAIL.
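The Δ = 0 assertion amounts to comparing two count snapshots taken immediately before and after a scenario. A minimal sketch, assuming each snapshot is a dict of table name to integer count; the function name and the three-table sample (abbreviated from the full 12) are hypothetical.

```python
def nonzero_deltas(before, after):
    """Return {table: after - before} for every table whose count changed.

    An empty dict means delta = 0 everywhere. Snapshots that do not
    cover the same tables are rejected rather than silently compared.
    """
    if before.keys() != after.keys():
        raise ValueError("snapshots must cover the same tables")
    return {t: after[t] - before[t] for t in before if after[t] != before[t]}


# Abbreviated hypothetical snapshot: committed state after MARK.
before = {
    "decision_backlog_entry": 1,
    "decision_backlog_history": 1,
    "decision_backlog_sweep_log": 0,
}

# S10-style replay that deduplicates correctly: nothing changes.
print(nonzero_deltas(before, dict(before)))  # {}

# A replay that wrongly inserted a second entry row.
after_bad = {**before, "decision_backlog_entry": 2}
print(nonzero_deltas(before, after_bad))  # {'decision_backlog_entry': 1}
```

Any non-empty result for a negative or idempotency scenario would correspond to a FAIL under the rule above.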
3. False-negative defences (recurring lesson — unchanged)
Explicitly engineered against prior cycles' harness false-negatives:
- Structural, not string: privilege/constraint comparisons use aclexplode set-equality and catalog queries, never pg_get_constraintdef() string compare (it emits schema-qualified REFERENCES; caused a needless prod rollback before) and never parsed psql text.
- Safe counters: integers via psql -tAc 'select count(*)' into a variable; never grep -c P f || echo 0 (double-emit "0\n0" breaks [ -eq ]). Each §2 cell is compared as an integer equality, not a substring match.
- NOTICE vs error: restore/role NOTICEs (role … does not exist, context_pack_readonly) are benign (Note-N1) and explicitly whitelisted, not failures.
- Multi-statement parsing: SQL trace classified per-statement after split, not via a single regex over the whole file.
- sha-gating: every SQL/python artefact sha256-verified pre-run; verdicts quote the sha.
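Per-statement classification of the SQL trace (the G-19 method) might look like the sketch below. It assumes the trace is plain text with statements separated by semicolons and with no semicolons inside string literals or comments; that assumption, the function names, and the sample trace are hypothetical, and a real harness would need a proper SQL tokenizer.

```python
ALLOWED_VERBS = {"SET", "SELECT", "INSERT", "UPDATE"}


def statement_verbs(trace):
    """Split the trace into statements and return the leading verb of each."""
    verbs = []
    for statement in trace.split(";"):
        statement = statement.strip()
        if statement:  # skip the empty tail after the last semicolon
            verbs.append(statement.split(None, 1)[0].upper())
    return verbs


def forbidden_verbs(trace):
    """Return every verb that is not in the allowed set, in trace order."""
    return [verb for verb in statement_verbs(trace) if verb not in ALLOWED_VERBS]


# Hypothetical trace, for illustration only.
trace = """
SET search_path = cutter_governance;
SELECT count(*) FROM decision_backlog_entry;
insert into decision_backlog_history (entry_id) values (1);
DROP TABLE verify_result;
"""

print(statement_verbs(trace))  # ['SET', 'SELECT', 'INSERT', 'DROP']
print(forbidden_verbs(trace))  # ['DROP']
```

The sketch reports anything outside the four allowed verbs, so it would also flag transaction-control statements if the trace records them; the plan does not say whether it does.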
4. Verdict rule
Run = PASS iff every scenario S1–S12 (including the mandatory SWEEP promotion step in the canonical MARK→SWEEP→REVIEW→CUT→VERIFY flow) meets its expected outcome AND the §2.3 exact final baseline holds cell-for-cell (total = 15) AND every §2.4 negative/idempotency delta is exactly 0 AND G-14…G-25 all PASS. Any structural failure ⇒ STOP + teardown + honest failure report (no "self-check bug flips a good run" — a suspected harness FN must be proven structurally before any rollback decision).
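The verdict rule is a conjunction of four conditions and can be sketched as a single function. This is an illustration only: RunEvidence, its field names, and the choice to treat missing scenario or gate evidence as FAIL are assumptions, not part of the accepted code.

```python
from dataclasses import dataclass

REQUIRED_SCENARIOS = {f"S{i}" for i in range(1, 13)}   # S1..S12
REQUIRED_GATES = {f"G-{i}" for i in range(14, 26)}     # G-14..G-25


@dataclass
class RunEvidence:
    scenario_ok: dict          # scenario id -> met its expected outcome
    baseline_mismatches: dict  # section 2.3 cells that differ (empty = exact)
    negative_deltas: dict      # section 2.4 scenario id -> non-zero deltas found
    gate_ok: dict              # gate id -> passed


def verdict(ev):
    """Return "PASS" only if every condition of the verdict rule holds."""
    checks = [
        # Every scenario S1..S12 is present and met its expected outcome.
        set(ev.scenario_ok) == REQUIRED_SCENARIOS and all(ev.scenario_ok.values()),
        # The exact final baseline holds cell-for-cell.
        not ev.baseline_mismatches,
        # Every negative/idempotency scenario has an empty delta set.
        not any(ev.negative_deltas.values()),
        # Every gate G-14..G-25 is present and passed.
        set(ev.gate_ok) == REQUIRED_GATES and all(ev.gate_ok.values()),
    ]
    return "PASS" if all(checks) else "FAIL"


evidence = RunEvidence(
    scenario_ok={s: True for s in REQUIRED_SCENARIOS},
    baseline_mismatches={},
    negative_deltas={"S10": {}, "S11": {}},
    gate_ok={g: True for g in REQUIRED_GATES},
)
print(verdict(evidence))  # PASS
```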