08 — Defect found + fix (R3-SELF-1)
Did the hostile self-review find a defect? YES — one (MEDIUM).
R3-SELF-1 — "the dry-run plan can drop a sandbox it did not create"
Attack class: cleanup-target safety (the exact class Codex flagged in R2-1).
Pre-fix behavior. The plan derived a minute-resolution candidate id
CAND="c1_staging_$(date -u +%Y%m%d_%H%M)" and set SBX="$CAND" (the EXIT-trap drop target)
before invoking P1. If two gated dry-runs were launched in the same UTC minute, both compute the
SAME CAND. Run #2's P1 fails with REUSE_BLOCK (exit 4, since --force is disabled), but run #2's
cleanup() then calls dot-staging-sandbox-drop --sandbox-id CAND. Run #1's sandbox has an active
sbx_meta.sandbox_registry row, so the governed drop guard permits the drop — destroying run #1's
live sandbox. The drop guard (active-registry requirement) does not protect here because the victim
is legitimately active.
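The collision itself is easy to reproduce in isolation. The sketch below is illustrative only: cand_for is a hypothetical helper (not part of the plan), the epoch values are arbitrary, and it assumes GNU date (for -d @EPOCH). It applies the same format string quoted above to three launch times.

```shell
#!/usr/bin/env bash
# Illustration only: cand_for is a hypothetical helper, not part of the plan.
# Assumes GNU date, which accepts "-d @EPOCH".

# Derive the minute-resolution candidate id for a given epoch timestamp,
# using the same format string as the pre-fix plan.
cand_for() {
  printf 'c1_staging_%s\n' "$(date -u -d "@$1" +%Y%m%d_%H%M)"
}

RUN1_CAND="$(cand_for 1700000000)"   # 2023-11-14 22:13:20 UTC
RUN2_CAND="$(cand_for 1700000030)"   # 30 seconds later, same UTC minute
RUN3_CAND="$(cand_for 1700000060)"   # 60 seconds later, next UTC minute

echo "run1=$RUN1_CAND"
echo "run2=$RUN2_CAND"
echo "run3=$RUN3_CAND"
```

Runs 1 and 2 get the same id, so whichever run arms its trap with that id can target the other run's sandbox. Run 3 differs only because the minute rolled over.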
Severity. MEDIUM. The dry-run is single-operator, env-gated (CODEX_R3_PASS), and not yet run, so
the race is narrow — but the consequence (silently destroying a concurrent run's sandbox) violates the
saga's core invariant "never destroy what you did not create," and it is squarely in the class Codex
rejects. Under the owner directive ("fix anything Codex would likely reject"), it was fixed, not
documented-and-accepted.
Fix (plan only).
- (a) Refuse a pre-existing candidate up-front (exit 74) so the plan never adopts a concurrent or leftover sandbox: a read-only select count(*) … datname='$CAND' check must return exactly 0. An empty result ('' != '0', i.e. unreadable) also aborts (fail-closed).
- (b) Arm SBX (the drop target) only after P1 returns success. set -e guarantees the arming line runs only if the P1_OUT="$(… create …)" command substitution returned 0, and P1 returns 0 only after P1_DONE=1 (it created the DB and emitted created=true). A P1 REUSE_BLOCK or validation failure aborts the plan before arming, so cleanup() sees SBX="" and drops nothing. P1's own EXIT trap remains the primary cleanup for any partial create.
This preserves the genuine R2-1 benefit (the plan still cleans up when P1 created the DB but the plan
later fails to parse the JSON — SBX is armed right after P1 success, before the parse).
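The two parts of the fix can be sketched as a small, self-contained simulation. This is not the real plan script. drop_sandbox and p1_create are hypothetical stubs standing in for the governed drop primitive and P1, and the count is passed in as an argument instead of being queried. Only the control flow (fail-closed pre-check, then arm-after-success under set -e) is meant to mirror the description above.

```shell
#!/usr/bin/env bash
# Sketch only: stubs replace P1, the drop primitive, and the count query.

WORK="$(mktemp -d)"
DROP_LOG="$WORK/drops.log"; : > "$DROP_LOG"

cat > "$WORK/plan.sh" <<'EOF'
set -e
CAND="$1"; COUNT="$2"; P1_MODE="$3"; DROP_LOG="$4"

# Hypothetical stub for the drop primitive: only records its target.
drop_sandbox() { echo "$1" >> "$DROP_LOG"; }

# Hypothetical stub for P1: exit 0 after "creating", or exit 4 (REUSE_BLOCK).
p1_create() {
  if [ "$P1_MODE" = "ok" ]; then echo '{"created":true}'; else return 4; fi
}

SBX=""                                   # drop target starts disarmed
cleanup() { if [ -n "$SBX" ]; then drop_sandbox "$SBX"; fi; }
trap cleanup EXIT

# (a) Fail-closed pre-check: anything other than exactly "0" aborts,
#     including an empty (unreadable) count.
if [ "$COUNT" != "0" ]; then exit 74; fi

# (b) Under set -e, a non-zero P1 aborts on this line, before arming.
P1_OUT="$(p1_create)"
SBX="$CAND"                              # reached only if P1 returned 0
false                                    # simulate a later failure, e.g. a JSON parse error
EOF

# Unreadable count: abort with 74, nothing armed.
RC_UNREADABLE=0; bash "$WORK/plan.sh" c1_staging_A "" ok "$DROP_LOG" || RC_UNREADABLE=$?
# P1 blocked: abort with P1's exit code, nothing armed.
RC_BLOCKED=0; bash "$WORK/plan.sh" c1_staging_A 0 fail "$DROP_LOG" || RC_BLOCKED=$?
DROPS_BEFORE_LATE="$(cat "$DROP_LOG")"
# P1 succeeded, later step failed: the trap drops the sandbox this run created.
RC_LATE=0; bash "$WORK/plan.sh" c1_staging_B 0 ok "$DROP_LOG" || RC_LATE=$?
DROPS_AFTER_LATE="$(cat "$DROP_LOG")"

echo "unreadable rc=$RC_UNREADABLE, blocked rc=$RC_BLOCKED, drops so far: '$DROPS_BEFORE_LATE'"
echo "late failure rc=$RC_LATE, drops: '$DROPS_AFTER_LATE'"
```

The plan runs as a separate bash process on purpose: set -e is ignored inside a function or subshell invoked on the left side of ||, so wrapping the plan body in a function and calling it that way would hide the abort-before-arming behavior.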
Changed files (exactly 4, all under /opt/incomex/staging/c1/)
plan/c1-staging-fast-dry-run.plan.sh 9d35608548a91db8717d8221a97f9d0d9b907e4a782cb0e62225304451750153
README.md b4eb6198f3ca884728fcc7d6a2964d1b69fbc0f26bc9ca5bba396d2f0041c74e
ROLLBACK.md fa47c4de14c6aeb9eefb166c34f78c19f8869d3f53d91d3928f8c2d75529daa7
ledger/dot_manage.jsonl a3225aa0d03ab81099b6e58570ec0a30481ac8cf46636de322f8059431bf30a8
Unchanged (byte-identical to R2): all 6 primitive bins, _common.sh, all 6 SQL payloads,
registry/primitives.jsonl, admission/. No registered primitive touched → registry sha256
cross-check still 6/6. No scope drift.
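A byte-identical claim of this kind can be checked mechanically. The sketch below is an assumption about how such a check might look, not the cross-check the self-gate actually runs. It uses throwaway files in a temp directory and GNU coreutils sha256sum, whose manifest format is the digest, two spaces, then the file name.

```shell
#!/usr/bin/env bash
# Sketch only: throwaway files in a temp dir, not the real staging tree.
# Assumes GNU coreutils sha256sum.

DEMO="$(mktemp -d)"
printf 'unchanged payload\n' > "$DEMO/payload.sql"

# Record the baseline digest in sha256sum's manifest format: "<hex>  <file>".
( cd "$DEMO" && sha256sum payload.sql > baseline.sha256 )

# An untouched file verifies cleanly (exit 0).
RC_SAME=0
( cd "$DEMO" && sha256sum --check --quiet baseline.sha256 ) || RC_SAME=$?

# Any byte-level change makes the check fail (non-zero exit).
printf 'drift\n' >> "$DEMO/payload.sql"
RC_DRIFT=0
( cd "$DEMO" && sha256sum --check --quiet baseline.sha256 ) >/dev/null 2>&1 || RC_DRIFT=$?

echo "unchanged: rc=$RC_SAME, modified: rc=$RC_DRIFT"
```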
Re-run of the ENTIRE self-gate after the fix (per §7)
bash -n 8/8 bins + plan OK
shellcheck -S warning CLEAN (bin/* + plan)
injection grep NO_NONCOMMENT_INJECTION_HITS
guard self-tests 14/14 expected exit codes; staging_DBs after = 0
registry sha256 ALL_REGISTRY_SHA256_MATCH (6/6)
ledger 10 entries, all valid JSON
official runtime DATA before == after (db_list_hash unchanged dfc368f6…)
staging_DBs 0
dry-run executed NO ; staging DB created NO
Other observations (NOT blockers)
- P6 orphan sweep covers tables/functions/triggers, not sequences/views; the only sequences present are identity-owned (auto-managed) and no views are created — no false-PASS risk. Noted, not a defect.
- SIGKILL between CREATE DATABASE and trap-arming can leak an orphan that no trap can catch; it is a detectable c1_staging_% DB that the next create's reuse-block surfaces. Recoverable, not a false-PASS. Noted, not a defect.