KB-1763
09 — Final gate decision
Revision 1
c1-staging · claude-r3-self-gate · verdict · ready-for-codex-r3
Verdict
CLAUDE_R3_SELF_GATE_PASS_READY_FOR_EXTERNAL_CODEX_R3
18 attack scenarios — final disposition
| # | attack | result |
|---|---|---|
| A1 | c1_staging_x; rm -rf / reaches shell | refuted (regex; live exit 4) |
| A2 | sandbox_db differs from sandbox_id | refuted (same $SBX; plan verifies both) |
| A3 | plan runs without CODEX_R3_PASS | refuted (gate exit 64) |
| A4 | P1 creates DB then fails before JSON → orphan | refuted (compensating drop / exit 70) |
| A5 | P1 emits JSON before postconditions | refuted (JSON then P1_DONE after all postconds) |
| A6 | P2 fails but plan exits 0 | refuted (cleanup_rc 86/87 force nonzero) |
| A7 | P2 drops official/system DB | refuted (regex+denylist+re-assert; live exit 4) |
| A8 | --force blind-drops active sandbox | refuted (--force disabled; live exit 4) |
| A9 | bad input accepted but P5 marks PASS | refuted (accepted→pass=false) |
| A10 | unexpected exception marks PASS | refuted (no exact match→pass=false) |
| A11 | P6 digest exists without P5 success | refuted (gate before digest; ON_ERROR_STOP) |
| A12 | digest omits harness rows | refuted (harness_md5 + combined_md5) |
| A13 | official table write hidden in SQL | refuted (in-sandbox 42P01; no directus target; before==after) |
| A14 | staging runner writes official registry | refuted (staging JSONL; no dot_tools INSERT) |
| A15 | hardcoded secret/token | refuted (NO_SECRETS_FOUND) |
| A16 | evidence PASS after partial failure | refuted (P6 gate + DRY_RUN_OK only post-P2/count0) |
| A17 | TTL claimed automatic without timer | refuted (README/ROLLBACK: advisory, no timer) |
| A18 | plan hides nonzero exit via trap/echo | refuted (exit matrix + pipefail) |
R3-SELF-1 (the plan could adopt or drop a sandbox it did not create) was confirmed and fixed. It sits at the A4/A7 boundary (cleanup-target safety). All 18 are now refuted/NA.
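The name guard behind A1 and A7 can be illustrated with a minimal sketch. It is not the code in `_common.sh`: the function name, the `c1_staging_` prefix rule, the allowed character set, and the denylist entries are all assumptions, and a POSIX `case` glob stands in for the regex the table mentions. Only the refusal code 4 is taken from the table.

```shell
# Sketch only: the prefix, allowed characters, and denylist entries are
# assumptions, not the contents of the real _common.sh.
LC_ALL=C   # keep the [a-z] range byte-wise so the locale cannot widen it

# Returns 0 for an acceptable sandbox name, 4 otherwise.
validate_sandbox_name() {
  name="$1"
  # Must start with the staging prefix and have a non-empty suffix.
  case "$name" in
    c1_staging_?*) ;;
    *) return 4 ;;
  esac
  # The suffix may contain only lowercase letters, digits, and underscores,
  # so spaces, semicolons, quotes, and newlines are all rejected.
  suffix="${name#c1_staging_}"
  case "$suffix" in
    *[!a-z0-9_]*) return 4 ;;
  esac
  # Denylist as a second, independent layer (entries are illustrative).
  case "$name" in
    postgres|template0|template1|directus) return 4 ;;
  esac
  return 0
}
```

A caller would run `validate_sandbox_name "$SBX" || exit 4` before the value is interpolated into any command, so a string such as `c1_staging_x; rm -rf /` is refused before it can reach a shell or SQL statement.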
Final report
- verdict: CLAUDE_R3_SELF_GATE_PASS_READY_FOR_EXTERNAL_CODEX_R3
- defect found during hostile self-review: YES (1, MEDIUM: R3-SELF-1)
- what was fixed: the plan no longer arms its cleanup/drop target until P1 succeeds, and it refuses a pre-existing candidate up front, so it can never drop a sandbox it did not create.
- patched files + sha256: plan 9d356085…, README b4eb6198…, ROLLBACK fa47c4de…, ledger a3225aa0… (4 files, all under /opt/incomex/staging/c1/; 6 primitives + _common.sh + SQL byte-identical to R2).
- 35-row matrix: 35/35 PASS (file 02).
- 18 attacks: 18/18 refuted (table above).
- official runtime before==after: YES (file 07; db_list_hash dfc368f6… unchanged).
- staging_DBs = 0: YES (snapshots + post-guard-test).
- dry-run executed: NO.
- any staging DB created: NO.
- would Claude reject this patch as Codex? NO — after the R3-SELF-1 fix, every blocker class Codex R1/R2/R3 exercises (injection, sbx propagation, partial cleanup, P2 swallow, drop guard, force provenance, P5 fail-open, P6 false-PASS, official isolation, stamping, TTL honesty) is closed with code + live evidence, and the one residual concern was fixed. No remaining issue rises to a reject.
- ready for external Codex R3: YES.
- ready to run dry-run without Codex: NO.
- ready for promotion: NO.
- ready for production: NO.
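The R3-SELF-1 fix reported above (arm the cleanup target only after P1 succeeds, and refuse a pre-existing candidate first) follows a pattern that can be sketched as below. Every name here is hypothetical: `EXISTING_DBS` is an in-memory stand-in for the server's database list, `db_exists`, `create_db`, and `drop_db` are stubs rather than the real primitives, and the return code 1 is a placeholder, not a value from the exit matrix.

```shell
# Sketch under assumptions: stubs replace the real P1/P2 primitives.
EXISTING_DBS=" "     # stand-in for the server's database list, space-delimited
CLEANUP_TARGET=""    # empty means there is nothing this run may drop

db_exists() { case "$EXISTING_DBS" in *" $1 "*) return 0 ;; *) return 1 ;; esac; }
create_db() { EXISTING_DBS="$EXISTING_DBS$1 "; }
drop_db() {
  db_exists "$1" || return 0
  # Cut " name " out of the list and rejoin the two halves with one space.
  EXISTING_DBS="${EXISTING_DBS%% $1 *} ${EXISTING_DBS#* $1 }"
}

run_plan() {
  sbx="$1"
  CLEANUP_TARGET=""
  # Refuse a candidate that already exists: this run did not create it.
  if db_exists "$sbx"; then
    return 1
  fi
  create_db "$sbx" || return 1
  # Armed only now, after creation succeeded.
  CLEANUP_TARGET="$sbx"
  return 0
}

cleanup() {
  # With no armed target, cleanup is a no-op and cannot touch foreign sandboxes.
  [ -n "$CLEANUP_TARGET" ] || return 0
  drop_db "$CLEANUP_TARGET"
  CLEANUP_TARGET=""
}
```

Because `cleanup` acts only on `CLEANUP_TARGET`, and that variable is set in exactly one place after a successful create, a refused or failed run leaves nothing armed.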
Next step
External Codex R3 review only. Do not run the dry-run until external Codex returns CODEX_PASS_C1_STAGING_READY_FOR_FAST_DRY_RUN.
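A fail-closed gate of this kind might look like the following sketch. The environment variable `CODEX_VERDICT` and the function name are assumptions made for illustration; how the real plan receives the external verdict is not described here. The exit code 64 is the gate code listed for A3.

```shell
# Sketch only: CODEX_VERDICT is a hypothetical variable name.
require_codex_pass() {
  # Exact string comparison: unset, empty, or any other value is refused.
  if [ "${CODEX_VERDICT:-}" != "CODEX_PASS_C1_STAGING_READY_FOR_FAST_DRY_RUN" ]; then
    echo "gate: external Codex verdict missing or not a pass; refusing dry-run" >&2
    return 64
  fi
  return 0
}
```

Calling `require_codex_pass || exit 64` as the first statement of the plan keeps the default outcome a refusal, so nothing runs unless the exact verdict string is supplied.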