KB-6384

FIX7 P0 Production-Readiness Scoping — production-rollback-rehearsal-plan.md

FIX7 P0 — Production Rollback / Rehearsal Plan (DESIGN ONLY, 2026-06-12)

Status: DESIGN ONLY. Not run. No production snapshot, restore, apply, or rehearsal was executed in this lane. This document specifies what a future, separately-authorized production rehearsal MUST do before any production apply. It does not authorize that rehearsal.

Blocker this plan addresses (still OPEN): FIX7-P0-DRYRUN-PROD-ROLLBACK-1 — production snapshot/restore is NOT yet proven.

RB-PROD design (production rollback)

| Step | Requirement |
|---|---|
| Before-state capture | Capture, read-only, the exact production before-state for every surface the apply touches: PG row counts + content hashes for affected tables (b_documents, birth registry, fn_birth_register outputs), Directus collection snapshot, system_issues baseline, and the canonical executor file sha256 (sql/prod/99_run_all.sql). Use the One-Roof read-only invariant entry == exit (query_pg SELECT-only) to prove the capture itself mutated nothing. |
| Mutation boundary | A single explicit BEGIN … COMMIT/ROLLBACK with \set ON_ERROR_STOP on; INSERT-only where possible; no UPDATE/DELETE/ON CONFLICT unless individually justified; no schema/GRANT/index/Directus changes inside the apply. Tier-0 preflight gate must hold (db=directus AND os_proposal_approvals>=1). |
| Rollback trigger | Any of: parity divergence vs expected delta; false_done detected; preflight gate fails; row counts diverge from captured before-state; integrity hash mismatch of the apply file vs the KB-approved package. |
| Rollback command/action | Explicit ROLLBACK; of the open transaction (proven pattern: BIRTH Stage-2.5 executed BEGIN..ROLLBACK, restored, birth count before==after). For already-committed changes: restore affected rows/collections from the before-state snapshot (clone-validated restore script). |
| Verification | After rollback: re-capture state and assert byte/row-exact match to before-state pins; permits/ledger empty; done==0; birth count before==after; canonical executor sha256 unchanged. |
| Forbidden fail-open cases | (1) emitting a production PASS while the gate is blocked; (2) declaring rollback clean without re-capturing and comparing to before-state; (3) leaving an open transaction; (4) restoring from an unverified snapshot; (5) skipping the entry==exit read-only proof; (6) treating "no error" as "rolled back". Each must FAIL the rehearsal. |
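
The capture and verification requirements above boil down to pinning each surface before the apply and comparing the same pins after rollback. The following is a minimal sketch of that idea, not the project's tooling: it uses Python's built-in sqlite3 as an in-memory stand-in for the production PostgreSQL database, and the helper names (table_pin, capture_state, diff_pins) and the executor byte string are hypothetical. It also reads "entry == exit" as "two captures with no writes in between must produce identical pins", which is an assumption about the One-Roof invariant.

```python
import hashlib
import sqlite3


def table_pin(conn, table):
    """Return (row_count, content_hash) for one table using SELECT only."""
    # Table names come from a fixed allow-list in this sketch, never from user input.
    rows = conn.execute(f'SELECT * FROM "{table}"').fetchall()
    digest = hashlib.sha256()
    # Sort the repr of each row so the hash does not depend on scan order.
    for line in sorted(repr(row) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return len(rows), digest.hexdigest()


def capture_state(conn, tables, executor_bytes):
    """Pin row counts, content hashes, and the executor file sha256."""
    pins = {name: table_pin(conn, name) for name in tables}
    pins["executor_sha256"] = hashlib.sha256(executor_bytes).hexdigest()
    return pins


def diff_pins(before, after):
    """List every pinned surface whose value changed; empty means exact match."""
    return sorted(key for key in before.keys() | after.keys()
                  if before.get(key) != after.get(key))


# Hypothetical clone data; only the table name b_documents comes from the plan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b_documents (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO b_documents (body) VALUES ('alpha'), ('beta')")
conn.commit()

executor = b"-- stand-in for the contents of sql/prod/99_run_all.sql\n"
before = capture_state(conn, ["b_documents"], executor)

# Read-only proof (assumed meaning of entry == exit): a second capture matches the first.
assert diff_pins(before, capture_state(conn, ["b_documents"], executor)) == []

# Mutate inside a transaction, observe the divergence, then roll back and re-capture.
conn.execute("INSERT INTO b_documents (body) VALUES ('gamma')")
mid = capture_state(conn, ["b_documents"], executor)
conn.rollback()
after = capture_state(conn, ["b_documents"], executor)

print(diff_pins(before, mid))    # ['b_documents']
print(diff_pins(before, after))  # []
```

Returning the list of diverging surfaces rather than a boolean keeps the verdict explainable: a non-empty list is both a rollback trigger and the evidence for it, and a rollback is declared clean only when the re-captured pins produce an empty list.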

Rehearsal environment (required before production)

  • Run first on an isolated clone DB (One-Roof "clone safety": isolated DB, rollback-only revalidation, harness retained), never on production, to prove snapshot→apply→rollback→restore is byte/row-exact.
  • Only after a clone rehearsal PASS, and only with production OPT-4 + a distinct production-rollback authorization, may a production rehearsal be considered.
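
A clone rehearsal of this kind can be sketched as a single function that checks the gate, applies inside one explicit transaction, always rolls back, and only then re-captures state to decide the verdict. This is an illustrative sketch under stated assumptions, not the authorized harness: sqlite3 stands in for the isolated clone DB, the function names (preflight_gate, row_count, rehearse, make_clone) and the APPLY statement are hypothetical, and a plain row count stands in for the full before-state pins. The gate condition mirrors the one stated in the plan (db=directus AND os_proposal_approvals>=1).

```python
import sqlite3


def preflight_gate(conn, db_name):
    """Tier-0 gate: database must be 'directus' with at least one approval row."""
    approvals = conn.execute("SELECT COUNT(*) FROM os_proposal_approvals").fetchone()[0]
    return db_name == "directus" and approvals >= 1


def row_count(conn):
    """Simplified before/after pin: number of rows in b_documents."""
    return conn.execute("SELECT COUNT(*) FROM b_documents").fetchone()[0]


def rehearse(conn, db_name, apply_sql):
    """Gate -> apply -> rollback -> re-capture. Returns 'PASS' or 'FAIL: reason'."""
    if not preflight_gate(conn, db_name):
        return "FAIL: preflight gate blocked"
    before = row_count(conn)
    try:
        conn.execute("BEGIN")
        conn.execute(apply_sql)
    except sqlite3.Error as exc:
        return f"FAIL: apply error ({exc})"
    finally:
        # Rollback-only rehearsal: never commit, and never leave a transaction open.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
    if conn.in_transaction:
        return "FAIL: open transaction left behind"
    # "No error" is not proof of rollback: re-capture and compare before passing.
    if row_count(conn) != before:
        return "FAIL: after-state differs from before-state"
    return "PASS"


def make_clone(approvals):
    """Build a throwaway in-memory clone with the given number of approval rows."""
    # isolation_level=None puts sqlite3 in autocommit mode so BEGIN/ROLLBACK are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE b_documents (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE os_proposal_approvals (id INTEGER PRIMARY KEY)")
    conn.execute("INSERT INTO b_documents (body) VALUES ('alpha')")
    for _ in range(approvals):
        conn.execute("INSERT INTO os_proposal_approvals DEFAULT VALUES")
    return conn


APPLY = "INSERT INTO b_documents (body) VALUES ('rehearsal row')"
print(rehearse(make_clone(1), "directus", APPLY))  # PASS
print(rehearse(make_clone(0), "directus", APPLY))  # FAIL: preflight gate blocked
print(rehearse(make_clone(1), "scratch", APPLY))   # FAIL: preflight gate blocked
```

Every exit path other than the final one returns a FAIL string, so a blocked gate, an apply error, a lingering transaction, or an unverified after-state can never surface as a PASS.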

What is NOT covered / still required

  • Actual snapshot/restore tooling for production Directus PG is not built or proven here.
  • This plan exists on paper only; FIX7-P0-DRYRUN-PROD-ROLLBACK-1 remains OPEN until a rehearsal PASS exists.