KB-6DE5

dot-iu-cutter v0.4 PG-backed Dry-run — EXECUTION FAIL: RealPostgresAdapter txn/pool defect (2026-05-17)

Revision 1
dot-iu-cutter · v0.4 · db-adapter · dry-run · execution-fail · adapter-defect · dieu44

dot-iu-cutter v0.4 — PG-backed Dry-run: EXECUTION FAIL (honest)

Date: 2026-05-17 · execution_status: FAIL (honest). The PG-backed dry-run found a real defect in the accepted code. The orchestrator stopped at the failing gate, tore down only the isolated env (matched by exact name), did not improvise, and did not mark PASS. Production untouched.

1. Outcome

The PG-backed dry-run of the accepted code (commit 84c52c57aa296de921998910d85b0d4a85ad0746, FINAL CODE PASS) provisioned a faithful isolated restore and ran the scenarios. S1, S2 and S3 (negative) passed. S4 MARK failed at the first real phase transaction with this error, raised through psycopg3:

SET TRANSACTION ISOLATION LEVEL must be called before any query

Gate G_HARNESS_HAPPY failed ⇒ STOP ⇒ teardown ⇒ honest FAIL. The r3 15-row baseline was never reached (no scenario past S3 ran; no row written).

2. Root cause — accepted RealPostgresAdapter transaction/pool defect (NOT harness / NOT r3 / NOT schema-binding)

cutter_agent/phases.py calls adapter.find(...) outside a transaction, before entering with adapter.transaction(). This is the accepted call pattern: it is present in the original phases.py and was preserved unchanged by the schema-binding rebind. The relevant code is in cutter_agent/db_adapter.py, which the schema-binding cycle did not touch (identical at 56d3732 and 84c52c5):

  • RealPostgresAdapter.find() when not in a txn does conn = self._pool.acquire(); conn.execute("SELECT …") and returns the connection to _ConnPool without commit/rollback.
  • _default_provider sets conn.autocommit = False, so that pre-txn SELECT opens a transaction that stays open on the pooled connection.
  • The next phase's transaction()._begin re-acquires the same pooled connection (pool max 1) and issues SET TRANSACTION ISOLATION LEVEL <level> as its first statement. The statement is rejected (the error surfaces through psycopg3) because the transaction opened by the earlier find() SELECT is still open and has already run a query.

This is a genuine runtime transaction-lifecycle defect in the accepted adapter (the RealPostgresAdapter.find() / _ConnPool / autocommit=False interaction). It is latent against the unit suite because the accepted tests inject FakeConn, which does not model the rule that SET TRANSACTION ISOLATION LEVEL must precede any query in a transaction, so 101/101 stayed green. A real psycopg3 connection is the first context that exposes it. This is precisely the class of latent gap the PG-backed dry-run exists to catch.
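
The failing sequence can be sketched without a database. The following is a minimal simulation, not the accepted code: StrictFakeConn, OneConnPool, find_outside_txn, begin and the SERIALIZABLE level are all hypothetical names and values chosen for illustration, and the fake models only the one rule that the accepted FakeConn is reported to omit.

```python
class ActiveSqlTransaction(Exception):
    """Local stand-in for the driver-level error (hypothetical, not psycopg's class)."""


class StrictFakeConn:
    """Fake connection that enforces a single rule: once a query has run in an
    open transaction, SET TRANSACTION ISOLATION LEVEL is rejected."""

    def __init__(self, autocommit=False):
        self.autocommit = autocommit
        self.in_txn = False
        self.queried_in_txn = False

    def execute(self, sql):
        is_set_iso = sql.lstrip().upper().startswith("SET TRANSACTION ISOLATION LEVEL")
        if is_set_iso and self.in_txn and self.queried_in_txn:
            raise ActiveSqlTransaction(
                "SET TRANSACTION ISOLATION LEVEL must be called before any query"
            )
        if not self.autocommit:
            self.in_txn = True  # with autocommit off, the first statement opens a txn
            if not is_set_iso:
                self.queried_in_txn = True

    def commit(self):
        self.in_txn = False
        self.queried_in_txn = False

    def rollback(self):
        self.in_txn = False
        self.queried_in_txn = False


class OneConnPool:
    """Pool of size 1: always hands out the same connection and does not
    commit or roll back when the connection is returned."""

    def __init__(self, conn):
        self._conn = conn

    def acquire(self):
        return self._conn

    def release(self, conn):
        pass  # assumption: mirrors "returned without commit/rollback"


def find_outside_txn(pool):
    conn = pool.acquire()
    conn.execute("SELECT 1")  # opens an implicit transaction
    pool.release(conn)        # transaction is still open on the pooled conn


def begin(pool, level="SERIALIZABLE"):
    conn = pool.acquire()
    conn.execute(f"SET TRANSACTION ISOLATION LEVEL {level}")
    return conn


def reproduce():
    pool = OneConnPool(StrictFakeConn())
    find_outside_txn(pool)
    try:
        begin(pool)
    except ActiveSqlTransaction as exc:
        return str(exc)
    return None
```

Calling reproduce() returns the same message seen at S4, whereas begin() on a fresh connection, or on one that was rolled back after the read, goes through. A fake of this kind is one way a unit suite could surface the find-before-transaction pattern without a live server.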

It is not:

  • a harness defect: the harness drives the accepted adapter via its public API exactly as phases.py does, and the tracing wrapper faithfully delegates execute/commit/rollback to the real psycopg3 connection, equivalent to _default_provider;
  • an r3/verification-spec defect: counting was never reached;
  • a schema-binding defect: the rebind preserved the original find-before-transaction structure, and the defect lives in the untouched db_adapter.py.

3. Required output

| Field | Value |
|---|---|
| execution_status | FAIL (honest; accepted-adapter txn/pool defect) |
| dry-run env name / sysid | pg-dry-run-v0.4-db-adapter-2026-05-17 / DR system_identifier = 7640730715875692588 |
| production sysid pre/post | 7611578671664259111 / 7611578671664259111 (unchanged) |
| backup path + sha | …/.dryrun-v0.4-2026-05-17/prod-directus-20260517T053712Z.sql · sha256 6c1a18f2a8510eafb64b15e66807678140822ca90afa6088467a24d139ea4c77 · 674,099,794 bytes (read-only pg_dump -U workflow_admin; plain-format ⇒ sha differs per run by design) |
| SQL/harness artefact shas | roles+matrix 2a409696dc3f60cb6328a77afd345e7638685f8d70cb5c0995b40f5841a57584 · harness ddf14a94438a6b8ed621329d2f3b62ca7da2b58724d6fd363136a0f1c8d3aa96 · orchestrator 0c7b2a62b489bf6cfed968e53ea633fc8e98e6e40f4e3ea9971537c9b432e9bc |
| psycopg3 install scope | ephemeral python:3.12-slim --rm container only (pip install "psycopg[binary]>=3.1,<3.4"); never host/incomex-*; gone with the container |
| scenario S1→S12 | S1 PASS (config, no connect) · S2 PASS (ConfigMissing names key, no socket) · S3-neg PASS (current_user≠principal ⇒ ConfigMismatch, no write) · S4 MARK FAIL (adapter txn/pool defect) · SWEEP/S5/S6/S7/S8/S9/S10/S11/S12 not reached |
| 12-table matrix | not measured; failed before any write (no MARK row) |
| final 15-row baseline | not reached (FAIL before baseline) |
| negative Δ0 | not reached (S1–S3 wrote nothing by construction; SQL trace verbs = {SELECT, SET} only ⇒ zero writes, zero forbidden surface) |
| teardown | DR container + network removed exact-name (docker rm -f -v / network rm); DR_GONE confirmed; secrets (dr.env, dr.harness.env) shredded |
| protected envs untouched | pg-dry-run-v0.2-p0-2-2026-05-16, pg-dry-run-v0.2-phase-alpha-2026-05-16, pg-dry-run-hb05-2026-05-15 all running, Id+StartedAt unchanged (before==after) |
| production untouched | ✓ only read-only pg_dump; prod postgres StartedAt 2026-04-17T05:35:18.48439927Z unchanged; prod sysid 7611578671664259111 pre==post; no adapter/secret/.env/CUT/VERIFY |
| no forbidden SQL | ✓ harness SQL trace verbs = {SELECT, SET} only; no DELETE/TRUNCATE/DROP/ALTER/GRANT/CREATE/COPY; no write reached |
| no secret leakage | ✓ generated dry-run-only passwords (openssl), dr.env 0600, shredded at teardown; orchestrator/harness never echo secrets; logs scanned |
| git SSOT | branch main; HEAD 84c52c57aa296de921998910d85b0d4a85ad0746 (unchanged); git status --short -- iu-cutter clean (throwaway WD excluded via .git/info/exclude, not git-added); no code change, no commit this cycle |
| hardcode control | no literal IP/DSN/password/container/vector collection introduced; connection via dry-run-only env; no hardcode finding |
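
The "no forbidden SQL" row rests on reducing the harness SQL trace to its set of leading verbs. A minimal sketch of such a check follows. The function names and the trace format (a list of statement strings) are assumptions for illustration; the forbidden list is the one quoted in the table. The harness's actual implementation is not shown in this report.

```python
# Forbidden verbs as listed in the "no forbidden SQL" row.
FORBIDDEN_VERBS = {"DELETE", "TRUNCATE", "DROP", "ALTER", "GRANT", "CREATE", "COPY"}


def trace_verbs(statements):
    """Return the set of leading SQL keywords found in a trace."""
    verbs = set()
    for statement in statements:
        stripped = statement.strip()
        if stripped:
            verbs.add(stripped.split(None, 1)[0].upper())
    return verbs


def forbidden_in_trace(statements):
    """Return the forbidden verbs that appear as a leading keyword."""
    return trace_verbs(statements) & FORBIDDEN_VERBS


# Hypothetical trace shaped like the one described for S1 to S4.
example_trace = [
    "SELECT current_user",
    "set transaction isolation level serializable",
]
```

A leading-keyword check is deliberately coarse: it would not see a write hidden inside a WITH clause or a function call, so it complements rather than replaces role-level restrictions in the isolated env.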

4. Honest-FAIL discipline

Per the gate rules: STOP (done); tear down the exact dry-run env only (done); do not improvise; report an honest FAIL; do not mark PASS. On improvisation: I did not patch the harness to commit/rollback after find() to paper over the defect, and I did not modify db_adapter.py. That is a code change forbidden in this dry-run cycle and would mask a real defect.

Three earlier interim STOPs in this run were orchestrator-artefact self-check bugs (G_DIRTY WD-path; G07 exposed-vs-published port semantics; a set -u unbound $PORTS). Each was corrected in the throwaway tooling, with prod/protected untouched and any isolated env removed exact-name. They were tooling false-negatives, distinct from this substantive S4 adapter defect, which is a true accepted-code finding and is not worked around.

5. Remediation (NOT self-applied — needs GPT authorisation)

The fix lives in cutter_agent/db_adapter.py, the file this cycle was told to avoid; a new, separate GPT-gated code-authoring cycle is therefore required. Candidate fixes (for that cycle's design):

  • out-of-transaction reads use an autocommit/read-only connection; or
  • find() commits or rolls back the implicit txn before returning the conn to _ConnPool; or
  • _begin issues ROLLBACK (or checks conn.info.transaction_status) before SET TRANSACTION ISOLATION LEVEL.

Also add a real-psycopg integration test (or a FakeConn that enforces the isolation-before-query rule) so the unit suite catches this. Then re-author the PG-backed dry-run. No self-advance.
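
As input for that cycle's design only, the second candidate (ending the implicit transaction before the connection returns to the pool) might look roughly like the sketch below. It was not applied in this cycle. The helper read_outside_txn, the pool's release() method and the RecordingConn stub are hypothetical; the only connection call relied on is rollback(), which psycopg3 connections provide.

```python
from contextlib import contextmanager


@contextmanager
def read_outside_txn(pool):
    """Lend a pooled connection for an out-of-transaction read and end the
    implicit transaction before the connection goes back to the pool."""
    conn = pool.acquire()
    try:
        yield conn
    finally:
        conn.rollback()     # read-only work: nothing to keep
        pool.release(conn)  # assumption: the pool exposes a release step


class RecordingConn:
    """Stub connection that records statements and tracks an open txn."""

    def __init__(self):
        self.in_txn = False
        self.log = []

    def execute(self, sql):
        self.in_txn = True
        self.log.append(sql)

    def rollback(self):
        self.in_txn = False
        self.log.append("ROLLBACK")


class OneConnPool:
    """Single-connection pool stub used only for this demonstration."""

    def __init__(self, conn):
        self._conn = conn

    def acquire(self):
        return self._conn

    def release(self, conn):
        pass


def demo():
    conn = RecordingConn()
    pool = OneConnPool(conn)
    with read_outside_txn(pool) as c:
        c.execute("SELECT 1")
    return conn.in_txn, conn.log
```

With this shape the connection is idle when the next phase acquires it, so a following SET TRANSACTION ISOLATION LEVEL would be the first statement of a new transaction. Whether this, the autocommit read path, or a guard in _begin is preferable is a decision for the gated cycle.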

6. Next gate

GPT review of this honest-FAIL report and adjudication of the db_adapter.py remediation cycle. PG-backed dry-run remains blocked/failed until the adapter defect is fixed under a separate GPT-gated cycle and re-authorised. No production action performed; no PASS claimed.

knowledge/dev/laws/dieu44-trien-khai/v0.4-db-adapter-dry-run/dot-iu-cutter-v0.4-pg-backed-dry-run-EXECUTION-FAIL-adapter-txn-defect-2026-05-17.md