KB-7967

dot-iu-cutter v0.5 — S2E Evidence Summary: S2 report VERIFIED ACCURATE; full-suite failure is pre-existing baseline (2026-05-19)

Revision 1


S2E Evidence Summary

Goal: verify whether the S2 report is accurate and whether the full-suite failure is truly pre-existing baseline behavior. Evidence-only. 2026-05-19.

Answers to the 7 required evidence questions

1_targeted_suite_passes:            YES — tests.test_cutplan_snapshot: Ran 15, OK, exit 0.
2_full_suite_fails_with_S2_present: YES — discover: Ran 128, FAILED (failures=1), exit 1.
3_full_suite_also_fails_S2_removed: YES — pure baseline afb7bfc (2 S2 files moved aside):
                                    discover: Ran 113, FAILED (failures=1), exit 1.
4_failure_location_identical:       YES — both: test_security_boundaries
   .TestNoSecretPrinted.test_source_has_no_hardcoded_dsn_or_secret, file
   tests/test_security_boundaries.py line 118, AssertionError assertNotIn('PGPASSWORD',
   <module source>). Only the first offending file that rglob encounters differs:
   dryrun.py at baseline (the RATIFIED committed MARK entrypoint), cutplan.py when S2 is
   present. The guard pattern and the defect class are the same in both runs.
5_only_2_untracked_files_present:   YES — git ls-files --others --exclude-standard =
   cutter_agent/cutplan.py + tests/test_cutplan_snapshot.py only; git status --short shows
   exactly those two as ?? ; git diff --stat empty.
6_anything_committed_or_modified:   NO — HEAD afb7bfcc9b7bbb953bb00159479c9611e6ac4bd1
   unchanged throughout; zero tracked-file modifications; restoration byte-exact (sha256
   identical pre/post: cutplan.py 548eabc5… , test 06e871e7…).
7_test_count_delta_consistent:      YES — 128 - 113 = 15 = exactly the new cutplan suite;
   nothing else changed in collection.
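
The move-aside and byte-exact restoration described in items 3 and 6 can be sketched as follows. This is a minimal illustration, not the procedure actually used for this evidence run: the helper names sha256_of and move_aside_and_restore, and the idea of passing the baseline run as a callback, are assumptions introduced here.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def move_aside_and_restore(files, run_baseline):
    """Hypothetical helper: set files aside, run a callback, restore, verify.

    Returns the callback's result. Raises RuntimeError if any restored
    file's digest differs from the digest recorded before the move.
    """
    files = [Path(f) for f in files]
    before = {f: sha256_of(f) for f in files}
    with tempfile.TemporaryDirectory() as holding:
        parked = {}
        for i, f in enumerate(files):
            # Index prefix avoids collisions between same-named files.
            dest = Path(holding) / f"{i}_{f.name}"
            shutil.move(str(f), str(dest))
            parked[f] = dest
        try:
            result = run_baseline()
        finally:
            # Always put the files back, even if the callback raised.
            for f, dest in parked.items():
                shutil.move(str(dest), str(f))
    after = {f: sha256_of(f) for f in files}
    if before != after:
        raise RuntimeError("restoration is not byte-exact")
    return result
```

In use, run_baseline would be whatever invokes the full suite while the two untracked files are absent. Comparing digests taken before and after is what supports a "byte-exact" claim; a check of file existence alone would not.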

Verdict on S2 report accuracy

s2_report_claim__targeted_suite_15_15_green:        CONFIRMED (15/15 OK).
s2_report_claim__full_suite_red_with_S2:            CONFIRMED (128, 1 failure).
s2_report_claim__failure_pre_existing_at_baseline:  CONFIRMED — with S2 removed the
   IDENTICAL test at the IDENTICAL file:line with the IDENTICAL assertion still FAILS,
   pointing at the ratified, already-committed cutter_agent/dryrun.py (afb7bfc).
s2_report_claim__no_regression_introduced_by_S2:    CONFIRMED — the failure class exists
   without S2 at all; cutplan.py merely mirrors dryrun.py's ratified DB-env guard pattern
   (an env-var NAME the code refuses to read — not a hardcoded credential/secret value).
s2_report_claim__nothing_committed_or_mutated:      CONFIRMED — HEAD unchanged, no tracked
   diffs, only 2 untracked files, byte-exact restoration.
overall: THE S2 REPORT IS ACCURATE. The full-suite single failure is genuinely
   pre-existing baseline behavior, not an S2 regression.
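
The reasoning behind the verdict (same failing test with and without S2, and a test-count delta equal to the new suite) can be expressed as a small check. This is a sketch under stated assumptions: RunOutcome and is_pre_existing are names invented for illustration, and comparing test IDs alone is weaker than the evidence above, which also matched file, line, and assertion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunOutcome:
    """Summary of one unittest run (illustrative structure)."""
    tests_run: int
    failing_test_ids: frozenset


def is_pre_existing(with_change: RunOutcome, baseline: RunOutcome,
                    new_tests: int) -> bool:
    """True when the change adds only its own tests and no new failures."""
    # Every failure seen with the change must already fail at baseline.
    same_failures = with_change.failing_test_ids <= baseline.failing_test_ids
    # The collection delta must be exactly the size of the new suite.
    count_ok = with_change.tests_run - baseline.tests_run == new_tests
    return same_failures and count_ok


FAILING = frozenset({
    "test_security_boundaries.TestNoSecretPrinted"
    ".test_source_has_no_hardcoded_dsn_or_secret"
})
# Counts taken from the evidence answers: 128 with S2, 113 at baseline.
with_s2 = RunOutcome(tests_run=128, failing_test_ids=FAILING)
baseline = RunOutcome(tests_run=113, failing_test_ids=FAILING)
print(is_pre_existing(with_s2, baseline, new_tests=15))  # prints True
```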

Nature of the pre-existing defect (factual, no remediation performed)

test: tests/test_security_boundaries.py:118 asserts that no non-test *.py file contains
  the libpq password env-var name as a literal substring.
reality: the ratified MARK entrypoint cutter_agent/dryrun.py:474 lists that env-var NAME
  inside a fail-closed DB-env refusal guard tuple — it is the name of a variable the code
  REFUSES to read, never a stored secret/value. The substring heuristic is over-broad
  and pre-dates S2; the security test was outside the MARK CI gate of record
  (tests.test_dryrun_snapshot_mark, 21/21), so the failure shipped latent at afb7bfc.
no_action_taken: did NOT edit dryrun.py, cutplan.py, or the security test; did NOT skip or
  weaken anything; no fix-to-green. Evidence-only, as scoped.
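
Why a substring heuristic flags a refusal guard can be shown with a small stand-in. The sketch below is an assumption-laden illustration, not the code of tests/test_security_boundaries.py or cutter_agent/dryrun.py: offending_files, GUARD_SOURCE, and the guard's exact shape are invented here, and the results are sorted for determinism, whereas the real test reports whichever file rglob yields first.

```python
import tempfile
from pathlib import Path

# Split so that this example file would not trip the same scan itself.
FORBIDDEN = "PG" + "PASSWORD"


def offending_files(root: Path, needle: str = FORBIDDEN):
    """Return non-test *.py files under root whose source contains needle."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        if path.name.startswith("test_"):
            continue  # test modules are exempt from the scan
        if needle in path.read_text(encoding="utf-8"):
            hits.append(path.relative_to(root).as_posix())
    return hits


# A fail-closed guard that only NAMES the variable and never reads its
# value. The module content is hypothetical.
GUARD_SOURCE = (
    "import os\n"
    f"REFUSED_ENV = ('{FORBIDDEN}',)\n"
    "def check_env():\n"
    "    present = [n for n in REFUSED_ENV if n in os.environ]\n"
    "    if present:\n"
    "        raise SystemExit('refusing to run with DB env set')\n"
)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cutter_agent").mkdir()
    (root / "cutter_agent" / "dryrun.py").write_text(GUARD_SOURCE, encoding="utf-8")
    print(offending_files(root))  # prints ['cutter_agent/dryrun.py']
```

The scan cannot distinguish a name used in a refusal list from a stored credential, which is the over-breadth noted above. Nothing here proposes a change to the real test.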

Nothing edited, committed, executed against production, or self-advanced.