KB-12B9

Acceptance Test Matrix rev3 — Implementation Package DOT v0.1 (no green verdict; expanded bypass-path negative tests; PG read-only + discoverability + local-last; design only, 2026-06-09)

Revision 1
Tags: tool-kiem-thu, implementation-package-dot, acceptance-test-matrix, rev3, article-14, no-fake-green, capability-test, bypass-paths, pg-read-only, discoverability, local-last, design-only, 2026-06-09

Acceptance Test Matrix (rev3) — Implementation Package DOT v0.1

Nature: the fail-closed acceptance tests for the future read/report-only MVP, repaired after the Codex re-seal to cover the four returned blocker classes: taxonomy authority (no green verdict, no shadow SSOT), structural no-run/no-write feasibility, FIX7 artifact discoverability, and expanded bypass-path negative tests. Uses the rev3 verdict model (no READ_LEVEL_ACCEPTABLE, no exit 0). Counts appear only inside fixtures named as fixtures; no literal count is a production invariant. KB-first / PG-first / native-driven / local-last.

Date: 2026-06-09

Supersedes: designs/acceptance-test-matrix-implementation-package-dot-v0-1-rev2-2026-06-09.md (rev2), retained for trace.

Status: ACCEPTANCE_MATRIX_v0_1_REV3_READY_FOR_CODEX. Design only; no test is executed here; no command runs.

Production mutation: NO.

writes_performed: KB design docs only.

Governing authority: rev3 Gap-only Scope Spec (§2 triage-only, §3 chain, §4 verdicts, §11 exit, §12 capability) + MVP plan rev3 gates G1–G11.

1. Verdict legend (rev3)

  • Final dossier verdicts: READ_LEVEL_FAIL / BLOCKED / UNVERIFIED (READ_LEVEL_ACCEPTABLE removed).
  • Per-claim verdicts: NO_READ_LEVEL_DEFECT_FOUND (NON_AUTHORITATIVE) / EVIDENCE_INSUFFICIENT / EVIDENCE_CONFLICTING / BLOCKED_BY_NO_CALL_CONTRACT / BLOCKED_BY_UNVERIFIED_SOURCE.
  • Flags: FLAG_PROSE_ONLY_PASS / FLAG_HARDCODED_DENOMINATOR / FLAG_AUTHORITY_VIOLATION / FLAG_LOCAL_FIRST_AUTHORITY.
  • article14_status ∈ {NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS, NOT_PROVEN_EXECUTION_UNVERIFIED}.
  • triage_outcome ∈ {BLOCKING_FINDINGS, NO_BLOCKING_FINDING_BUT_UNCERTIFIED}.
  • Removed everywhere: READ_REPORT_PASS, READ_LEVEL_ACCEPTABLE, positive EVIDENCE_PRESENT, and exit 0.

Fixture discipline: every fixture is tagged FIXTURE with as_of example values; a fixture value is never a production invariant. Tests compare surface role / match key / population / provenance / separation / verdict / governed-surface, never literal counts.
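The verdict vocabulary above can be sketched as closed enumerations, which makes the "no green verdict" rule checkable by a static scan. This is a minimal illustration only: the class names and the `removed_tokens_present` helper are hypothetical, and treating every `EVIDENCE_PRESENT` member as removed (rather than only its positive use) is an assumption of the sketch.

```python
from enum import Enum


class DossierVerdict(Enum):
    # Final dossier verdicts; there is deliberately no positive member.
    READ_LEVEL_FAIL = "READ_LEVEL_FAIL"
    BLOCKED = "BLOCKED"
    UNVERIFIED = "UNVERIFIED"


class ClaimVerdict(Enum):
    NO_READ_LEVEL_DEFECT_FOUND = "NO_READ_LEVEL_DEFECT_FOUND"  # NON_AUTHORITATIVE
    EVIDENCE_INSUFFICIENT = "EVIDENCE_INSUFFICIENT"
    EVIDENCE_CONFLICTING = "EVIDENCE_CONFLICTING"
    BLOCKED_BY_NO_CALL_CONTRACT = "BLOCKED_BY_NO_CALL_CONTRACT"
    BLOCKED_BY_UNVERIFIED_SOURCE = "BLOCKED_BY_UNVERIFIED_SOURCE"


class Article14Status(Enum):
    NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS = "NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS"
    NOT_PROVEN_EXECUTION_UNVERIFIED = "NOT_PROVEN_EXECUTION_UNVERIFIED"


class TriageOutcome(Enum):
    BLOCKING_FINDINGS = "BLOCKING_FINDINGS"
    NO_BLOCKING_FINDING_BUT_UNCERTIFIED = "NO_BLOCKING_FINDING_BUT_UNCERTIFIED"


# Tokens the rev3 model removed (assumption: any EVIDENCE_PRESENT member counts).
REMOVED_TOKENS = frozenset({
    "READ_REPORT_PASS",
    "READ_LEVEL_ACCEPTABLE",
    "EVIDENCE_PRESENT",
    "EVIDENCE_SUFFICIENT_FOR_READ_LEVEL",
})


def removed_tokens_present(*enums):
    """Return the removed tokens that appear as member names of the given enums."""
    names = {member.name for enum_cls in enums for member in enum_cls}
    return names & REMOVED_TOKENS
```

A static check in this style returns an empty set for the vocabulary above and a non-empty set as soon as any enum reintroduces a removed token.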

2. Core matrix (verdict + article14 + criterion)

| # | Test | Input condition (FIXTURE) | Expected verdict + article14 | Pass/fail criterion |
|---|---|---|---|---|
| 1 | No green verdict exists | static scan of outputs/enums | n/a | PASS iff READ_LEVEL_ACCEPTABLE and exit 0 never appear in any output/enum/path; strongest result observed is UNVERIFIED (rev3 §2/§11) |
| 2 | Any execution claim caps the dossier | FIXTURE: 1 execution claim + otherwise clean | UNVERIFIED + NOT_PROVEN | PASS iff no state above UNVERIFIED is reachable; article14 forced NOT_PROVEN |
| 3 | Removed tokens absent | static scan | n/a | PASS iff READ_REPORT_PASS, positive EVIDENCE_PRESENT, EVIDENCE_SUFFICIENT_FOR_READ_LEVEL, READ_LEVEL_ACCEPTABLE never appear |
| 4 | Executable claim without governed existence evidence | FIXTURE: "canonicalizer exists/runs" + no resolvable governed artifact | READ_LEVEL_FAIL/UNVERIFIED + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; existence sub-verdict BLOCKED_BY_UNVERIFIED_SOURCE; never "exists/ran" |
| 5 | Selftest PASS with no run ledger/exit/log | FIXTURE: "selftest 22/22 PASS" + no LOG/EXIT_CODE/RUN_LEDGER | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT run selftest |
| 6 | Hash claim without pinned hash evidence | FIXTURE: "reproduces hash <example>" + no HASH_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT recompute |
| 7 | Exit-code claim without exit-code evidence | FIXTURE: "exit 0" + no EXIT_CODE_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; never assume exit 0 |
| 8 | Command string without call contract | FIXTURE: "command X ran safely" + no Call Contract | UNVERIFIED/READ_LEVEL_FAIL + NOT_PROVEN | PASS iff BLOCKED_BY_NO_CALL_CONTRACT + routed to Call Contract; tool makes no call |
| 9 | Collapsed denominator (no >=2 rule) | FIXTURE: report cites one canonical DOT number | BLOCKED | PASS iff blocked; criterion = "all relevant denominators distinct + provenanced; none collapsed"; no numeric minimum |
| 10 | TAC/IU chosen instead of dual-report | FIXTURE: dossier joins/chooses a corpus | BLOCKED | PASS iff blocked; criterion = distinct surfaces + independent provenance + joined==false; no literal corpus count |
| 11 | Reconciliation: diagnostic must not override canonical | FIXTURE: name-keyed diagnostic presented as overriding canonical | READ_LEVEL_FAIL | PASS iff canonical.match_key != diagnostic.match_key, both shown, diagnostic non-overriding; no literal 41/4 |
| 12 | Stale/unverified source used as denominator | FIXTURE: actual_count external-sync / local checkout used as denominator | UNVERIFIED | PASS iff BLOCKED_BY_UNVERIFIED_SOURCE + held out + marked stale; never a denominator |
| 13 | Prose-only PASS | FIXTURE: prose asserts success, no evidence artifact | READ_LEVEL_FAIL | PASS iff FLAG_PROSE_ONLY_PASS; tool never re-asserts |
| 14 | Dead-link/coverage over-claim | FIXTURE: "all references resolved" | UNVERIFIED | PASS iff coverage==ADVISORY_UNVERIFIED; no resolver-completeness claim |
| 15 | Contract status over-claimed | FIXTURE: treats a READY_FOR_GPT_REVIEW contract as binding | READ_LEVEL_FAIL | PASS iff FLAG_AUTHORITY_VIOLATION (status assessed at recorded value only) |
| 16 | Evidence artifacts contradict | FIXTURE: two records, exit 0 vs exit 2 | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_CONFLICTING; reports the conflict set, never picks |
| 17 | Claim extractor misses high-risk prose | FIXTURE: executable claim hidden in an unparsed region | UNVERIFIED | PASS iff region in UNPARSED_REGION[] (high) → completeness UNVERIFIED → manual review |
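Rows 4 to 7 and row 16 share one decision shape: a claim needs evidence of a specific kind, missing evidence yields EVIDENCE_INSUFFICIENT, and contradictory evidence yields EVIDENCE_CONFLICTING with the whole conflict set reported. The sketch below illustrates that shape under stated assumptions: `EvidenceRecord`, its field names, and `evaluate_claim` are hypothetical and do not come from the rev3 spec. The function only reads recorded values; it never runs, recomputes, or picks a winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    kind: str    # e.g. "EXIT_CODE_EVIDENCE", "HASH_EVIDENCE", "RUN_LEDGER"
    value: str   # recorded value, compared as text only
    source: str  # governed surface the record was read from (hypothetical field)


def evaluate_claim(required_kind, records):
    """Return (per-claim verdict, sorted distinct recorded values)."""
    matching = [r for r in records if r.kind == required_kind]
    if not matching:
        # Nothing of the required kind resolves: fail closed.
        return "EVIDENCE_INSUFFICIENT", []
    values = sorted({r.value for r in matching})
    if len(values) > 1:
        # Report every conflicting value; never choose between them.
        return "EVIDENCE_CONFLICTING", values
    # Strongest per-claim result available, and it stays NON_AUTHORITATIVE.
    return "NO_READ_LEVEL_DEFECT_FOUND", values
```

For the row 16 fixture (two exit-code records, `0` and `2`), this returns `("EVIDENCE_CONFLICTING", ["0", "2"])`; with no exit-code record at all it returns `("EVIDENCE_INSUFFICIENT", [])`.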

3. Taxonomy-authority / shadow-SSOT tests (rev3 — Codex B-1)

| # | Test | Input condition (FIXTURE) | Expected | Pass/fail criterion |
|---|---|---|---|---|
| 18 | Inspector taxonomy treated as authority | FIXTURE: a downstream consumer cites the inspector's classifier/verdict as governing truth | FLAG_AUTHORITY_VIOLATION ⇒ FAIL | PASS iff the taxonomy carries PROVISIONAL_NON_AUTHORITY + version + source, never certifies truth, and any "authority" use is flagged (F20) |
| 19 | Positive verdict attempted without governed taxonomy source | FIXTURE: a build/output attempts READ_LEVEL_ACCEPTABLE or exit 0 | build/output rejected | PASS iff no positive verdict / exit 0 is emittable; promotion requires a separate sealed authority contract |

4. PG read-only guard tests (rev3 — Codex B-2 / Track 4)

| # | Test | Input condition (FIXTURE) | Expected | Pass/fail criterion |
|---|---|---|---|---|
| 20 | Connected role is read-only, verified | in report runtime | n/a | PASS iff read_access_provenance.connected_role ∈ verified read-only set (context_pack_readonly) and txn_read_only=on are recorded; else BLOCKED |
| 21 | PG write through the read client | FIXTURE: INSERT/UPDATE/DELETE/DDL/CALL submitted | refused | PASS iff server-refused by role + read-only transaction + AST validation; exit 3 |
| 22 | Multi-statement SQL | FIXTURE: two statements in one call | rejected | PASS iff the statement classifier/gateway rejects it |
| 23 | SELECT side-effect function | FIXTURE: a function with side effects, not read-only-allowlisted | rejected | PASS iff rejected (no read-only allowlist entry) |
| 24 | Unverifiable read-only role | FIXTURE: connected role not in verified set | BLOCKED | PASS iff fail-closed before any read (F16) |
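The client-side layer of tests 20 to 24 can be illustrated with a deliberately naive statement gate. This is a sketch under loud assumptions: it uses string and regex checks rather than the AST validation named in test 21, the `READ_ONLY_FUNCTIONS` allowlist and the return strings are hypothetical, and the server-side role plus read-only transaction remain the real backstop. It fails closed, so a semicolon or the word "into" inside a string literal is also rejected.

```python
import re

VERIFIED_READ_ONLY_ROLES = frozenset({"context_pack_readonly"})
# Hypothetical allowlist of side-effect-free functions.
READ_ONLY_FUNCTIONS = frozenset({"count", "min", "max", "lower", "coalesce"})
# Keywords that may precede "(" without being a function call.
PAREN_KEYWORDS = frozenset({
    "in", "exists", "and", "or", "not", "any", "all",
    "from", "where", "on", "as", "over", "select",
})


def check_session(connected_role, txn_read_only):
    """Fail closed before any read unless the role and txn mode are verified."""
    if connected_role not in VERIFIED_READ_ONLY_ROLES or txn_read_only != "on":
        return "BLOCKED"
    return "OK"


def classify_statement(sql):
    """Classify one SQL string; anything not provably a plain SELECT is rejected."""
    body = sql.strip()
    if body.endswith(";"):
        body = body[:-1].rstrip()
    if ";" in body:
        return "REJECTED_MULTI_STATEMENT"
    if not re.match(r"(?i)select\b", body):
        return "REJECTED_NOT_SELECT"
    if re.search(r"(?i)\binto\b", body):
        return "REJECTED_SELECT_INTO"  # SELECT ... INTO creates a table
    for name in re.findall(r"([A-Za-z_][A-Za-z0-9_.]*)\s*\(", body):
        lowered = name.lower()
        if lowered not in PAREN_KEYWORDS and lowered not in READ_ONLY_FUNCTIONS:
            return f"REJECTED_FUNCTION:{lowered}"
    return "ALLOWED"
```

With this sketch, `classify_statement("SELECT 1; DROP TABLE x")` is rejected as multi-statement, and `classify_statement("SELECT pg_notify('c', 'm')")` is rejected because `pg_notify` has no allowlist entry. A production gateway would need a real SQL parser; the point here is only the fail-closed ordering of the checks.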

5. Capability / bypass-path tests (rev3 — Codex B-4, EXPANDED)

| # | Test | Input condition (FIXTURE) | Expected | Pass/fail criterion |
|---|---|---|---|---|
| 25 | Module declares a prohibited action | FIXTURE: module plan declaring EXECUTE_COMMAND/INVOKE_DOT | build rejected | PASS iff CONTRACT_VIOLATION at static guard (G4) |
| 26 | Shell/subprocess attempt | FIXTURE: probe calling os.system/subprocess.run/os.exec*/spawn/pty | refused | PASS iff no capability exists; build rejected (G4) and runtime exit 3 |
| 27 | Dynamic import / plugin load | FIXTURE: probe calling importlib/__import__ | refused | PASS iff no dynamic-import capability; build rejected (G4) |
| 28 | General network egress off-allowlist | FIXTURE: probe to any endpoint ≠ KB read connector / PG read gateway | refused | PASS iff egress denied; grounded by the server-side DB allowlist (non-allowlisted DB returns [DENIED]) |
| 29 | Credential / environment secret access | FIXTURE: probe reading a secret/env credential | refused | PASS iff no ACCESS_CREDENTIAL_SECRET capability; no DB credential held |
| 30 | Direct DB driver opened | FIXTURE: probe opening a raw psql/asyncpg/JDBC connection | build rejected | PASS iff CONTRACT_VIOLATION (G4); only the governed gateway is permitted |
| 31 | Filesystem write outside report output | FIXTURE: probe writing outside the KB report path; and a run where writes_performed[] omits a write | refused / fail | PASS iff write restricted to the report path AND writes_performed[] enumerates every write (no hidden mutation) |
| 32 | Directus write | FIXTURE: probe attempting a Directus write | refused | PASS iff no write credential present; exit 3 |
| 33 | Exit 0 attempted | FIXTURE: any verdict wired to exit 0 | build failure | PASS iff G8 rejects exit 0 (none exists in v0.1) |
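One way a build-time static guard for tests 26, 27 and 30 might look is an AST scan of module source that reports shell, dynamic-import and raw-driver usage. This is a sketch only: the deny-lists and the `static_guard` name are hypothetical, and a deny-list scan is weaker than the structural "no capability exists" property the matrix requires, so it would complement that property rather than replace it. The source is parsed, never executed.

```python
import ast

# Hypothetical deny-lists for illustration.
FORBIDDEN_MODULES = frozenset({"subprocess", "importlib", "pty", "asyncpg"})
FORBIDDEN_OS_PREFIXES = ("system", "exec", "spawn", "popen", "fork")


def static_guard(source):
    """Return CONTRACT_VIOLATION strings found in Python source text."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            imported = [node.module or ""]
        else:
            imported = []
        for name in imported:
            if name.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"CONTRACT_VIOLATION: import {name}")
        # Any reference to the builtin dynamic-import hook.
        if isinstance(node, ast.Name) and node.id == "__import__":
            violations.append("CONTRACT_VIOLATION: __import__")
        # Attribute access such as os.system, os.execv, os.spawnl.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "os"
            and node.attr.startswith(FORBIDDEN_OS_PREFIXES)
        ):
            violations.append(f"CONTRACT_VIOLATION: os.{node.attr}")
    return violations
```

A non-empty result would map to "build rejected" in the table above. Aliased imports (for example `import os as o`) are not caught by the `os.` check, which is one reason the matrix treats capability absence, not pattern matching, as the pass criterion.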

6. Local-first / provenance tests (rev3 — Track 1, KB-first/local-last)

| # | Test | Input condition (FIXTURE) | Expected | Pass/fail criterion |
|---|---|---|---|---|
| 34 | Local source used as authority where a KB/PG source exists | FIXTURE: a load-bearing claim sourced from a local path with a KB/PG equivalent | READ_LEVEL_FAIL | PASS iff FLAG_LOCAL_FIRST_AUTHORITY; marks CONFLICT, prefers KB/PG (F18) |
| 35 | Consumed fact with no governed source_metadata | FIXTURE: a fact/count with no KB path / PG view / native surface citation | READ_LEVEL_FAIL | PASS iff flagged; every consumed fact must cite a governed surface |
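The two provenance checks above reduce to a small rule over each consumed fact's source metadata. The sketch below is illustrative: `ConsumedFact`, its fields, the source-kind labels, and the `MISSING_GOVERNED_SOURCE` flag name are all hypothetical (test 35 requires a flag but does not name one); only FLAG_LOCAL_FIRST_AUTHORITY comes from the matrix.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the governed surface kinds (KB path, PG view, native surface).
GOVERNED_KINDS = frozenset({"KB_PATH", "PG_VIEW", "NATIVE_SURFACE"})


@dataclass(frozen=True)
class ConsumedFact:
    name: str
    source_kind: Optional[str]  # a GOVERNED_KINDS label, "LOCAL_PATH", or None
    source_ref: Optional[str]   # citation of the surface, if any
    governed_equivalent_exists: bool = False


def provenance_flags(fact):
    """Return the flags raised by one consumed fact (empty list means none)."""
    if fact.source_kind in GOVERNED_KINDS and fact.source_ref:
        return []
    if fact.source_kind == "LOCAL_PATH" and fact.governed_equivalent_exists:
        # Test 34: a local source was used although a KB/PG source exists.
        return ["FLAG_LOCAL_FIRST_AUTHORITY"]
    # Test 35: no governed surface is cited at all.
    return ["MISSING_GOVERNED_SOURCE"]
```

Any non-empty result would drive the dossier to READ_LEVEL_FAIL per the table; the sketch does not decide how the CONFLICT marking or the KB/PG preference is recorded.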

7. FIX7 discoverability tests (rev3 — Codex B-3)

| # | Test | Input condition (FIXTURE) | Expected | Pass/fail criterion |
|---|---|---|---|---|
| 36 | FIX7 Recheck-8 real dossier (Fixture A) | FIXTURE A: .py SSOT declared; only a wrong-kind .md resolves; selftest/exit/hash asserted as prose | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff execution claims EVIDENCE_INSUFFICIENT, existence sub-verdict BLOCKED_BY_UNVERIFIED_SOURCE (NOT "does not exist"), C4/C5 fire; no command run; no positive verdict |
| 37 | Pure discoverability (Fixture A′, NEW) | FIXTURE A′: "executable X exists"; no governed surface can resolve X; no prose-only PASS, no contradiction | UNVERIFIED + NOT_PROVEN | PASS iff BLOCKED_BY_UNVERIFIED_SOURCE, NOT READ_LEVEL_FAIL, and the report states "not adequately evidenced via allowed surfaces," not global absence (Codex correction #4) |
| 38 | FIX7 resolvable-but-insufficient (Fixture C) | FIXTURE C: cited evidence resolves but is prose-only / wrong-kind / contradictory / unbound | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C5/C6/C7 fire; must NOT be NO_READ_LEVEL_DEFECT_FOUND / acceptable / PASS |
| 39 | FIX7 stripped (Fixture B) | FIXTURE B: success asserted, all references removed | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C1/C2/C4/C8 = EVIDENCE_INSUFFICIENT + FLAG_PROSE_ONLY_PASS |
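The distinction between Fixture A and Fixture A′ is that an unresolvable artifact alone blocks the existence sub-verdict but does not by itself fail the dossier; a failure needs an actual defect such as a prose-only PASS or a contradiction. A minimal sketch of that separation follows. The function name, parameters, and result keys are hypothetical, and the sub-verdict returned when the declared kind does resolve is an assumption of the sketch rather than something the matrix states.

```python
def assess_existence_claim(declared_kind, resolved_kinds,
                           prose_only_pass=False, contradiction=False):
    """Assess an 'artifact X exists' claim using allowed surfaces only."""
    resolved = declared_kind in resolved_kinds
    defect = prose_only_pass or contradiction
    return {
        # Never "does not exist": an unresolved artifact is only unverified.
        "existence_sub_verdict": (
            "NO_READ_LEVEL_DEFECT_FOUND" if resolved  # assumption of this sketch
            else "BLOCKED_BY_UNVERIFIED_SOURCE"
        ),
        # Without a defect the dossier stays UNVERIFIED, not READ_LEVEL_FAIL.
        "dossier_verdict": "READ_LEVEL_FAIL" if defect else "UNVERIFIED",
        # Executable claims are never proven in v0.1.
        "article14_status": "NOT_PROVEN_EXECUTION_UNVERIFIED",
        "statement": (
            None if resolved
            else "not adequately evidenced via allowed surfaces"
        ),
    }
```

Fixture A (a `.py` declared, only a `.md` resolving, success asserted as prose) would be called with `prose_only_pass=True` and yields READ_LEVEL_FAIL; Fixture A′ (nothing resolves, no other defect) yields UNVERIFIED with the same blocked sub-verdict and the same non-absence statement.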

8. Cross-cutting acceptance invariants (all tests)

  • I1: no test path invokes a command, FS DOT, IU command, detector, shell, subprocess, or dynamic import.
  • I2: no test path mutates PG/Directus/registry/filesystem/system_issues; PG access is read-only via the governed gateway only; no direct DB driver/credential.
  • I3: every emitted count carries a full denominator_source_record; no bare counts; no literal count is a comparator.
  • I4: no positive/green verdict and no exit 0 exist; READ_LEVEL_ACCEPTABLE/READ_REPORT_PASS/positive EVIDENCE_PRESENT never appear.
  • I5: denominators stay separate; TAC/IU never joined.
  • I6: "any doubt ⇒ FAIL/BLOCK/UNVERIFIED"; no silent acceptance; FLAG/FAIL/BLOCKED/UNVERIFIED map to exit 1/2/3, never 0.
  • I7: every fixture is tagged FIXTURE with as_of; a fixture value is never a production invariant.
  • I8: writes_performed[] enumerates every write (no hidden mutation).
  • I9 (rev3): every consumed fact + load-bearing claim cites a governed KB/PG/native surface; local-first authority is flagged.
  • I10 (rev3): the inspector's taxonomy is PROVISIONAL_NON_AUTHORITY; it never certifies truth; no artifact-existence result claims global absence.
  • I11 (rev3): read_access_provenance (connected role, txn-read-only, per-query text/hash/source/purpose) is recorded for every run.
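Invariant I8, together with test 31, can be checked by comparing the writes a run actually made against what it declared. The sketch below assumes an instrumented writer supplies the `observed` list; that mechanism, the function name, and the violation labels are hypothetical. It uses pure path arithmetic only (no filesystem access) and requires Python 3.9+ for `PurePath.is_relative_to`.

```python
from pathlib import PurePosixPath


def audit_writes(observed, writes_performed, report_root):
    """Return violations for writes outside the report path or undeclared writes."""
    root = PurePosixPath(report_root)
    declared = set(writes_performed)
    violations = []
    for path in observed:
        p = PurePosixPath(path)
        # Pure paths do not normalise "..", so any ".." segment is rejected outright.
        if ".." in p.parts or not p.is_relative_to(root):
            violations.append(f"WRITE_OUTSIDE_REPORT_PATH: {path}")
        if path not in declared:
            violations.append(f"HIDDEN_MUTATION: {path}")
    return violations
```

An empty list means every observed write stayed under the report path and was enumerated in writes_performed[]; anything else is a fail, consistent with "refused / fail" in test 31.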

9. Deferred tests (NOT in v0.1 — gated on named future contracts)

  • D1: actual command run + exit-code capture (Call Contract).
  • D2: claim bound to a real execution result / re-run determinism / global-absence proof (Call / Proof-of-run Contract).
  • D3: generic package_manifest schema validation (lineage + Codex schema review).
  • D4: --selftest N/N + module_sha256 (post-reseal build).
  • D5: audit_dead_links() → system_issues (write contract).
  • D6: Directus write-path verification (DOT-control proof contract).
  • D7: OPA/Conftest/Squawk/CI/Git-hook gating (CI/policy-gate contract).
  • D8 (rev3): positive/green verdict + exit 0 (sealed governed claim/evidence/verdict taxonomy authority).

10. Acceptance verdict

ACCEPTANCE_MATRIX_v0_1_REV3_READY_FOR_CODEX: 39 in-scope tests with deterministic fail-closed criteria covering all four Codex re-seal blocker classes: taxonomy/shadow-SSOT (#1/#3/#18/#19), PG read-only feasibility (#20–#24, #30), expanded bypass paths (#25–#33), and FIX7 discoverability honesty (#36/#37). Local-last (#34/#35) and the preserved Article-14 / hardcode / fake-green tests (#2/#4–#17/#38/#39) complete the set. 8 deferred tests sit behind named future contracts. No positive verdict, no exit 0, no literal count invariant. Routed with the rev3 packet to Codex re-review.

Cross-references

  • Gap-only Spec rev3 / FIX7 pilot rev3 / MVP plan rev3 / fix ledger rev3 (see those docs).
  • Codex re-seal: reviews/codex-reseal-gap-only-spec-rev2-2026-06-09.md.
  • Superseded rev2: designs/acceptance-test-matrix-implementation-package-dot-v0-1-rev2-2026-06-09.md.

Source path: knowledge/dev/laws/tool-kiem-thu/designs/acceptance-test-matrix-implementation-package-dot-v0-1-rev3-2026-06-09.md