Acceptance Test Matrix rev2 — Implementation Package DOT v0.1 (fail-closed, fixtures named as fixtures, covers Codex failure modes, design only, 2026-06-09)
Nature: the fail-closed acceptance tests for the future read/report-only MVP, repaired after the Codex block to cover every Codex failure mode. Uses the rev2 verdict model. Counts appear only inside fixtures explicitly named as fixtures; no literal count is a production invariant or expected-output comparator (Codex fixes 5/7).

Date: 2026-06-09 · Supersedes: `designs/acceptance-test-matrix-implementation-package-dot-v0-1-2026-06-09.md` (rev1).

Status: `ACCEPTANCE_MATRIX_v0_1_REV2_READY_FOR_CODEX`. Design only; no test is executed here; no command runs.

Production mutation: NO. `writes_performed`: KB design docs only.

Governing authority: rev2 Gap-only Scope Spec (§3 chain, §4 verdicts, §11 exit, §12 capability) + MVP plan rev2 gates G1–G9.
1. Verdict legend
- Final dossier verdicts: READ_LEVEL_ACCEPTABLE / READ_LEVEL_FAIL / BLOCKED / UNVERIFIED.
- Per-claim verdicts: EVIDENCE_SUFFICIENT_FOR_READ_LEVEL / EVIDENCE_INSUFFICIENT / EVIDENCE_CONFLICTING / BLOCKED_BY_NO_CALL_CONTRACT / BLOCKED_BY_UNVERIFIED_SOURCE.
- Flags: FLAG_PROSE_ONLY_PASS / FLAG_HARDCODED_DENOMINATOR / FLAG_AUTHORITY_VIOLATION.
- article14_status ∈ {NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS, NOT_PROVEN_EXECUTION_UNVERIFIED}.
- Removed: READ_REPORT_PASS and EVIDENCE_PRESENT (positive) appear nowhere.
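The legend above can be pinned down as closed enumerations, so that a removed token cannot be emitted even by accident. The following is a minimal sketch only: the class names, `REMOVED_TOKENS`, `vocabulary()` and `parse_final_verdict()` are hypothetical and not taken from the spec; only the token strings come from the legend.

```python
from enum import Enum


class FinalVerdict(str, Enum):
    READ_LEVEL_ACCEPTABLE = "READ_LEVEL_ACCEPTABLE"
    READ_LEVEL_FAIL = "READ_LEVEL_FAIL"
    BLOCKED = "BLOCKED"
    UNVERIFIED = "UNVERIFIED"


class ClaimVerdict(str, Enum):
    EVIDENCE_SUFFICIENT_FOR_READ_LEVEL = "EVIDENCE_SUFFICIENT_FOR_READ_LEVEL"
    EVIDENCE_INSUFFICIENT = "EVIDENCE_INSUFFICIENT"
    EVIDENCE_CONFLICTING = "EVIDENCE_CONFLICTING"
    BLOCKED_BY_NO_CALL_CONTRACT = "BLOCKED_BY_NO_CALL_CONTRACT"
    BLOCKED_BY_UNVERIFIED_SOURCE = "BLOCKED_BY_UNVERIFIED_SOURCE"


class Flag(str, Enum):
    FLAG_PROSE_ONLY_PASS = "FLAG_PROSE_ONLY_PASS"
    FLAG_HARDCODED_DENOMINATOR = "FLAG_HARDCODED_DENOMINATOR"
    FLAG_AUTHORITY_VIOLATION = "FLAG_AUTHORITY_VIOLATION"


class Article14Status(str, Enum):
    NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS = "NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS"
    NOT_PROVEN_EXECUTION_UNVERIFIED = "NOT_PROVEN_EXECUTION_UNVERIFIED"


# Tokens the legend lists as removed; they must not exist in any enumeration.
REMOVED_TOKENS = frozenset({"READ_REPORT_PASS", "EVIDENCE_PRESENT"})


def vocabulary():
    """Return every token the rev2 legend allows an output to contain."""
    enums = (FinalVerdict, ClaimVerdict, Flag, Article14Status)
    return {member.value for enum_cls in enums for member in enum_cls}


def parse_final_verdict(token):
    """Fail closed: an unknown or removed token raises ValueError."""
    return FinalVerdict(token)
```

With a closed vocabulary, `parse_final_verdict("READ_REPORT_PASS")` raises instead of silently producing a positive verdict.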
Fixture discipline: every fixture is tagged FIXTURE with as_of example values; a fixture value is never a production invariant. Tests compare surface role / match key / population / provenance / separation / verdict, never literal counts.
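One way to enforce this discipline is to make the comparator structurally unable to see counts. The sketch below is an illustration under assumptions: `Observation`, `Fixture`, `COMPARED_FIELDS` and `matches()` are hypothetical names, and the field types are guesses; only the list of compared attributes and the `FIXTURE` / `as_of` tagging come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# The only attributes a test may compare; `count` is deliberately absent.
COMPARED_FIELDS = (
    "surface_role", "match_key", "population",
    "provenance", "separation", "verdict",
)


@dataclass(frozen=True)
class Observation:
    surface_role: str
    match_key: str
    population: str
    provenance: str
    separation: bool
    verdict: str
    count: Optional[int] = None  # as_of example value only; never compared


@dataclass(frozen=True)
class Fixture:
    tag: str
    as_of: str
    expected: Observation

    def __post_init__(self):
        # Reject anything that is not explicitly tagged and dated.
        if self.tag != "FIXTURE":
            raise ValueError("fixture must be tagged FIXTURE")
        if not self.as_of:
            raise ValueError("fixture must carry an as_of value")


def matches(fixture, actual):
    """Compare structure and verdict only; literal counts are ignored."""
    return all(
        getattr(fixture.expected, name) == getattr(actual, name)
        for name in COMPARED_FIELDS
    )
```

Because `count` is not in `COMPARED_FIELDS`, two observations that differ only in their example count still match, while a change of `match_key` or `verdict` does not.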
2. Matrix (columns: # · test · input condition (fixture) · expected final verdict + article14_status · pass/fail criterion)
| # | Test | Input condition (FIXTURE) | Expected verdict + article14 | Pass/fail criterion |
|---|---|---|---|---|
| 1 | Positive read-level (no execution claims) | FIXTURE: dossier with only reference/authority/denominator claims, all sufficient, inventory complete-by-governed-contract | READ_LEVEL_ACCEPTABLE + NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS | PASS iff ACCEPTABLE reached only when zero execution claims + zero FLAG + completeness != UNVERIFIED |
| 2 | Any execution claim caps the dossier | FIXTURE: dossier with 1 execution claim + otherwise clean | UNVERIFIED + NOT_PROVEN_EXECUTION_UNVERIFIED | PASS iff ACCEPTABLE is structurally unavailable; article14 forced NOT_PROVEN |
| 3 | READ_REPORT_PASS must not exist | static scan of outputs | n/a | PASS iff the token READ_REPORT_PASS and positive EVIDENCE_PRESENT never appear in any output |
| 4 | Executable claim without artifact existence evidence | FIXTURE: "canonicalizer exists/runs" + no resolvable artifact | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff claim EVIDENCE_INSUFFICIENT; never sufficient/accepted |
| 5 | Selftest PASS with no run ledger / exit code / log | FIXTURE: "selftest 22/22 PASS" + no LOG/EXIT_CODE/RUN_LEDGER evidence | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT run selftest |
| 6 | Hash claim without pinned hash evidence | FIXTURE: "reproduces hash <example>" + no HASH_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT recompute |
| 7 | Exit-code claim without exit-code evidence | FIXTURE: "exit 0" + no EXIT_CODE_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; never assume exit 0 |
| 8 | Command string present without call contract | FIXTURE: "command X ran safely" + no Call Contract (none exists) | READ_LEVEL_FAIL/UNVERIFIED + NOT_PROVEN | PASS iff BLOCKED_BY_NO_CALL_CONTRACT + routed to Call Contract; tool makes no call |
| 9 | Collapsed denominator (no >=2 rule) | FIXTURE: report cites one canonical DOT number | BLOCKED | PASS iff blocked as disguised hardcode; criterion = "all relevant denominators distinct + provenanced; none collapsed"; no numeric minimum |
| 10 | TAC/IU chosen instead of dual-report | FIXTURE: dossier joins/chooses a corpus as canonical | BLOCKED | PASS iff blocked; criterion = corpora read from distinct surfaces with independent provenance, joined==false; no literal corpus count in criterion |
| 11 | Reconciliation: diagnostic must not override canonical | FIXTURE: name-keyed diagnostic presented as overriding canonical code-keyed diff | READ_LEVEL_FAIL | PASS iff criterion = canonical.match_key != diagnostic.match_key, both shown, diagnostic non-overriding; no literal 41 or 4 in criterion |
| 12 | Stale/unverified source used as denominator | FIXTURE: actual_count external-sync artifact / local checkout used as a denominator | UNVERIFIED | PASS iff BLOCKED_BY_UNVERIFIED_SOURCE + held out + marked stale; never a denominator |
| 13 | Prose-only PASS | FIXTURE: prose asserts success, no evidence artifact | READ_LEVEL_FAIL | PASS iff FLAG_PROSE_ONLY_PASS; tool never re-asserts the prose PASS |
| 14 | Module declares a prohibited action | FIXTURE: a module plan declaring EXECUTE_COMMAND or INVOKE_DOT | build rejected | PASS iff CONTRACT_VIOLATION at static guard (G4); never builds |
| 15 | Runtime PG-write attempt | FIXTURE: a probe attempting a PG write under the inspector role | refused | PASS iff read-only role context_pack_readonly blocks it; exit 3 |
| 16 | Runtime Directus-write attempt | FIXTURE: a probe attempting a Directus write | refused | PASS iff no write credential present; exit 3 |
| 17 | Write outside approved KB path / hidden mutation | FIXTURE: a probe writing outside the KB allowlist; and a run where writes_performed[] omits a write | refused / fail | PASS iff write restricted to allowlist AND writes_performed[] enumerates every write (no hidden evidence-output mutation; Codex fix 12) |
| 18 | Exit 0 fake-green attempted | FIXTURE: a FLAG/FAIL/BLOCKED/UNVERIFIED verdict wired to exit 0 | build/CI failure | PASS iff G8 rejects any non-ACCEPTABLE→exit-0 mapping |
| 19 | Dead-link/coverage over-claim | FIXTURE: dead-link reporter asserting "all references resolved" | UNVERIFIED | PASS iff coverage==ADVISORY_UNVERIFIED; no canonical-id/resolver-completeness claim |
| 20 | FIX7 Recheck-8: declared executable missing | FIXTURE A (pilot): .py SSOT declared, does not resolve; selftest/exit/hash asserted | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C1 fires (EVIDENCE_INSUFFICIENT); no canonicalizer/command run, no PASS |
| 21 | FIX7 Recheck-8: resolvable-but-insufficient evidence (counter-fixture) | FIXTURE C (pilot): cited evidence documents resolve but are prose-only with no identity/exit/log/hash, OR contain a contradictory/unbound run record | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C5/C6/C7 fire; must NOT be EVIDENCE_PRESENT / ACCEPTABLE / PASS (the case rev1 could not catch) |
| 22 | FIX7 stripped dossier (no references) | FIXTURE B (pilot): success asserted, all references removed | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C1/C2/C4/C8 = EVIDENCE_INSUFFICIENT |
| 23 | Claim extractor misses high-risk prose | FIXTURE: dossier with an executable claim hidden in an unparsed prose region | UNVERIFIED | PASS iff region listed in UNPARSED_REGION[] (high risk) → claim_inventory_completeness=UNVERIFIED → ACCEPTABLE unavailable + manual_review_required=true |
| 24 | Contract status over-claimed | FIXTURE: dossier treats a READY_FOR_GPT_REVIEW contract as binding | READ_LEVEL_FAIL | PASS iff FLAG_AUTHORITY_VIOLATION (AUTHORITY_CONTRACT_EVIDENCE assessed at recorded status only; never upgraded) |
| 25 | Evidence artifacts contradict | FIXTURE: two records, exit 0 vs exit 2 | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_CONFLICTING; tool reports the conflict set, never picks |
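The expected-verdict column of the matrix implies a fail-closed derivation from per-claim results to a final dossier verdict. The sketch below is one possible reading, not the governing rule: the `Dossier` fields, the `"COMPLETE"` completeness value, the `structural_block` boolean (standing in for collapsed denominators or joined corpora, tests 9 and 10) and the precedence order BLOCKED, then READ_LEVEL_FAIL, then UNVERIFIED, then READ_LEVEL_ACCEPTABLE are all assumptions. The matrix itself leaves precedence open (test 8 allows either READ_LEVEL_FAIL or UNVERIFIED).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

FAILING_CLAIMS = {"EVIDENCE_INSUFFICIENT", "EVIDENCE_CONFLICTING"}
BLOCKING_CLAIMS = {"BLOCKED_BY_NO_CALL_CONTRACT", "BLOCKED_BY_UNVERIFIED_SOURCE"}


@dataclass
class Dossier:
    claim_verdicts: List[str] = field(default_factory=list)
    flags: List[str] = field(default_factory=list)
    execution_claims: int = 0
    inventory_completeness: str = "COMPLETE"  # hypothetical value name
    structural_block: bool = False            # hypothetical: tests 9 and 10


def derive(d: Dossier) -> Tuple[str, str]:
    """Return (final verdict, article14_status), failing closed on any doubt."""
    # Any execution claim forces the NOT_PROVEN status (test 2).
    article14 = (
        "NOT_PROVEN_EXECUTION_UNVERIFIED"
        if d.execution_claims > 0
        else "NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS"
    )
    if d.structural_block:
        return "BLOCKED", article14
    if d.flags or FAILING_CLAIMS.intersection(d.claim_verdicts):
        return "READ_LEVEL_FAIL", article14
    if (
        BLOCKING_CLAIMS.intersection(d.claim_verdicts)
        or d.inventory_completeness == "UNVERIFIED"
        or d.execution_claims > 0
    ):
        return "UNVERIFIED", article14
    # Reached only with zero execution claims, zero flags and a complete
    # inventory, which is the criterion of test 1.
    return "READ_LEVEL_ACCEPTABLE", article14
```

The ACCEPTABLE branch is the last one on purpose: it has no condition of its own, so it can be reached only when every fail-closed check has already passed.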
3. Cross-cutting acceptance invariants (all tests)
- I1: no test path invokes a command, FS DOT, IU command, or detector.
- I2: no test path mutates PG/Directus/registry/filesystem/`system_issues`.
- I3: every emitted count carries a full `denominator_source_record`; no bare counts; no literal count is a comparator.
- I4: no positive verdict carries proof-of-run; `EVIDENCE_PRESENT` (positive) and `READ_REPORT_PASS` never appear.
- I5: denominators stay separate; TAC/IU never joined.
- I6: "any doubt ⇒ FAIL/BLOCK/UNVERIFIED"; no silent acceptance; FLAG/FAIL/BLOCKED/UNVERIFIED never exit 0 (G8).
- I7: every fixture is tagged `FIXTURE` with `as_of`; a fixture value is never a production invariant.
- I8: `writes_performed[]` enumerates every write (no hidden evidence-output mutation).
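Three of these invariants (I4, I6 and I8) lend themselves to small static checks over already-produced data, with no command invoked. The following sketch is illustrative only: the function names, the shape of the inputs, and the example exit codes and paths in the usage note are hypothetical; the source fixes only that non-ACCEPTABLE verdicts never exit 0 and that every write must be inside the allowlist and enumerated.

```python
import posixpath
import re
from typing import Dict, List, Tuple

REMOVED_TOKENS = ("READ_REPORT_PASS", "EVIDENCE_PRESENT")
_TOKEN_RE = re.compile(r"\b(" + "|".join(REMOVED_TOKENS) + r")\b")


def removed_token_hits(outputs: Dict[str, str]) -> List[Tuple[str, str]]:
    """I4: list (output name, token) for every removed token found."""
    return [
        (name, match.group(1))
        for name, text in sorted(outputs.items())
        for match in _TOKEN_RE.finditer(text)
    ]


def fake_green_mappings(exit_codes: Dict[str, int]) -> List[str]:
    """I6 / G8: verdicts other than READ_LEVEL_ACCEPTABLE mapped to exit 0."""
    return sorted(
        verdict
        for verdict, code in exit_codes.items()
        if code == 0 and verdict != "READ_LEVEL_ACCEPTABLE"
    )


def write_violations(
    observed: List[str], declared: List[str], allowlist_root: str
) -> List[Tuple[str, str]]:
    """I8: writes outside the allowlist or missing from writes_performed[]."""
    root = posixpath.normpath(allowlist_root)
    problems = []
    for path in observed:
        # Normalise first so "kb/../x" cannot pass as being under "kb".
        norm = posixpath.normpath(path)
        if norm != root and not norm.startswith(root + "/"):
            problems.append((path, "outside allowlist"))
        if path not in declared:
            problems.append((path, "not in writes_performed[]"))
    return problems
```

For example, `fake_green_mappings({"READ_LEVEL_ACCEPTABLE": 0, "UNVERIFIED": 0, "BLOCKED": 2})` returns `["UNVERIFIED"]`, which a G8-style gate would treat as a build failure. An empty result from all three functions is necessary for the invariants to hold, not sufficient.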
4. Deferred tests (NOT in v0.1 — gated on named future contracts)
- D1: actual command run + exit-code capture (Call Contract).
- D2: claim bound to a real execution result / re-run determinism (Call / Proof-of-run Contract).
- D3: generic `package_manifest` schema validation (lineage decision + Codex schema review).
- D4: `--selftest N/N` self-report + `module_sha256` self-pin (post-reseal build).
- D5: `audit_dead_links()` engine sinking to `system_issues` (`system_issues` write contract).
- D6: Directus write-path verification (Directus DOT-control proof contract).
- D7: OPA/Conftest/Squawk/CI/Git-hook gating (CI/policy-gate integration contract).
5. Acceptance verdict
ACCEPTANCE_MATRIX_v0_1_REV2_READY_FOR_CODEX — 25 in-scope tests with deterministic fail-closed criteria covering every Codex failure mode (inadequate-but-resolvable evidence #21, missing-artifact-identity #20, high-risk-unparsed #23, hardcoded-denominator-fixture #9/#12, exit-0 fake-green #18, module-declares-prohibited-action #14, contract-overclaim #24, selftest-no-ledger #5, command-no-call-contract #8); 7 deferred tests behind named future contracts. No literal count is a production invariant. Routed with the rev2 packet to Codex re-review.
Cross-references
- Gap-only Spec rev2 / FIX7 pilot rev2 / MVP plan rev2 / fix ledger (see those docs).
- Superseded rev1: `designs/acceptance-test-matrix-implementation-package-dot-v0-1-2026-06-09.md`.