KB-6C94

Acceptance Test Matrix rev2 — Implementation Package DOT v0.1 (fail-closed, fixtures named as fixtures, covers Codex failure modes, design only, 2026-06-09)

Revision 1
tool-kiem-thu · implementation-package-dot · acceptance-test-matrix · rev2 · article-14 · no-hardcode · no-fake-green · capability-test · counter-fixture · design-only · 2026-06-09

Acceptance Test Matrix (rev2) — Implementation Package DOT v0.1

Nature: the fail-closed acceptance tests for the future read/report-only MVP, repaired after the Codex block to cover every Codex failure mode. Uses the rev2 verdict model. Counts appear only inside fixtures explicitly named as fixtures; no literal count is a production invariant or expected-output comparator (Codex fixes 5/7).

  • Date: 2026-06-09
  • Supersedes: designs/acceptance-test-matrix-implementation-package-dot-v0-1-2026-06-09.md (rev1)
  • Status: ACCEPTANCE_MATRIX_v0_1_REV2_READY_FOR_CODEX
  • Scope: design only; no test is executed here; no command runs.
  • Production mutation: NO
  • writes_performed: KB design docs only
  • Governing authority: rev2 Gap-only Scope Spec (§3 chain, §4 verdicts, §11 exit, §12 capability) + MVP plan rev2 gates G1–G9.

1. Verdict legend

  • Final dossier verdicts: READ_LEVEL_ACCEPTABLE / READ_LEVEL_FAIL / BLOCKED / UNVERIFIED.
  • Per-claim verdicts: EVIDENCE_SUFFICIENT_FOR_READ_LEVEL / EVIDENCE_INSUFFICIENT / EVIDENCE_CONFLICTING / BLOCKED_BY_NO_CALL_CONTRACT / BLOCKED_BY_UNVERIFIED_SOURCE.
  • Flags: FLAG_PROSE_ONLY_PASS / FLAG_HARDCODED_DENOMINATOR / FLAG_AUTHORITY_VIOLATION.
  • article14_status ∈ {NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS, NOT_PROVEN_EXECUTION_UNVERIFIED}.
  • Removed tokens: READ_REPORT_PASS and EVIDENCE_PRESENT(positive) appear nowhere.
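The legend can be read as a closed vocabulary. The sketch below is illustrative only: the class names, the `REMOVED_TOKENS` tuple, and `find_removed_tokens` are hypothetical names, not part of the spec. It also assumes that any occurrence of the string EVIDENCE_PRESENT counts as a hit, which may be stricter than the spec's "positive EVIDENCE_PRESENT".

```python
from enum import Enum


class DossierVerdict(Enum):
    READ_LEVEL_ACCEPTABLE = "READ_LEVEL_ACCEPTABLE"
    READ_LEVEL_FAIL = "READ_LEVEL_FAIL"
    BLOCKED = "BLOCKED"
    UNVERIFIED = "UNVERIFIED"


class ClaimVerdict(Enum):
    EVIDENCE_SUFFICIENT_FOR_READ_LEVEL = "EVIDENCE_SUFFICIENT_FOR_READ_LEVEL"
    EVIDENCE_INSUFFICIENT = "EVIDENCE_INSUFFICIENT"
    EVIDENCE_CONFLICTING = "EVIDENCE_CONFLICTING"
    BLOCKED_BY_NO_CALL_CONTRACT = "BLOCKED_BY_NO_CALL_CONTRACT"
    BLOCKED_BY_UNVERIFIED_SOURCE = "BLOCKED_BY_UNVERIFIED_SOURCE"


class Flag(Enum):
    FLAG_PROSE_ONLY_PASS = "FLAG_PROSE_ONLY_PASS"
    FLAG_HARDCODED_DENOMINATOR = "FLAG_HARDCODED_DENOMINATOR"
    FLAG_AUTHORITY_VIOLATION = "FLAG_AUTHORITY_VIOLATION"


class Article14Status(Enum):
    NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS = "NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS"
    NOT_PROVEN_EXECUTION_UNVERIFIED = "NOT_PROVEN_EXECUTION_UNVERIFIED"


# Tokens the legend lists as removed. Assumption: a plain substring match,
# so every occurrence of EVIDENCE_PRESENT is reported, positive or not.
REMOVED_TOKENS = ("READ_REPORT_PASS", "EVIDENCE_PRESENT")


def find_removed_tokens(output_text: str) -> list[str]:
    """Return the removed tokens that occur in an output string (static scan)."""
    return [token for token in REMOVED_TOKENS if token in output_text]
```

A static scan in the spirit of test 3 would call `find_removed_tokens` on each emitted output and fail on any non-empty result.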

Fixture discipline: every fixture is tagged FIXTURE with as_of example values; a fixture value is never a production invariant. Tests compare surface role / match key / population / provenance / separation / verdict, never literal counts.
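One way to picture the fixture discipline is a record whose example count is carried but never compared. This is a minimal sketch under stated assumptions: the `Fixture` class, its field types, `validate_fixture`, and `same_shape` are hypothetical, and only the six compared attributes come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixture:
    tag: str            # must be the literal "FIXTURE"
    as_of: str          # when the example values were observed
    surface_role: str
    match_key: str
    population: str     # a population name, not a count (assumption)
    provenance: str
    separation: str
    verdict: str
    example_count: int  # example value only; never a comparator


# The attributes the text says tests compare. example_count is absent on purpose.
COMPARED_FIELDS = (
    "surface_role", "match_key", "population",
    "provenance", "separation", "verdict",
)


def validate_fixture(fixture: Fixture) -> None:
    """Reject a record that is not tagged FIXTURE or has no as_of."""
    if fixture.tag != "FIXTURE":
        raise ValueError("fixture must be tagged FIXTURE")
    if not fixture.as_of:
        raise ValueError("fixture must carry as_of")


def same_shape(expected: Fixture, actual: Fixture) -> bool:
    """Compare only the structural attributes, never literal counts."""
    return all(getattr(expected, f) == getattr(actual, f) for f in COMPARED_FIELDS)
```

Two fixtures that differ only in `example_count` compare equal under `same_shape`, which is the intended behavior: the count is illustrative data, not an expected output.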

2. Matrix (columns: # · test · input condition (fixture) · expected final verdict + article14_status · pass/fail criterion)

| # | Test | Input condition (FIXTURE) | Expected verdict + article14 | Pass/fail criterion |
|---|------|---------------------------|------------------------------|---------------------|
| 1 | Positive read-level (no execution claims) | FIXTURE: dossier with only reference/authority/denominator claims, all sufficient, inventory complete-by-governed-contract | READ_LEVEL_ACCEPTABLE + NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS | PASS iff ACCEPTABLE reached only when zero execution claims + zero FLAG + completeness != UNVERIFIED |
| 2 | Any execution claim caps the dossier | FIXTURE: dossier with 1 execution claim + otherwise clean | UNVERIFIED + NOT_PROVEN_EXECUTION_UNVERIFIED | PASS iff ACCEPTABLE is structurally unavailable; article14 forced NOT_PROVEN |
| 3 | READ_REPORT_PASS must not exist | static scan of outputs | n/a | PASS iff the token READ_REPORT_PASS and positive EVIDENCE_PRESENT never appear in any output |
| 4 | Executable claim without artifact existence evidence | FIXTURE: "canonicalizer exists/runs" + no resolvable artifact | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff claim EVIDENCE_INSUFFICIENT; never sufficient/accepted |
| 5 | Selftest PASS with no run ledger / exit code / log | FIXTURE: "selftest 22/22 PASS" + no LOG/EXIT_CODE/RUN_LEDGER evidence | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT run selftest |
| 6 | Hash claim without pinned hash evidence | FIXTURE: "reproduces hash <example>" + no HASH_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; tool does NOT recompute |
| 7 | Exit-code claim without exit-code evidence | FIXTURE: "exit 0" + no EXIT_CODE_EVIDENCE | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_INSUFFICIENT; never assume exit 0 |
| 8 | Command string present without call contract | FIXTURE: "command X ran safely" + no Call Contract (none exists) | READ_LEVEL_FAIL/UNVERIFIED + NOT_PROVEN | PASS iff BLOCKED_BY_NO_CALL_CONTRACT + routed to Call Contract; tool makes no call |
| 9 | Collapsed denominator (no >=2 rule) | FIXTURE: report cites one canonical DOT number | BLOCKED | PASS iff blocked as disguised hardcode; criterion = "all relevant denominators distinct + provenanced; none collapsed"; no numeric minimum |
| 10 | TAC/IU chosen instead of dual-report | FIXTURE: dossier joins/chooses a corpus as canonical | BLOCKED | PASS iff blocked; criterion = corpora read from distinct surfaces with independent provenance, joined==false; no literal corpus count in criterion |
| 11 | Reconciliation: diagnostic must not override canonical | FIXTURE: name-keyed diagnostic presented as overriding canonical code-keyed diff | READ_LEVEL_FAIL | PASS iff criterion = canonical.match_key != diagnostic.match_key, both shown, diagnostic non-overriding; no literal 41 or 4 in criterion |
| 12 | Stale/unverified source used as denominator | FIXTURE: actual_count external-sync artifact / local checkout used as a denominator | UNVERIFIED | PASS iff BLOCKED_BY_UNVERIFIED_SOURCE + held out + marked stale; never a denominator |
| 13 | Prose-only PASS | FIXTURE: prose asserts success, no evidence artifact | READ_LEVEL_FAIL | PASS iff FLAG_PROSE_ONLY_PASS; tool never re-asserts the prose PASS |
| 14 | Module declares a prohibited action | FIXTURE: a module plan declaring EXECUTE_COMMAND or INVOKE_DOT | build rejected | PASS iff CONTRACT_VIOLATION at static guard (G4); never builds |
| 15 | Runtime PG-write attempt | FIXTURE: a probe attempting a PG write under the inspector role | refused | PASS iff read-only role context_pack_readonly blocks it; exit 3 |
| 16 | Runtime Directus-write attempt | FIXTURE: a probe attempting a Directus write | refused | PASS iff no write credential present; exit 3 |
| 17 | Write outside approved KB path / hidden mutation | FIXTURE: a probe writing outside the KB allowlist; and a run where writes_performed[] omits a write | refused / fail | PASS iff write restricted to allowlist AND writes_performed[] enumerates every write (no hidden evidence-output mutation; Codex fix 12) |
| 18 | Exit 0 fake-green attempted | FIXTURE: a FLAG/FAIL/BLOCKED/UNVERIFIED verdict wired to exit 0 | build/CI failure | PASS iff G8 rejects any non-ACCEPTABLE→exit-0 mapping |
| 19 | Dead-link/coverage over-claim | FIXTURE: dead-link reporter asserting "all references resolved" | UNVERIFIED | PASS iff coverage==ADVISORY_UNVERIFIED; no canonical-id/resolver-completeness claim |
| 20 | FIX7 Recheck-8: declared executable missing | FIXTURE A (pilot): .py SSOT declared, does not resolve; selftest/exit/hash asserted | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C1 fires (EVIDENCE_INSUFFICIENT); no canonicalizer/command run, no PASS |
| 21 | FIX7 Recheck-8: resolvable-but-insufficient evidence (counter-fixture) | FIXTURE C (pilot): cited evidence documents resolve but are prose-only with no identity/exit/log/hash, OR contain a contradictory/unbound run record | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C5/C6/C7 fire; must NOT be EVIDENCE_PRESENT / ACCEPTABLE / PASS (the case rev1 could not catch) |
| 22 | FIX7 stripped dossier (no references) | FIXTURE B (pilot): success asserted, all references removed | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff C1/C2/C4/C8 = EVIDENCE_INSUFFICIENT |
| 23 | Claim extractor misses high-risk prose | FIXTURE: dossier with an executable claim hidden in an unparsed prose region | UNVERIFIED | PASS iff region listed in UNPARSED_REGION[] (high risk) → claim_inventory_completeness=UNVERIFIED → ACCEPTABLE unavailable + manual_review_required=true |
| 24 | Contract status over-claimed | FIXTURE: dossier treats a READY_FOR_GPT_REVIEW contract as binding | READ_LEVEL_FAIL | PASS iff FLAG_AUTHORITY_VIOLATION (AUTHORITY_CONTRACT_EVIDENCE assessed at recorded status only; never upgraded) |
| 25 | Evidence artifacts contradict | FIXTURE: two records, exit 0 vs exit 2 | READ_LEVEL_FAIL + NOT_PROVEN | PASS iff EVIDENCE_CONFLICTING; tool reports the conflict set, never picks |
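Rows 1, 2, 23 and 25 describe gating logic that can be sketched compactly. The following is an illustration only, not the MVP's implementation: the function names and signatures are hypothetical, and the sketch deliberately does not decide between READ_LEVEL_FAIL, BLOCKED and UNVERIFIED, because the matrix does not state a precedence among them.

```python
def article14_status(execution_claim_count: int) -> str:
    """Row 2: any execution claim forces the NOT_PROVEN status."""
    if execution_claim_count == 0:
        return "NOT_APPLICABLE_NO_EXECUTABLE_CLAIMS"
    return "NOT_PROVEN_EXECUTION_UNVERIFIED"


def acceptable_available(
    execution_claim_count: int,
    flags: list[str],
    inventory_completeness: str,
) -> bool:
    """Rows 1 and 23: READ_LEVEL_ACCEPTABLE is reachable only with zero
    execution claims, zero flags, and an inventory completeness that is
    not UNVERIFIED. Anything else closes the gate."""
    return (
        execution_claim_count == 0
        and len(flags) == 0
        and inventory_completeness != "UNVERIFIED"
    )


def conflicting_exit_codes(recorded_exit_codes: list[int]) -> set[int]:
    """Row 25: return the full conflict set when records disagree, and an
    empty set when they agree. The caller reports the set; it never picks."""
    distinct = set(recorded_exit_codes)
    return distinct if len(distinct) > 1 else set()
```

For the row 25 fixture, `conflicting_exit_codes([0, 2])` yields both codes, so the report shows the disagreement without choosing a winner.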

3. Cross-cutting acceptance invariants (all tests)

  • I1: no test path invokes a command, FS DOT, IU command, or detector.
  • I2: no test path mutates PG/Directus/registry/filesystem/system_issues.
  • I3: every emitted count carries a full denominator_source_record; no bare counts; no literal count is a comparator.
  • I4: no positive verdict carries proof-of-run; EVIDENCE_PRESENT(positive) and READ_REPORT_PASS never appear.
  • I5: denominators stay separate; TAC/IU never joined.
  • I6: "any doubt ⇒ FAIL/BLOCK/UNVERIFIED"; no silent acceptance; FLAG/FAIL/BLOCKED/UNVERIFIED never exit 0 (G8).
  • I7: every fixture is tagged FIXTURE with as_of; a fixture value is never a production invariant.
  • I8: writes_performed[] enumerates every write (no hidden evidence-output mutation).
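Invariants I3, I6 and I8 lend themselves to small static checks. The sketch below assumes hypothetical helper names and data shapes (a verdict-to-exit-code mapping, a list of emitted count records, a set of observed write paths); none of these shapes is defined in this document, and the exit codes used in any example are placeholders.

```python
def bare_counts(emitted_counts: list[dict]) -> list[dict]:
    """I3: return every emitted count that lacks a denominator_source_record.
    Assumption: a missing or empty record means the count is bare."""
    return [c for c in emitted_counts if not c.get("denominator_source_record")]


def fake_green_mappings(exit_map: dict[str, int]) -> list[str]:
    """I6 / G8: return every verdict or flag other than READ_LEVEL_ACCEPTABLE
    that is mapped to exit code 0."""
    return sorted(
        name for name, code in exit_map.items()
        if name != "READ_LEVEL_ACCEPTABLE" and code == 0
    )


def hidden_writes(observed_paths: set[str], writes_performed: list[str]) -> set[str]:
    """I8: return writes that happened but are absent from writes_performed[]."""
    return observed_paths - set(writes_performed)
```

Each helper returns the offending items rather than a boolean, so a failing check can list exactly what violated the invariant; an empty result means the invariant held for that input.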

4. Deferred tests (NOT in v0.1 — gated on named future contracts)

  • D1 — actual command run + exit-code capture (Call Contract).
  • D2 — claim bound to a real execution result / re-run determinism (Call / Proof-of-run Contract).
  • D3 — generic package_manifest schema validation (lineage decision + Codex schema review).
  • D4 — --selftest N/N self-report + module_sha256 self-pin (post-reseal build).
  • D5 — audit_dead_links() engine sinking to system_issues (system_issues write contract).
  • D6 — Directus write-path verification (Directus DOT-control proof contract).
  • D7 — OPA/Conftest/Squawk/CI/Git-hook gating (CI/policy-gate integration contract).

5. Acceptance verdict

ACCEPTANCE_MATRIX_v0_1_REV2_READY_FOR_CODEX — 25 in-scope tests with deterministic fail-closed criteria covering every Codex failure mode (inadequate-but-resolvable evidence #21, missing-artifact-identity #20, high-risk-unparsed #23, hardcoded-denominator-fixture #9/#12, exit-0 fake-green #18, module-declares-prohibited-action #14, contract-overclaim #24, selftest-no-ledger #5, command-no-call-contract #8); 7 deferred tests behind named future contracts. No literal count is a production invariant. Routed with the rev2 packet to Codex re-review.

Cross-references

  • Gap-only Spec rev2 / FIX7 pilot rev2 / MVP plan rev2 / fix ledger (see those docs).
  • Superseded rev1: designs/acceptance-test-matrix-implementation-package-dot-v0-1-2026-06-09.md.
knowledge/dev/laws/tool-kiem-thu/designs/acceptance-test-matrix-implementation-package-dot-v0-1-rev2-2026-06-09.md