KB-5875

T2 UI-Current Audit — 04 Regression-Test Risk

Revision 1
terminal2 · audit · regression · tests · invariant · no-false-green · 2026-06-05

04 · Regression-Test Risk

The next macro must not be able to "go green" while binding a stale or false-green surface. These are the tests that must exist and the conditions that must fail loudly.

What must be RUN before UI deploy (all read-only / query_pg RO + grep)

| # | Test | Expected | Live status today |
|---|---|---|---|
| T1 | _current row count | = 87 | ✓ (resolves to reliability, 87) |
| T2 | _current reliability fields non-null | reliability_label, source_scope, confidence_score, lane_code, count_semantics present & populated | ✓ columns present |
| T3 | Invariant has 0 unexplained FAIL | or every FAIL is rendered non-green | currently 2 FAIL_COUNT_SUBSTRATE_MISMATCH: 2 live FAILs (PROC:new_candidates 6≠50, PROC:residual_reconcile 8≠23) |
| T4 | Computed proof = invariant (no literal verdict) | verdict_is_computed=true, fail_demonstrated_in_data=true | ✓ (…_proof_matrix_computed is per-axis computed; rule_can_structurally_fail present) |
| T5 | UI routes bind _current only | grep server/api/**, pages/**, composables/**: 0 refs to _v1/_v2/_reliability/…_contract (unversioned-but-non-current) | UNVERIFIED_SOURCE_ACCESS (and existing packages bind versioned names → would FAIL today) |
| T6 | Deploy guard | a single guard view returns PASS | guard view does not exist |
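The T5 grep gate could look roughly like the sketch below once source access exists. It is a minimal illustration, not the audited tooling: it assumes every surface name shares the prefix `v_rp_universal_node_ui` and that only names ending in `_current` are allowed, and the file names and contents in `demo_sources` are hypothetical.

```python
import re

# Assumption: all surface names start with this prefix, and only names that
# end in "_current" may be bound. Neither convention is confirmed by the audit.
SURFACE_RE = re.compile(r"\bv_rp_universal_node_ui\w*")


def find_noncurrent_refs(sources):
    """Return (path, line_no, name) for each reference to a non-current surface."""
    hits = []
    for path, text in sorted(sources.items()):
        for line_no, line in enumerate(text.splitlines(), start=1):
            for name in SURFACE_RE.findall(line):
                if not name.endswith("_current"):
                    hits.append((path, line_no, name))
    return hits


# Hypothetical file contents, keyed by path, standing in for a real checkout.
demo_sources = {
    "server/api/nodes.ts": "select * from v_rp_universal_node_ui_contract_current",
    "pages/index.vue": "const q = 'select * from v_rp_universal_node_ui_contract'",
}
demo_hits = find_noncurrent_refs(demo_sources)
```

The gate passes only when the returned list is empty; any hit names the file and line that still binds a non-current surface.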

What SHOULD FAIL if stale v1 is used — the critical negative test

Point the invariant (or the bound contract) at v_rp_universal_node_ui_contract (v1). The prior T2 design predicts 12 FAIL_* rows there: 10 AX-PXT (NEEDS_GROUPING-but-SHOW_SUBSTRATE) plus 2 AX-PROCESS dead-ends. The regression harness must assert both directions: binding → v1 ⇒ ≥12 FAIL; binding → _current/reliability ⇒ exactly the 2 known count-substrate FAILs (or 0 after the T1 substrate fix). If a build binds v1 and the suite still passes, the suite is broken.
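The two-directional assertion can be sketched as follows. This is a minimal Python illustration that assumes the invariant rows have already been fetched read-only into dicts; the `node_id` key, the binding labels `"v1"` and `"current"`, the `PASS` and `FAIL_DEMO` status values, and all demo rows are hypothetical.

```python
# Node ids of the two live count-substrate FAILs named in the T3 row.
KNOWN_FAILS = frozenset({"PROC:new_candidates", "PROC:residual_reconcile"})
V1_MIN_FAILS = 12  # FAIL count the prior T2 design predicts for v1


def failing_nodes(rows):
    """Collect node ids whose invariant_status starts with FAIL_."""
    return {r["node_id"] for r in rows if r["invariant_status"].startswith("FAIL_")}


def binding_behaves(binding, rows):
    """True when the invariant reacts to the binding as the audit expects."""
    fails = failing_nodes(rows)
    if binding == "v1":
        return len(fails) >= V1_MIN_FAILS
    if binding == "current":
        # Exactly the 2 known FAILs today, or none after the substrate fix.
        return fails == KNOWN_FAILS or not fails
    raise ValueError(f"unknown binding: {binding!r}")


# Synthetic rows for illustration only; not real query output.
v1_rows = [{"node_id": f"demo:{i}", "invariant_status": "FAIL_DEMO"} for i in range(12)]
current_rows = [
    {"node_id": n, "invariant_status": "FAIL_COUNT_SUBSTRATE_MISMATCH"}
    for n in sorted(KNOWN_FAILS)
]
current_rows.append({"node_id": "demo:ok", "invariant_status": "PASS"})
```

Feeding v1-shaped rows to the `"current"` check (or the reverse) returns False, which is the condition the suite needs to turn red on.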

What SHOULD FAIL if invariant has a real FAIL

The deploy-guard view (to be built) must FAIL on any invariant_status LIKE 'FAIL_%' that is not on an explicit allow-list. Today 2 FAILs are live and there is no gate, so a deploy could ship with 2 real count-substrate mismatches and nothing would stop it. Either T1 closes them (substrate-fix-v2, the absent checkpoint), or they go on an explicit "known-FAIL, rendered-red" allow-list with a tracking ticket.
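The allow-list rule is small enough to sketch in Python. The guard itself is meant to be a database view, so this only illustrates the logic; the `node_id` key and the function name are hypothetical.

```python
def deploy_guard(rows, allow_list=frozenset()):
    """Return ("PASS" | "FAIL", blocking_node_ids) for a set of invariant rows.

    A row blocks the deploy when its invariant_status starts with "FAIL_"
    and its node id is not on the explicit allow-list.
    """
    blocking = sorted(
        r["node_id"]
        for r in rows
        if r["invariant_status"].startswith("FAIL_")
        and r["node_id"] not in allow_list
    )
    return ("FAIL" if blocking else "PASS", blocking)


# The two live FAILs from the T3 row, as illustrative input.
live_rows = [
    {"node_id": "PROC:new_candidates", "invariant_status": "FAIL_COUNT_SUBSTRATE_MISMATCH"},
    {"node_id": "PROC:residual_reconcile", "invariant_status": "FAIL_COUNT_SUBSTRATE_MISMATCH"},
]
ungated = deploy_guard(live_rows)
allow_listed = deploy_guard(
    live_rows, allow_list={"PROC:new_candidates", "PROC:residual_reconcile"}
)
```

Defaulting to an empty allow-list keeps the guard closed: a FAIL ships only if someone has named it explicitly.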

What SHOULD FAIL if reliability fields are missing

The guard must assert 0 rows where reliability_label IS NULL OR source_scope IS NULL OR drill_action IS NULL OR next_route IS NULL. A count must never render without both source_scope and reliability_label (no bare numbers).
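Both halves of this rule, the zero-row assertion and the no-bare-numbers render check, can be sketched in Python. The four field names come from the rule above; the `count` key, the function names, and the demo values are hypothetical.

```python
REQUIRED_FIELDS = ("reliability_label", "source_scope", "drill_action", "next_route")


def rows_missing_fields(rows):
    """Return every row in which at least one required field is NULL (None)."""
    return [r for r in rows if any(r.get(f) is None for f in REQUIRED_FIELDS)]


def render_count(row):
    """Render a count with its provenance, refusing to emit a bare number."""
    if row.get("source_scope") is None or row.get("reliability_label") is None:
        raise ValueError("refusing to render a count without source_scope and reliability_label")
    return f"{row['count']} ({row['source_scope']}, {row['reliability_label']})"


# Illustrative rows only; the values are made up.
complete_row = {
    "count": 6,
    "reliability_label": "CANDIDATE",
    "source_scope": "demo_scope",
    "drill_action": "demo_action",
    "next_route": "/demo",
}
broken_row = dict(complete_row, source_scope=None)
```

The guard passes when `rows_missing_fields` returns an empty list; `render_count` is the last line of defence if a NULL slips through anyway.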

False-green trap specific to this surface — must be a test

The 2 FAIL nodes carry reliability_label='CANDIDATE' in _current (doc 01 R2). A naive "badge = reliability_label" UI shows them green. Test: for every node where invariant.invariant_status LIKE 'FAIL_%', the rendered status chip MUST be non-green. This test can only pass if the renderer binds invariant_status, not just reliability_label.
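A sketch of the trap and the fix, in Python. The colour names, the label-to-colour map, and both function names are hypothetical; only the CANDIDATE label and the FAIL_ prefix come from the audit.

```python
# Hypothetical label-to-colour map; the real palette is not given in the audit.
LABEL_COLORS = {"CANDIDATE": "green"}


def naive_chip_color(reliability_label):
    """The false-green trap: colour derived from reliability_label alone."""
    return LABEL_COLORS.get(reliability_label, "grey")


def chip_color(reliability_label, invariant_status):
    """Any FAIL_* invariant status overrides the label and forces a non-green chip."""
    if invariant_status.startswith("FAIL_"):
        return "red"
    return LABEL_COLORS.get(reliability_label, "grey")


def false_green_nodes(rows):
    """Node ids whose invariant FAILs but whose rendered chip is still green."""
    return [
        r["node_id"]
        for r in rows
        if r["invariant_status"].startswith("FAIL_")
        and chip_color(r["reliability_label"], r["invariant_status"]) == "green"
    ]
```

The regression test asserts that `false_green_nodes` is empty for every row of the invariant; swapping `chip_color` for `naive_chip_color` in that check would make it fail on the two live nodes.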

Regression verdict

MANAGEABLE IF GUARDED. The tests are well-defined and mostly checkable read-only today, but two enablers are missing: (1) the deploy-guard view that wires them, and (2) source access to run the grep gate (T5). Until both exist, "UI productionization passed" cannot be trusted.

knowledge/dev/reports/architecture/parallel-terminal2-ui-current-autoscale-generator-deploy-risk-audit-2026-06-05/04-regression-test-risk.md