11000x · 05 — Healthcheck 8th surface: piece_event_runtime
What's added
cutter_agent/iu_core/healthcheck.py (the 5000x deploy-ready healthcheck) is bumped from 7 → 8 surfaces. The new 8th surface, piece_event_runtime, wraps the in-DB fn_iu_piece_event_runtime_healthcheck() so the same Python report covers it.
| Surface | Source | Verdict rule |
|---|---|---|
| 1. three_axis_cache | live drift view | unchanged |
| 2. directus_collection | iu_three_axis_envelope + permissions | unchanged |
| 3. qdrant_collection | iu_vector_sync_point indexed count | unchanged |
| 4. auto_refresh_trigger | refresh_log + error_log | unchanged |
| 5. vector_boundary | per-IU chunk_index NOT EXISTS GROUP BY HAVING | unchanged |
| 6. write_gates | 6 dot_config gate keys | unchanged |
| 7. operator_runtime | dot_iu_command_run + dot_iu_runtime_lease | unchanged |
| 8. piece_event_runtime | fn_iu_piece_event_runtime_healthcheck() flattened to a 9-column row | new |
8th-surface verdict logic
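def _verdict_piece_event_runtime(row):
    """Return (ok, detail) for the flattened piece_event_runtime row."""
    ok = bool(row.get("ok"))
    if not ok:
        # Substrate is broken: report failure without gate details.
        return False, "piece_event_runtime substrate integrity failure"
    # Substrate is intact: healthy, with gate state reported for information only.
    return True, (
        f"emit_enabled={row['emit_enabled']} "
        f"dry_run_only={row['dry_run_only']} "
        f"types={row['event_types']}"
    )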
The verdict treats ok=true (substrate intact) as healthy regardless of gate state. The gate being false is the desired production state; flipping it open is an operator-driven phase, not an unhealthy condition.
If the substrate is broken (e.g. someone manually drops the trigger), ok=false is returned and the entire healthcheck exits with code 2 (FAIL).
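How a single failing surface could translate into the process exit code can be sketched as follows. This is an illustration, not the module's actual code: exit_code_for and the verdict mapping are hypothetical, exit code 0 for success is an assumption, and only code 2 for FAIL comes from the text above. Any other exit codes the real module may use are not modelled.

```python
EXIT_OK = 0    # assumption: conventional success code, not stated in this document
EXIT_FAIL = 2  # stated above: a failing surface makes the healthcheck exit with 2

# The eight surface names from the table in this section.
SURFACES = [
    "three_axis_cache",
    "directus_collection",
    "qdrant_collection",
    "auto_refresh_trigger",
    "vector_boundary",
    "write_gates",
    "operator_runtime",
    "piece_event_runtime",
]


def exit_code_for(verdicts):
    """Return EXIT_FAIL if any surface verdict is not ok, else EXIT_OK.

    verdicts maps surface name -> (ok, detail), the same pair shape the
    verdict function returns.
    """
    return EXIT_OK if all(ok for ok, _detail in verdicts.values()) else EXIT_FAIL


# Hypothetical run where every surface is healthy.
all_ok = {name: (True, "ok") for name in SURFACES}

# Same run, but the 8th surface reports a broken substrate.
trigger_dropped = dict(all_ok)
trigger_dropped["piece_event_runtime"] = (
    False,
    "piece_event_runtime substrate integrity failure",
)

print(exit_code_for(all_ok))
print(exit_code_for(trigger_dropped))
```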
In-DB verdict shape
{
  "surface": "piece_event_runtime",
  "ok": true,
  "view_exists": true,
  "fn_emit_exists": true,
  "fn_trg_exists": true,
  "trigger_exists": true,
  "gate_keys": 2,
  "piece_event_types_active": 6,
  "emit_enabled": "false",
  "dry_run_only": "true"
}
Failure modes detected:
- view_exists=false (someone DROPped v_piece_event_outbox)
- fn_emit_exists=false (someone DROPped fn_iu_piece_emit_event)
- fn_trg_exists=false (someone DROPped fn_iu_lifecycle_log_emit_piece_event_trg)
- trigger_exists=false (someone DROPped the trigger)
- gate_keys<>2 (someone DELETEd a piece_event_runtime.* config row)
- piece_event_types_active<>6 (someone retired one of the 6 piece.* events)
Any one of these returns ok=false.
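That rule amounts to a conjunction over the six checks. The sketch below mirrors it in Python purely for illustration: the real logic lives in the in-DB function, failed_checks is a hypothetical helper, and the expected counts (2 gate keys, 6 active event types) are taken from the failure-mode list above.

```python
EXPECTED_GATE_KEYS = 2    # from the failure-mode list: gate_keys<>2 is a failure
EXPECTED_EVENT_TYPES = 6  # from the list: piece_event_types_active<>6 is a failure


def failed_checks(shape):
    """Return the names of the checks that fail for an in-DB verdict shape."""
    checks = {
        "view_exists": shape["view_exists"] is True,
        "fn_emit_exists": shape["fn_emit_exists"] is True,
        "fn_trg_exists": shape["fn_trg_exists"] is True,
        "trigger_exists": shape["trigger_exists"] is True,
        "gate_keys": shape["gate_keys"] == EXPECTED_GATE_KEYS,
        "piece_event_types_active": shape["piece_event_types_active"] == EXPECTED_EVENT_TYPES,
    }
    return [name for name, passed in checks.items() if not passed]


# The healthy shape shown in this section.
healthy = {
    "surface": "piece_event_runtime",
    "ok": True,
    "view_exists": True,
    "fn_emit_exists": True,
    "fn_trg_exists": True,
    "trigger_exists": True,
    "gate_keys": 2,
    "piece_event_types_active": 6,
    "emit_enabled": "false",
    "dry_run_only": "true",
}

# Hypothetical broken shape: trigger dropped and one config row deleted.
broken = dict(healthy, trigger_exists=False, gate_keys=1)

for label, shape in [("healthy", healthy), ("broken", broken)]:
    failures = failed_checks(shape)
    print(label, "ok =", not failures, "failed:", failures)
```

Note that emit_enabled and dry_run_only take no part in the conjunction, which matches the verdict logic: gate positions are reported but never make the surface unhealthy.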
Tests updated for the 8th surface
| Test | Bumped from → to |
|---|---|
| test_iu_core_5000x_healthcheck.py::TestSurfaceCoverage::test_expected_surface_count | 7 → 8 |
| test_iu_core_5000x_healthcheck.py::TestHappyPath::test_all_surfaces_ok | uses HAPPY_RESULTS including new piece_event_runtime happy row |
| test_iu_core_5000x_healthcheck.py::TestHappyPath::test_json_output_shape | decoded surfaces count 7 → 8 |
| tests/test_iu_core_piece_event_runtime.py::TestHealthcheckEighthSurface | 2 new tests on the verdict function + module docstring |
All passing.
Mac cron pilot impact
The 11000x healthcheck still uses the same iu_core_healthcheck_wrapper.sh from ops/healthcheck-cron-package/scripts/. The wrapper invokes python3 -m cutter_agent.iu_core.healthcheck, which now reports 8 surfaces, so no script changes are needed.