KB-539B

6000x — Monitoring + Retention Productization

Revision 1
Tags: iu-core, 6000x, monitoring, retention, stale-detector, 2026-05-24

Date: 2026-05-24
Status: monitoring DONE (Mac pilot re-verified + stale_detector authored + Linux-timer package authored); retention AUTHOR_MODE (dry-run re-run; gate stays false).

Mac cron pilot re-verification

crontab -l confirms entry preserved:

*/10 * * * * /Users/nmhuyen/iu-cutter-build/repo/iu-cutter/ops/healthcheck-cron-package/scripts/iu_core_healthcheck_wrapper.sh >/dev/null

Tail of healthcheck.jsonl (4 fires before host slept; 3 shown):

2026-05-24T02:20:00Z  exit:0  overall_ok:true  7/7
2026-05-24T02:30:00Z  exit:0  overall_ok:true  7/7
2026-05-24T02:40:00Z  exit:0  overall_ok:true  7/7

Manual fire from this macro: 7/7 GREEN, exit:0.
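
As a rough illustration of how the tail above can be checked for cadence, the sketch below parses the summarized lines exactly as they are rendered in this note and computes the gap between consecutive fires. It does not read the raw healthcheck.jsonl, whose schema is not given here; the function names parse_summary_line and gaps_seconds are hypothetical.

```python
from datetime import datetime


def parse_summary_line(line):
    """Parse one summarized line, e.g.
    '2026-05-24T02:20:00Z  exit:0  overall_ok:true  7/7'.
    Assumption: this is the rendering used in this note, not the raw JSONL."""
    ts, exit_field, ok_field, checks = line.split()
    passed, total = checks.split("/")
    return {
        "ts": datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ"),
        "exit": int(exit_field.split(":", 1)[1]),
        "overall_ok": ok_field.split(":", 1)[1] == "true",
        "passed": int(passed),
        "total": int(total),
    }


def gaps_seconds(entries):
    """Seconds between each pair of consecutive entries."""
    return [
        int((b["ts"] - a["ts"]).total_seconds())
        for a, b in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    tail = [
        "2026-05-24T02:20:00Z  exit:0  overall_ok:true  7/7",
        "2026-05-24T02:30:00Z  exit:0  overall_ok:true  7/7",
        "2026-05-24T02:40:00Z  exit:0  overall_ok:true  7/7",
    ]
    entries = [parse_summary_line(line) for line in tail]
    print(gaps_seconds(entries))
```

A gap of 600 seconds between entries matches the */10 cron schedule; anything larger points to a missed firing.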

Stale-detector finding (limitation made visible)

ops/monitoring-productization/stale_detector.sh run from this macro:

STALE: last healthcheck 25702 seconds ago (> 1800)
exit=2

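This is the known Mac-sleep limitation, and the detector is working as designed: cron does not fire while the host is asleep, so no healthcheck was logged for 25702 seconds (about 7 h 8 min), far beyond the 1800-second threshold. Productization delta: the gap was previously silent and is now surfaced.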
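
The gap check can be sketched as follows. This is not the content of stale_detector.sh; it is a minimal Python illustration of the same comparison, assuming the threshold is strict (the output above prints "> 1800") and that a fresh log exits 0. The names check_staleness and elapsed_intervals, the OK message, and the timestamps in the demo are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

STALE_AFTER_S = 1800    # threshold shown in the detector output
CRON_INTERVAL_S = 600   # */10 * * * * fires every 10 minutes


def check_staleness(last_fire: datetime, now: datetime) -> Tuple[int, str]:
    """Return (exit_code, message) for the gap between last_fire and now.
    Assumption: exit 2 means stale (as in the output above), exit 0 means fresh."""
    age_s = int((now - last_fire).total_seconds())
    if age_s > STALE_AFTER_S:
        return 2, f"STALE: last healthcheck {age_s} seconds ago (> {STALE_AFTER_S})"
    return 0, f"OK: last healthcheck {age_s} seconds ago"


def elapsed_intervals(age_s: int) -> int:
    """Whole cron intervals that fit inside the gap."""
    return age_s // CRON_INTERVAL_S


if __name__ == "__main__":
    # Hypothetical clock values; only the 25702-second gap comes from the note.
    now = datetime(2026, 5, 24, 12, 0, tzinfo=timezone.utc)
    last = now - timedelta(seconds=25702)
    code, message = check_staleness(last, now)
    print(message)
    print("exit=%d" % code)
    print("whole 10-minute intervals in gap:", elapsed_intervals(25702))
```

With a 25702-second gap, 42 whole 10-minute intervals elapsed without a logged healthcheck.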

Productization package

ops/monitoring-productization/:

  • README.md — pilot vs production matrix
  • stale_detector.sh — 30-min gap detector
  • alert_on_fail.sh — wrapper with caller-provided sink (syslog | file:PATH | webhook:URL)
  • install_mac_hardened.sh — idempotent install
  • uninstall.sh
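
The caller-provided sink accepted by alert_on_fail.sh takes one of three forms: syslog, file:PATH, or webhook:URL. The sketch below shows one way such a spec could be parsed and validated. It is an illustration in Python, not the script's actual logic; Sink and parse_sink are hypothetical names, and rejecting an empty PATH or URL is an assumption.

```python
from typing import NamedTuple, Optional


class Sink(NamedTuple):
    kind: str               # "syslog", "file", or "webhook"
    target: Optional[str]   # path or URL; None for syslog


def parse_sink(spec: str) -> Sink:
    """Parse 'syslog', 'file:PATH', or 'webhook:URL' into a Sink.
    Raises ValueError for anything else, including an empty target."""
    if spec == "syslog":
        return Sink("syslog", None)
    # Split on the first colon only, so URLs keep their own colons.
    kind, sep, target = spec.partition(":")
    if sep and target and kind in ("file", "webhook"):
        return Sink(kind, target)
    raise ValueError(f"unsupported sink spec: {spec!r}")


if __name__ == "__main__":
    # Example values are placeholders, not paths or URLs from the package.
    for spec in ("syslog", "file:/tmp/alerts.log", "webhook:https://example.invalid/hook"):
        print(parse_sink(spec))
```

Splitting on the first colon only is what lets a webhook URL such as https://... pass through intact.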

Linux user-timer install for always-on host: referenced; not implemented (no user-level authority granted from inside the chain).

Retention substrate re-verification

Live fn_iu_core_retention_cleanup('iu_core_6000x_lifecycle_qdrant_ops_dryrun', true):

target_table                              cutoff      rows_eligible  rows_deleted
dot_iu_command_run                        2026-02-23  0              0
iu_three_axis_envelope_refresh_log        2026-04-24  0              0
iu_three_axis_envelope_trigger_error_log  2026-02-23  0              0

Gate: iu_core.retention_enabled = false. No rows deleted. No gate flipped.
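
The cutoff dates in the dry-run output are consistent with cutoff = run date minus retention window. The sketch below reproduces them under that assumption. The 30-day window for refresh_log is stated in this note; the 90-day window for the other two tables is inferred from the cutoffs and is not confirmed by the source. WINDOW_DAYS and cutoff_for are hypothetical names.

```python
from datetime import date, timedelta

RUN_DATE = date(2026, 5, 24)  # date of this dry run

# Retention windows in days.
# 30 is stated in the note; 90 is an inference from the reported cutoffs.
WINDOW_DAYS = {
    "dot_iu_command_run": 90,
    "iu_three_axis_envelope_refresh_log": 30,
    "iu_three_axis_envelope_trigger_error_log": 90,
}


def cutoff_for(table: str, run_date: date = RUN_DATE) -> date:
    """Rows older than this date would be eligible for cleanup."""
    return run_date - timedelta(days=WINDOW_DAYS[table])


if __name__ == "__main__":
    for table in WINDOW_DAYS:
        print(table, cutoff_for(table).isoformat())
```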

Retention enablement package

ops/retention-productization/README.md — 5 enablement gates, flip sequence, rollback. Earliest realistic flip date: 2026-06-22 (refresh_log id 18 ages past 30-day cutoff).
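
The earliest flip date follows from simple date arithmetic: a row first falls outside the window on its creation date plus the window length. The sketch below shows that calculation. The creation date of 2026-05-23 for refresh_log id 18 is an assumption back-derived from the stated flip date and the 30-day window; it is not given in the source, and the exact comparison the cleanup function uses (strict or inclusive, date or timestamp) is not stated. The function names are hypothetical.

```python
from datetime import date, timedelta


def earliest_eligible(created: date, window_days: int) -> date:
    """First date on which a row created on `created` is window_days old."""
    return created + timedelta(days=window_days)


def days_remaining(today: date, eligible_on: date) -> int:
    """Days left until eligible_on, never negative."""
    return max(0, (eligible_on - today).days)


if __name__ == "__main__":
    # ASSUMPTION: creation date inferred from the stated flip date, not from the source.
    created = date(2026, 5, 23)
    flip = earliest_eligible(created, 30)
    print(flip.isoformat())
    print(days_remaining(date(2026, 5, 24), flip))
```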

knowledge/dev/laws/dieu44-trien-khai/v0.6-iu-core-6000x-lifecycle-qdrant-ops-productization-open-goal/04-monitoring-and-retention-productization-2026-05-24.md