R2a — Birth Inspection Runner / Cron / Log Root-Cause Study (2026-06-18)
Date: 2026-06-18 · Workstream: R2a (read-only runner/cron/log root-cause study, run ∥ R1a, after accepted R1/R2 read-only scoping baseline + Codex PASS_WITH_CAVEATS) · Revision: rev1
Class: read-only root-cause study / runtime-evidence / Owner-decision-prep
READ-ONLY · NON-ENACTING · NON-AUTHORIZING · NOT remediation · NOT technical design · NOT implementation · NO blocker resolved · NO restart · NO certify/inspect/birth/promote execution.
0. Status and non-authorization
STATUS: PASS — the R2a root cause is identified with direct, first-hand read-only evidence. R2's open question ("why did the 2026-03-21 inspection batch never recur; was it manual / cron / runner / migration; can the GUC values be confirmed") is now answered from the live PostgreSQL catalog, the DB-captured host-crontab snapshot, and the deployed birth-DOT producer scripts.
PASS here is an engineering root-cause statement only — not an Owner authorization to restart or remediate. This run did not restart the pipeline, did not set inspect_pen/inspect_stamp/inspect_gate, did not set certified=true, did not trigger birth/certify/promote, did not write the DB, did not patch source or any prior report; it created only the three allowed reports. Engineering verification ≠ Authority approval. Every blocker remains OPEN.
Non-authorization (explicit). This report does not and cannot: run any write/DDL/DML; restart/reload any container/service; run any worker/cron/job; set inspect_pen/inspect_stamp/inspect_gate; set certified=true; trigger birth / certify / promote / repair; re-run dot-birth-backfill / dot-inspect-pen; flip any dot_config gate; materialize BIRTH_STAMP/PROMOTE_STAMP / cell_id / canonical_fields; patch source / law / draft / note / prior report; create a current corpus; write technical design; change authority order (CONS-004); change the v0.1 baseline; promote v0.2-hardening.
1. Sources read
Same baseline set as R1a §1 (Phase-1B / R1 / R2 / R1-R2 exec / Phase-1 / Codex PASS_WITH_CAVEATS / incomex-rules.md / CLAUDE.md). R2 anchors carried: notes/dieu4-birth-process-compatibility-note.md, amendments/l4-birth-gate-extension-amendment-draft.md, architecture/birth-registry-law.md (Điều 0-G, dated 2026-03-21), laws/law-04-birth-process.md.
Codex caveat honored: Codex recommended "R2 needs read-only DOT-runner/cron/log study for the inspection-stage producer and the 2026-03-21 one-shot batch." This run executes exactly that, read-only.
Deployed producer source read first-hand (synced local mirror of the deployed /opt/incomex/dot/bin): web-test/dot/bin/dot-inspect-pen (204 lines, v1.0.0, 2026-03-21, "S157-A: Pen Inspection"), web-test/dot/bin/dot-birth-backfill (211 lines, v1.0.0, 2026-03-21, "S157-A: Backfill"), dot-birth-trigger-setup (275 lines) present. These are the inspection/certification producers (§5, §7).
2. Commands and evidence ledger
Read-only proof identical to R1a §2 (single-statement SELECT/catalog reads against directus via query_pg, READ ONLY txn, read-only role; docker_logs/list_docker tail/list only; local Read/ls/find/wc). Session window 2026-06-17 23:49 → 2026-06-18 01:30 UTC. No write/DDL/DML/execution/restart made or prepared.
Tool-boundary note (contingency). No crontab -l / systemctl / docker exec / docker inspect tool exists in the available surface. Host cron/systemd was inspected via the read-only wf_host_crontab_snapshot table (a DB-captured snapshot of the host crontab, observed 2026-06-17 02:10:04) — this is the authoritative read-only window into host cron. read_file is allowlisted to /opt/incomex/docs, /opt/incomex/dot/specs, /var/log/nginx only — env files and /opt/incomex/dot/bin are not readable on the VPS, so the producer scripts were read from the synced local mirror, and the live app.birth_gate_mode/app.bypass_birth_gate GUCs are inspected via pg_settings/pg_db_role_setting (§9). Container logs do not reach back to 2026-03-21 (tail-only, ≤500 lines) → the 2026-03-21 batch is established from birth_registry provenance + the producer script, not container logs.
| ID | Command (abbrev.) | Target | Read-only? | Exit | Used for |
|---|---|---|---|---|---|
| D0 | list_docker | VPS Docker | yes | ok (11) | §3 |
| Q1 | pg_extension | directus | yes | ok (4; no pg_cron) | §6 |
| Q4 | dot_tools birth-DOT detail (%BIRTH%) | directus | yes | ok (5) | §5 |
| Q5/Q8 | run/queue/log table inventory + columns | directus | yes | ok (68/154) | §4 |
| Q9 | dot_config runner/queue/birth keys | directus | yes | ok (19) | §4,§8 |
| Q10 | birth_registry certified GROUP BY origin | directus | yes | ok (3) | §7 |
| Q11 | v_process_discovery_runner_status | directus | yes | ok (17) | §4,§5 |
| Q12 | wf_host_crontab_snapshot (host cron) | directus | yes | ok (54) | §6 |
| Q13 | queue_heartbeat | directus | yes | ok (3) | §4,§8 |
| Q14 | recency/count union (9 run/log tables) | directus | yes | ok (9) | §8 |
| Q21 | event_outbox breakdown | directus | yes | ok (8) | §8 |
| Q22 | pg_settings WHERE name LIKE 'app.%' | directus | yes | ok (0) | §9 |
| Q23 | pg_db_role_setting | directus | yes | ok (0) | §9 |
| G1–G3 | docker_logs postgres/directus/agent-data (tail 5) | VPS | yes | ok | liveness baseline |
| E5–E8 | local find/ls/wc/Read of birth-DOT scripts | local mirror | yes | ok | §5,§7 |
3. Container / service inventory for birth inspection
list_docker (D0) — relevant to birth/certify:
| Container | Image | Status | Role for birth-certify |
|---|---|---|---|
| postgres | postgres:16 | Up 2 months (healthy) | hosts birth_registry, trg_birth_auto_certify, 192 birth triggers, runner tables |
| incomex-directus | directus/directus:11.5 | Up 5 weeks (healthy) | CMS; births fire on its INSERTs via triggers |
| incomex-agent-api-executor | agent-api-executor-local:v1 | Up 13 days (healthy), :8090 | generic agent/DOT dispatch runner; not bound to any birth DOT (§5) |
There is no dedicated birth-inspection service/container. The birth INSERT/auto-certify fabric lives inside postgres (triggers + fn_birth_auto_certify); the inspection producers are operator-run CLI scripts under /opt/incomex/dot/bin (§5/§7), not a deployed service. The deployed runtime root is /opt/incomex (containers + dot/bin + deploys/web-test); local web-test/ is the synced source of those scripts.
4. Runner / scheduler / worker discovery
Same runner substrate as R1a §4. Master switches OFF (process_dot_runtime.execute_enabled/real_run_enabled=false, dry_run_only=true; queue.worker.enabled=false; queue.job_substrate.enabled=false). Queue idle since 2026-05-26 (queue_heartbeat: 3 executors, none birth-related, last tick 2026-05-22→26). event_outbox grows live (215,588 rows, last 2026-06-18 01:00) but nothing drains it — and it carries no birth/certify events (§8).
Runner classification (Q11): PROC-CAND:dot:birth = 2 members (the BIRTH-VERIFY/GATE DOTs), runner_kind = engine_unclassified, dryrun_capability = requires_runner. No runner is classified or bound for the birth inspector DOTs — in contrast to PROC-CAND:job:cut (external_queue_runner) and dot:kg (mixed_engine_partial_runner). The birth inspector family is the least-wired of all.
5. DOT-TAC-BIRTH-VERIFY / GATE wiring check
dot_tools birth-DOT detail (Q4):
| code | tier | domain | operation | trigger_type | cron | file_path | execution_engine | coverage | last_executed |
|---|---|---|---|---|---|---|---|---|---|
| DOT-TAC-BIRTH-VERIFY | (null) | data_quality | verify | cron | `0 6 * * *` | NULL | (none) | partial | NULL |
| DOT-TAC-BIRTH-GATE | (null) | data_quality | gate | event | - | NULL | (none) | partial | NULL |
| DOT_BIRTH_BACKFILL | B | lifecycle | - | NULL | - | opt/incomex/dot/bin/dot-birth-backfill | - | NULL | |
| DOT_BIRTH_TRIGGER_SETUP | B | lifecycle | - | NULL | - | opt/incomex/dot/bin/dot-birth-trigger-setup | - | NULL | |
| DOT_SCHEMA_BIRTH_REGISTRY_ENSURE | B | collection | - | NULL | - | opt/incomex/dot/bin/dot-schema-birth-registry-ensure | - | NULL | |
Reading. DOT-TAC-BIRTH-VERIFY and -GATE are registration stubs only: no file_path, no script_path, no execution_engine, no agent-api contract (the only two contracts are the KG EXPLAIN pair — R1a §5), coverage_status=partial, last_executed=NULL. They are not wired to any runner (Q11: engine_unclassified/requires_runner). The cron='0 6 * * *' on DOT-TAC-BIRTH-VERIFY is a metadata field in dot_tools, not an installed schedule (§6 proves it is not in host cron and there is no pg_cron). The three DOT_BIRTH_* lifecycle DOTs point at /opt/incomex/dot/bin/* operator scripts (manual CLIs — §7), also never recorded as executed.
Is there an actual birth inspection runner? No. The inspector DOTs are unwired metadata stubs; the only inspection producers are manual CLI scripts (§7); the auto-certify consumer trigger is live but starved.
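The reading above can be restated as a small classification over the Q4 rows. This is a minimal sketch, not the deployed logic: the `DotTool` type, the `wiring_state` function, and the three state labels are hypothetical names introduced only for illustration, and only the three columns used in the reading (`file_path`, `execution_engine`, `last_executed`) are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DotTool:
    """Hypothetical in-memory view of one dot_tools row (subset of columns)."""
    code: str
    file_path: Optional[str]
    execution_engine: Optional[str]
    last_executed: Optional[str]


def wiring_state(tool: DotTool) -> str:
    """Classify a row using the same criteria as the reading in this section."""
    if tool.last_executed is not None:
        return "executed"
    if tool.file_path is None and tool.execution_engine is None:
        return "unwired_stub"
    # Has a script path but no recorded execution: an operator-run CLI.
    return "manual_cli_never_recorded"


# Values transcribed from the Q4 table; empty markers are modelled as None.
Q4_ROWS = [
    DotTool("DOT-TAC-BIRTH-VERIFY", None, None, None),
    DotTool("DOT-TAC-BIRTH-GATE", None, None, None),
    DotTool("DOT_BIRTH_BACKFILL", "opt/incomex/dot/bin/dot-birth-backfill", None, None),
    DotTool("DOT_BIRTH_TRIGGER_SETUP", "opt/incomex/dot/bin/dot-birth-trigger-setup", None, None),
    DotTool("DOT_SCHEMA_BIRTH_REGISTRY_ENSURE",
            "opt/incomex/dot/bin/dot-schema-birth-registry-ensure", None, None),
]

states = {tool.code: wiring_state(tool) for tool in Q4_ROWS}
for code, state in states.items():
    print(f"{code}: {state}")
```

Under these assumptions the two DOT-TAC-BIRTH-* rows classify as stubs and the three DOT_BIRTH_* / schema rows as manual CLIs with no recorded execution; no row classifies as executed.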
6. Cron / systemd / timer check
- pg_cron: NOT installed (Q1: `pg_extension` = btree_gist, pgcrypto, plpgsql, postgres_fdw only). So no in-database scheduler runs any `dot_tools.cron_schedule`. The `0 6 * * *` / `0 */6 * * *` schedules in `dot_tools` are inert metadata.
- Host crontab (Q12, `wf_host_crontab_snapshot`, observed 2026-06-17 02:10:04, 54 entries, all `status=OBSERVED`): the `0 6 * * *` slot is bound to `/opt/incomex/dot/bin/dot-nrm-lifecycle --local` (NRM lifecycle), not `DOT-TAC-BIRTH-VERIFY`. Scanning all 54 entries: the host cron runs maintenance/scanner/backup DOTs only, e.g. `dot-hc-executor` (`0 */3`), `dot-orphan-scanner`, `dot-misclass-scanner`, `dot-nrm-verify/sync/lifecycle/discover`, `dot-sync-orphan-scan` (DOT-317), `dot-trigger-guard` (DOT-316), `dot-matrix-health` (DOT-315), `dot-gov-verify`, `dot-vector-audit` (DOT-089), `dot-pivot-health` (DOT-114), `dot-accuracy-verify` (DOT-152), `fn_expire_stale_approvals` (`0 5`), `auto_apply_approval` (`30 4`), matview-refresh, backups, certbot. No host cron entry references birth / inspect / certify / `inspect_pen|stamp|gate` or any `kg.*` DOT. Only 6 of 54 entries map to a `dot_tools` code (DOT-317/316/315/089/114/152 + DOT_GOV_VERIFY); none is a birth inspector.
- systemd timers: not directly inspectable (no `systemctl` tool); however the host-crontab snapshot is the wiring channel actually used for scheduled DOTs, and several entries explicitly guard on `/run/systemd/system` for OS housekeeping only. No birth/inspect systemd unit is referenced anywhere in the captured wiring → NOT_FOUND for a birth-inspect timer (recorded as gap, not inferred presence).
Is cron 0 6 * * * wired outside dot_tools? Yes, but to dot-nrm-lifecycle, not to birth-verify. There is no scheduled wiring (host cron or pg_cron) for any birth inspection/certification job.
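The two checks made against the crontab snapshot (who owns a given schedule slot, and whether any entry mentions birth/inspect/certify) can be sketched as plain filters. This is an illustrative sketch only: the `(schedule, command)` tuple shape, the function names, and the regular expression are assumptions, not the actual columns of `wf_host_crontab_snapshot` or the query that was run. The sample holds two entries named in this section; the second keeps its schedule abbreviated exactly as reported.

```python
import re
from typing import List, Tuple

CronEntry = Tuple[str, str]  # (schedule, command); hypothetical shape

# Hypothetical keyword pattern for birth-related wiring.
BIRTH_PATTERN = re.compile(r"birth|inspect|certif", re.IGNORECASE)


def slot_owner(entries: List[CronEntry], schedule: str) -> List[str]:
    """Return the commands bound to an exact schedule string."""
    return [command for sched, command in entries if sched == schedule]


def birth_entries(entries: List[CronEntry]) -> List[str]:
    """Return the commands whose text mentions birth, inspect, or certify."""
    return [command for _, command in entries if BIRTH_PATTERN.search(command)]


# Two entries taken from this section, not the full 54-row snapshot.
SAMPLE: List[CronEntry] = [
    ("0 6 * * *", "/opt/incomex/dot/bin/dot-nrm-lifecycle --local"),
    ("0 */3", "dot-hc-executor"),  # schedule abbreviated as in this report
]

print(slot_owner(SAMPLE, "0 6 * * *"))
print(birth_entries(SAMPLE))
```

The pattern uses `certif` rather than `cert` so that housekeeping entries such as certbot are not counted as certification jobs.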
7. Logs around 2026-03-21 06:00–08:00 (the certification batch)
Container logs are tail-only and cannot reach 2026-03-21; there is also no birth-verify log path (no cron entry, §6). The batch is therefore reconstructed from (a) birth_registry provenance and (b) the producer script content — which agree exactly.
(a) Certified-row provenance (Q10): all 1,402 certified rows have user_created=NULL and split into three dot_origin buckets, all certified on 2026-03-21:
| dot_origin | rows | certified_at window | born_at window |
|---|---|---|---|
| backfill:s157b | 1272 | 2026-03-21 06:28:36 → 06:28:54 | 2026-02-17 → 2026-03-21 06:28 |
| backfill:dot-birth-backfill | 112 | 2026-03-21 06:00:38 | 2026-03-06 → 2026-03-17 |
| `SYSTEM-s157b\|claude\|2026-03-21` | 18 | | |
(b) Producer script (dot-birth-backfill, v1.0.0, 2026-03-21, "S157-A"): a manual bash CLI (dot-birth-backfill [--cloud|--local] [--dry-run]) that authenticates, lists governed collections from collection_registry, then for each collection runs, via ssh -i contabo_vps root@38.242.240.89 → docker exec -i postgres psql, INSERT INTO birth_registry (… dot_origin, certified, certified_at, inspect_pen, inspect_stamp, inspect_gate) SELECT … 'backfill:dot-birth-backfill', true, now(), now(), now(), now() … ON CONFLICT (entity_code) DO NOTHING. That is, the backfill stamps all three inspect_* columns and certified=true directly in the INSERT. Its closing "NEXT STEPS" note reads: "1. dot-inspect-pen — Run pen inspection on uncertified births." The sibling dot-inspect-pen (also v1.0.0, 2026-03-21, "S157-A") is a manual CLI that runs a DO $$ loop over birth_registry WHERE inspect_pen IS NULL AND governance_role='governed', and on pass UPDATE birth_registry SET inspect_pen=now() (Phase A; species/stamp/gate explicitly deferred to "Phase B").
Conclusion (R2a Q5). The 2026-03-21 06:00–08:00 batch was a manual, one-shot S157-A bring-up — dot-birth-backfill (the backfill:* rows, stamping inspect_*+certified in the INSERT) plus an s157b seed step (backfill:s157b, SYSTEM-s157b|claude|2026-03-21) — executed by an operator over SSH+docker exec on the same day Điều 0-G (birth-registry-law) was authored. It was neither a cron job, a runner job, nor a database migration; it was an operator-run CLI bootstrap.
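The insert semantics described in (b) can be modelled in memory to show why the backfill was self-certifying and idempotent. This is a simplified sketch under stated assumptions: the dict-based `registry`, the `backfill` function, and the `E-1` / `E-2` entity codes are hypothetical, the real script runs SQL over SSH, and nothing here touches a database.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


def backfill(registry: Dict[str, dict], entity_codes: Iterable[str], now: datetime) -> int:
    """Model of INSERT ... ON CONFLICT (entity_code) DO NOTHING.

    New rows are written already certified with all three inspect_* stamps;
    rows whose entity_code already exists are left exactly as they were.
    """
    inserted = 0
    for code in entity_codes:
        if code in registry:  # conflict on entity_code: do nothing
            continue
        registry[code] = {
            "dot_origin": "backfill:dot-birth-backfill",
            "certified": True,
            "certified_at": now,
            "inspect_pen": now,
            "inspect_stamp": now,
            "inspect_gate": now,
        }
        inserted += 1
    return inserted


# Hypothetical registry: E-1 already exists as an uncertified birth.
registry = {"E-1": {"certified": False, "inspect_pen": None}}
stamp = datetime(2026, 3, 21, 6, 0, 38, tzinfo=timezone.utc)

first_run = backfill(registry, ["E-1", "E-2"], stamp)
second_run = backfill(registry, ["E-1", "E-2"], stamp)
print(first_run, second_run, registry["E-1"]["certified"], registry["E-2"]["certified"])
```

In this model a second run inserts nothing, and a pre-existing uncertified row is never upgraded by the backfill path; whether that holds for the live table depends on the actual conflict target and constraints, which this sketch does not verify.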
8. Logs / current evidence after 2026-03-21
- Births continue live, uncertified: `birth_registry` = 1,402 certified (all 2026-03-21) vs 1,211,557 uncertified (0 inspect stamps; last born 2026-06-17 13:30; R2 carry); 192 birth triggers (191 enabled) mint `certified=false` rows continuously.
- No producer ran since: `dot-inspect-pen` / `dot-birth-backfill` have no host-cron entry, no pg_cron, no runner binding; `DOT-TAC-BIRTH-VERIFY` has `last_executed=NULL`. There is no recurring inspection producer, so `inspect_pen/stamp/gate` are never set for new births.
- Consumer is healthy but starved: `trg_birth_auto_certify → fn_birth_auto_certify` (enabled) only flips `certified=true` once all three inspect_* are present; with no producer, it never fires for the 1.21M post-cutover rows.
- Queue/runner idle (Q13/Q14/Q21): no `queue_heartbeat` tick since 2026-05-26; `job_queue` / `dot_iu_command_run` idle since 2026-05-26/28; `event_outbox` grows (215,227 of 215,588 rows are `system/issue_opened`, live to 2026-06-18 01:00) but is undrained and carries no birth/certify events; the IU pilot events (2026-05-22→27) are the only non-system events and are idle. So nothing downstream is poised to resume certification.
Why did it never recur? Because the inspect→certify producer was never operationalized into a recurring runner: the only producers are manual CLIs nobody scheduled, the inspector DOTs are unwired stubs, there is no pg_cron, and host cron has no birth entry. The 2026-03-21 batch was a bootstrap that was never converted into a standing process.
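The starvation described above follows from the consumer's precondition: certification needs all three inspection stamps. A minimal sketch of that precondition, assuming only what this report states about `fn_birth_auto_certify`; the function name `would_auto_certify`, the dict row shape, and the placeholder timestamp string are hypothetical.

```python
from typing import Mapping, Optional

INSPECT_COLUMNS = ("inspect_pen", "inspect_stamp", "inspect_gate")


def would_auto_certify(row: Mapping[str, Optional[str]]) -> bool:
    """True only when every inspect_* column carries a value."""
    return all(row.get(column) is not None for column in INSPECT_COLUMNS)


STAMP = "ts"  # placeholder for any non-null timestamp

# A trigger-minted birth: no stamps at all.
new_birth = {column: None for column in INSPECT_COLUMNS}
# A row after a pen-only pass (Phase A of dot-inspect-pen sets inspect_pen only).
pen_only = {**new_birth, "inspect_pen": STAMP}
# A row stamped the way the 2026-03-21 backfill stamped it.
fully_stamped = {column: STAMP for column in INSPECT_COLUMNS}

print(would_auto_certify(new_birth), would_auto_certify(pen_only), would_auto_certify(fully_stamped))
```

Read this way, a pen-only pass would still leave a row short of the precondition, since stamp and gate are deferred to "Phase B" in the script; that is an inference from the two descriptions in this report, not an observed run.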
9. app.birth_gate_mode / app.bypass_birth_gate config evidence
R2 could not read these GUCs (query_pg denies current_setting() outside its safe-param allowlist). R2a confirms their persisted state from the catalog instead:
- `pg_settings WHERE name LIKE 'app.%'` (Q22) → 0 rows: neither `app.birth_gate_mode` nor `app.bypass_birth_gate` is a server-level / `postgresql.conf` custom GUC.
- `pg_db_role_setting` (Q23) → 0 rows: no `ALTER DATABASE` / `ALTER ROLE` persisted default for any GUC.
Reading. There is no persisted value for either GUC at server, database, or role scope. Per the fn_birth_gate body (Phase-1 §6.4 carry), it reads current_setting('app.birth_gate_mode', true) (missing_ok) defaulting to 'warning' and current_setting('app.bypass_birth_gate', true) → bypass only if 'true'/'1'. With no persisted default and no env-file readable (allowlist), the effective birth-gate mode is warning (fail-open warn-mode) and the bypass kill-switch is NOT engaged by any persisted config. The only way a different live value could exist is a transient session-level SET, which query_pg cannot read — that residual remains an EVIDENCE_GAP, but there is no persisted bypass and no config/env source that sets one. (The producer scripts dot-inspect-pen/dot-birth-backfill do not set these GUCs.)
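The resolution rule described in this reading can be written out as a small function. This is a hedged sketch of the described behaviour, not the body of `fn_birth_gate`: `resolve_birth_gate` and the dict standing in for session settings are hypothetical, a missing key models `current_setting(name, true)` returning NULL, and `"block"` is used only as an example of a non-default mode value.

```python
from typing import Mapping, Tuple


def resolve_birth_gate(settings: Mapping[str, str]) -> Tuple[str, bool]:
    """Return (effective_mode, bypass_engaged) for a set of visible settings.

    A missing app.birth_gate_mode falls back to 'warning'. Bypass is engaged
    only for the literal values 'true' or '1'. How the real function treats an
    empty string is not known from this report and is not modelled here.
    """
    mode = settings.get("app.birth_gate_mode")
    if mode is None:
        mode = "warning"
    bypass = settings.get("app.bypass_birth_gate") in ("true", "1")
    return mode, bypass


# Persisted layer as observed in Q22/Q23: nothing set at any scope.
print(resolve_birth_gate({}))

# Hypothetical transient session values, which the available tools cannot read.
print(resolve_birth_gate({"app.birth_gate_mode": "block", "app.bypass_birth_gate": "1"}))
```

With an empty mapping the sketch yields warn-mode with bypass off, matching the persisted-layer conclusion; the second call only illustrates why a session-level SET remains a residual gap.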
Can the GUC values be confirmed from config/env? Partially: the persisted (server/db/role) layer is confirmed empty (no bypass). The transient session layer is not readable via the available tools → residual gap; confirming it requires an out-of-band check the Owner controls.
10. Root-cause verdict
Verdict: The birth-certification pipeline stalled because it was never a standing pipeline. The 2026-03-21 certification was a one-shot, operator-run S157-A bootstrap (dot-birth-backfill + s157b seed via SSH+docker exec) that stamped inspect_*+certified directly; it never recurred because the inspect→certify producer was never operationalized into a runner — the inspector DOTs (DOT-TAC-BIRTH-VERIFY/GATE) are unwired metadata stubs (no file/script, no contract, engine_unclassified/requires_runner), there is no pg_cron, the host 0 6 * * * slot belongs to dot-nrm-lifecycle, and the only producers are manual CLIs nobody scheduled. The auto-certify consumer trigger is healthy but starved. (Confidence: High.)
Precisely:
- Is there an actual birth inspection runner? No: only manual CLI producers (`dot-inspect-pen`, `dot-birth-backfill`) + an unwired registration stub (`DOT-TAC-BIRTH-VERIFY/GATE`); no service, no scheduler.
- Is cron `0 6 * * *` wired outside `dot_tools`? Yes, to `dot-nrm-lifecycle`, not birth-verify; no pg_cron; no birth entry anywhere in host cron.
- Was the 2026-03-21 batch manual / cron / runner / migration / unknown? Manual: operator-run S157-A backfill+seed via SSH+`docker exec` (provenance tokens `backfill:dot-birth-backfill`, `backfill:s157b`, `SYSTEM-s157b|claude|2026-03-21`; script content matches exactly).
- Why did it never recur? The producer was never wired to a recurring runner/cron; inspector DOTs are stubs; no pg_cron; host cron has no birth job.
- Can the GUC values be confirmed from config/env? Persisted layer confirmed empty (no bypass; warn-mode default); transient session value not readable (residual gap).
- What must happen before write-enabled R2? See §13.
Consistent with R2-F1/F2/F3/F4 and Phase-1B (HOLD-2 PARTIAL); R2a converts R2's "single 2026-03-21 batch, producer absent" into a concrete cause: manual bootstrap, never operationalized. No contradiction.
11. Findings
| ID | Severity | Summary | Blocks write-enabled remediation? |
|---|---|---|---|
| R2a-F1 | HIGH | No standing inspection producer/runner: dot-inspect-pen/dot-birth-backfill are manual SSH+docker exec CLIs; DOT-TAC-BIRTH-VERIFY/GATE are unwired stubs (engine_unclassified/requires_runner, no file/contract, last_executed=NULL). | Yes: R2 must design+build a real producer/runner before certify can resume |
| R2a-F2 | HIGH | The 2026-03-21 certification was a one-shot manual S157-A bootstrap (dot-birth-backfill stamps `inspect_*`+certified in the INSERT; `backfill:*` / `SYSTEM-s157b\|claude\|2026-03-21` origins), not a cron/runner/migration; hence non-recurring by construction. | Yes: confirms there is no pipeline to "restart," only one to build |
| R2a-F3 | HIGH | Cron not wired: no pg_cron; host `0 6 * * *` = dot-nrm-lifecycle; no host-cron / systemd entry references birth/inspect/certify; dot_tools.cron_schedule is inert metadata. | Yes: scheduling channel must be chosen+wired under Owner authority |
| R2a-F4 | HIGH | Backlog grows: 1,211,557 uncertified births (live to today) via 192 triggers default certified=false; auto-certify consumer enabled but starved; event_outbox undrained, no birth events. | Yes (birth-dependent / canonical-dependent TD) |
| R2a-F5 | MEDIUM/INFO | GUC persisted layer empty (pg_settings app.%=0, pg_db_role_setting=0) → birth-gate effective mode warning, bypass not engaged by persisted config; transient session value unreadable (residual gap). | No (informs Owner GUC decision; not a runner blocker) |
| R2a-G1 | INFO (gap) | Container logs cannot reach 2026-03-21 (tail-only) and /opt/incomex/dot/bin + env files are outside the read_file allowlist → producer scripts read from synced local mirror; batch reconstructed from provenance + script (which agree). | No (does not change verdict) |
No CRITICAL. No active mutation, certify, inspect-write, or birth execution observed. No finding marked resolved.
12. What remains blocked
Every blocker stays OPEN: HOLD-2 birth/stamp path PARTIAL; R2-F1/F2/F3/F4 (HIGH, birth-dependent) carry forward. Birth-dependent / canonical-dependent technical design remains GATED. Forbidden until the Owner opens a write-enabled R2 workstream: setting inspect_pen/stamp/gate; setting certified=true; re-running dot-birth-backfill/dot-inspect-pen; building/wiring an inspection runner or cron; certify/promote execution; BIRTH_STAMP/PROMOTE_STAMP materialization; Điều 0-G schema change. CONS-002/003 + CELL-003/004/007 and the Điều 0-G source-recovery item remain prerequisites to any R2 materialization.
13. Owner decisions required
- R2a-OD-1 (carried from R2-OD-a, now answered → decision): the read-only root-cause of the inspection-stage producer is complete: it is a manual bootstrap, never operationalized. The Owner decides whether to open a write-enabled R2 that (separately, governed) designs + builds a standing inspect→certify producer/runner (choosing a channel: host cron / pg_cron / agent-api executor / job_queue worker) and wires `DOT-TAC-BIRTH-VERIFY`. No build, restart, or backfill is authorized here.
- R2a-OD-2 (backlog disposition): decide how the 1,211,557 uncertified births are handled once a producer exists (re-run `dot-inspect-pen` style inspection vs a re-architected gate); to be designed, not built, after the package opens.
- R2a-OD-3 (GUC): confirm `app.birth_gate_mode` / `app.bypass_birth_gate` out-of-band (transient session layer not readable here) and decide the warn→block criteria.
- OD-8 (carried): confirm CONS-002/003 + CELL-003/004/007 + Điều 0-G source-recovery remain prerequisites to any R2 materialization.
Restart / repair / producer-build / backfill / inspect-writes / certify = later, separate, write-enabled Owner workstreams. Engineering/Codex PASS ≠ Owner authorization.
14. Next recommended action
- GPT reviews this R2a report (with R1a and the combined execution report).
- If accepted, Codex adversarial control review.
- Owner decides R2a-OD-1/2/3 + OD-8: open write-enabled R2 (design+build a standing inspection producer/runner under governance), keep read-only, or sequence R2 relative to R1.
Default disposition: HOLD. PASS = engineering root cause identified; it is not Owner authorization. No blocker resolved; no inspect/certify writes; no producer built or restarted; TD remains gated.