Process Discovery — 02 Runtime/Correlation Model
02 — Runtime / Correlation Model (Workstream A)
The canonical runtime identity model for process discovery. Design only; the durable substrate is in doc 04.
2.1 Identity fields
| field | type | grain | meaning |
|---|---|---|---|
| `process_run_id` | uuid | one per process run | the run of a whole discovered process |
| `correlation_id` | text | shared across all components of a run | the wire-level join key; reuse event_outbox.correlation_id convention |
| `component_run_id` | uuid | one per component execution | a single DOT/job-step/workflow-step run |
| `parent_run_id` | uuid | nesting | parent process run for nested/sub-process runs |
| `process_candidate_code` | text | pre-canon | e.g. PROC-CAND:dot:kg |
| `process_definition_code` | text | post-canon | set only after owner birth |
| `dot_code` | text | component | DOT code / job_kind / workflow step ref |
| `event_code` | text | component | maps to event_type_registry.event_type (process.*) |
| `queue_job_id` | uuid | component | link to job_queue.job_id when a step is queue-backed |
| `input_ref` / `output_ref` | text | component | provenance of consumed/produced artifacts |
| `status` | text | both | observed/started/step_started/step_completed/completed/failed/stuck/cancelled/orphan |
| `started_at` / `ended_at` | timestamptz | both | real wall-clock |
| `error_ref` | text | both | pointer to error/last_error/system_issue |
| `evidence_ref` | jsonb | both | structured proof (rows touched, hashes, gate snapshot) |
| `source_system` | text | both | dot_runtime / job_queue / workflow_engine |
| `idempotency_key` | text | both | dedupe; mirrors job_queue.idempotency_key |
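
As a rough illustration of how the two grains relate, the fields above could be held in memory as a header record and a component record. This is only a sketch: the class names `ProcessRun` and `ComponentRun`, the defaults, and the choice of Python dataclasses are assumptions for illustration; the durable shape is defined in doc 04.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class ProcessRun:
    """Header grain: one per process run (hypothetical in-memory shape)."""
    correlation_id: str                       # wire-level join key
    source_system: str                        # dot_runtime / job_queue / workflow_engine
    process_run_id: uuid.UUID = field(default_factory=uuid.uuid4)
    parent_run_id: Optional[uuid.UUID] = None          # set for nested runs
    process_candidate_code: Optional[str] = None       # pre-canon
    process_definition_code: Optional[str] = None      # post-canon
    status: str = "observed"
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
    error_ref: Optional[str] = None
    evidence_ref: dict[str, Any] = field(default_factory=dict)
    idempotency_key: Optional[str] = None


@dataclass
class ComponentRun:
    """Component grain: one per DOT/job-step/workflow-step execution."""
    correlation_id: str                       # same value as the owning run
    source_system: str
    dot_code: str                             # DOT code / job_kind / step ref
    component_run_id: uuid.UUID = field(default_factory=uuid.uuid4)
    process_run_id: Optional[uuid.UUID] = None  # None = explicitly standalone
    event_code: Optional[str] = None
    queue_job_id: Optional[uuid.UUID] = None
    input_ref: Optional[str] = None
    output_ref: Optional[str] = None
    status: str = "step_started"
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
    error_ref: Optional[str] = None
    evidence_ref: dict[str, Any] = field(default_factory=dict)
    idempotency_key: Optional[str] = None


run = ProcessRun(correlation_id="corr-1", source_system="dot_runtime")
step = ComponentRun(
    correlation_id=run.correlation_id,
    source_system="dot_runtime",
    dot_code="EXTRACT",                       # illustrative value only
    process_run_id=run.process_run_id,
)
```

Fields marked "both" in the table appear on both records; fields marked "component" appear only on `ComponentRun`.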
2.2 Rules (the contract)
- One `process_run_id` per process run. Minted by whoever orchestrates the run.
- One `component_run_id` per component execution. Always references its `process_run_id` (or is explicitly standalone with `process_run_id IS NULL`).
- Every DOT execution belongs to a `process_run_id` or is explicitly standalone. No anonymous runs.
- Producer/verifier pairs share `correlation_id`. This is what lets EXTRACT→VALIDATE stitch into one process run, and it is the single change that moves TYPE_1 DOT families from `runtime_missing` to `verified_candidate`.
- Nested runs use `parent_run_id`. A workflow that fans out to DOT sub-runs links children up.
- Events carry `correlation_id` (and `process_run_id` in `safe_payload`). `event_outbox` already has the column; emitters populate it.
- No-hardcode. All keys are generic (`process_run_id`, `correlation_id`, `dot_code`, `run_id`, `job_kind`, `process_code`). Nothing is KG-specific.
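
The contract can be read as a set of checkable invariants. The sketch below assumes rows are available as plain dicts keyed by the field names from 2.1; the function name `contract_violations` and the message wording are hypothetical, and the check covers only the ownership and shared-`correlation_id` rules.

```python
def contract_violations(headers, components):
    """Return a list of human-readable violations of the identity rules.

    headers:    iterable of dicts with 'process_run_id' and 'correlation_id'
    components: iterable of dicts with 'component_run_id', 'process_run_id'
                (None means explicitly standalone) and 'correlation_id'
    """
    header_corr = {h["process_run_id"]: h["correlation_id"] for h in headers}
    seen_components = set()
    violations = []

    for c in components:
        cid = c["component_run_id"]

        # One component_run_id per component execution.
        if cid in seen_components:
            violations.append(f"{cid}: duplicate component_run_id")
        seen_components.add(cid)

        run_id = c.get("process_run_id")
        if run_id is None:
            continue  # explicitly standalone: allowed by the contract

        # A non-null process_run_id must point at a known run header.
        if run_id not in header_corr:
            violations.append(f"{cid}: unknown process_run_id {run_id}")
        # Components of a run share that run's correlation_id.
        elif c.get("correlation_id") != header_corr[run_id]:
            violations.append(f"{cid}: correlation_id differs from run header")

    return violations


headers = [{"process_run_id": "run-1", "correlation_id": "corr-1"}]
components = [
    {"component_run_id": "c-1", "process_run_id": "run-1", "correlation_id": "corr-1"},
    {"component_run_id": "c-2", "process_run_id": "run-1", "correlation_id": "corr-X"},
    {"component_run_id": "c-3", "process_run_id": None, "correlation_id": "corr-9"},
    {"component_run_id": "c-4", "process_run_id": "run-2", "correlation_id": "corr-1"},
]
problems = contract_violations(headers, components)
```

Here `c-1` and the standalone `c-3` pass, while `c-2` (mismatched `correlation_id`) and `c-4` (unknown run) are reported.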
2.3 Where it lives — decision
Three options were evaluated:
- Extend `dot_iu_command_run`: ❌ wrong grain. That table is the IU-command governance ledger (plan/apply/verify of governed commands), keyed by `command_name`, not DOT executions. Overloading it conflates two concerns.
- Events-only (`event_outbox`/`event_pending`): ❌ insufficient alone. Events are notifications, transient and lossy; they carry `correlation_id` but are not a queryable run ledger (no durable run header, no status lifecycle).
- New `process_run_observation` + `process_component_observation` ledger: ✅ chosen. A durable, queryable header+detail ledger that mirrors the proven `job_queue` shape (run_id, idempotency, status, timestamps, error) and reuses `event_outbox.correlation_id` as the wire key.
Recommended hybrid: observation ledger = source of truth for runs; event_outbox = the emission/notification lane (carrying the same correlation_id); job_queue.run_id continues to back queue pipelines and feeds the same correlation_id (already verified for job:cut). DDL in doc 04.
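
A minimal sketch of the hybrid, assuming in-memory lists stand in for the component ledger and for `event_outbox`: the ledger row is written first as the source of truth, and the event carries the same `correlation_id` with `process_run_id` in `safe_payload`. The function name `record_and_emit`, the dict layout, and the `process.<status>` event naming are illustrative assumptions, not the real emitter API.

```python
import uuid


def record_and_emit(ledger, outbox, *, process_run_id, correlation_id,
                    dot_code, status):
    """Write one component row, then emit a matching notification event."""
    component_run_id = str(uuid.uuid4())

    # 1. Durable record: the ledger is the source of truth for the run.
    ledger.append({
        "component_run_id": component_run_id,
        "process_run_id": process_run_id,
        "correlation_id": correlation_id,
        "dot_code": dot_code,
        "status": status,
    })

    # 2. Notification lane: same correlation_id, run id inside safe_payload.
    outbox.append({
        "event_type": f"process.{status}",   # hypothetical naming
        "correlation_id": correlation_id,
        "safe_payload": {
            "process_run_id": process_run_id,
            "component_run_id": component_run_id,
        },
    })
    return component_run_id


ledger, outbox = [], []
new_id = record_and_emit(
    ledger, outbox,
    process_run_id="run-1", correlation_id="corr-1",
    dot_code="EXTRACT", status="step_completed",
)
```

If the event is lost, the ledger row still exists; the event can be joined back to it on `correlation_id`.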
2.4 How verification flips on
A candidate becomes verified_candidate when, for one real run, the ledger shows: a process_run_id header with completed status, ≥2 component rows sharing that run's correlation_id (start + end / producer + verifier), and started_at/ended_at set. job:cut already meets the equivalent bar via job_queue.run_id — the model generalises that exact proof to the DOT layer.
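
The bar above can be expressed as a small predicate over one run's rows. This sketch assumes dict-shaped rows and reads "started_at/ended_at set" as applying to the run header; the function name `meets_verified_bar` is hypothetical.

```python
from datetime import datetime, timezone


def meets_verified_bar(header, components):
    """True if one run satisfies the verified_candidate evidence bar."""
    # Header must be a completed run with real timestamps.
    if header.get("status") != "completed":
        return False
    if header.get("started_at") is None or header.get("ended_at") is None:
        return False

    # At least two component rows must belong to this run and share
    # the run's correlation_id (start + end, or producer + verifier).
    matching = [
        c for c in components
        if c.get("process_run_id") == header["process_run_id"]
        and c.get("correlation_id") == header["correlation_id"]
    ]
    return len(matching) >= 2


now = datetime.now(timezone.utc)
header = {
    "process_run_id": "run-1",
    "correlation_id": "corr-1",
    "status": "completed",
    "started_at": now,
    "ended_at": now,
}
components = [
    {"process_run_id": "run-1", "correlation_id": "corr-1", "dot_code": "EXTRACT"},
    {"process_run_id": "run-1", "correlation_id": "corr-1", "dot_code": "VALIDATE"},
]
verified = meets_verified_bar(header, components)
```

A run with a single component row, a non-completed header, or missing timestamps does not meet the bar.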
2.5 Anti-fake rule
A run is real only if its ledger row was written by an actual execution, or by a sanctioned dry-run flagged with source_system='dry_run' and evidence_ref.dry_run=true. Discovery scoring must exclude dry-run rows from the verified_candidate bar: they prove the wiring, not the process. last_executed is never accepted as runtime evidence, because it is a backfill.
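
One way the exclusion could look in scoring code, assuming dict-shaped ledger rows. The helper names `is_dry_run` and `rows_for_scoring` are hypothetical, and treating either flag alone as sufficient to exclude a row is a conservative assumption rather than something the rule states.

```python
def is_dry_run(row):
    """True if the row carries either dry-run marker."""
    evidence = row.get("evidence_ref") or {}
    return (
        row.get("source_system") == "dry_run"
        or evidence.get("dry_run") is True
    )


def rows_for_scoring(rows):
    """Keep only rows that may count toward the verified_candidate bar."""
    return [r for r in rows if not is_dry_run(r)]


rows = [
    {"process_run_id": "run-1", "source_system": "dot_runtime", "evidence_ref": {}},
    {"process_run_id": "run-2", "source_system": "dry_run",
     "evidence_ref": {"dry_run": True}},
    {"process_run_id": "run-3", "source_system": "job_queue", "evidence_ref": None},
]
scorable = rows_for_scoring(rows)
```

Only `run-1` and `run-3` remain eligible; the dry-run row is kept in the ledger but ignored by scoring.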