KB-1B63

Process Discovery — 02 Runtime/Correlation Model

Revision 1
Tags: process-discovery, correlation, runtime, process-run-id, model
Date: 2026-06-04

02 — Runtime / Correlation Model (Workstream A)

The canonical runtime identity model for process discovery. Design only; the durable substrate is in doc 04.

2.1 Identity fields

| field | type | grain | meaning |
|---|---|---|---|
| process_run_id | uuid | one per process run | the run of a whole discovered process |
| correlation_id | text | shared across all components of a run | the wire-level join key; reuse event_outbox.correlation_id convention |
| component_run_id | uuid | one per component execution | a single DOT/job-step/workflow-step run |
| parent_run_id | uuid | nesting | parent process run for nested/sub-process runs |
| process_candidate_code | text | pre-canon | e.g. PROC-CAND:dot:kg |
| process_definition_code | text | post-canon | set only after owner birth |
| dot_code | text | component | DOT code / job_kind / workflow step ref |
| event_code | text | component | maps to event_type_registry.event_type (process.*) |
| queue_job_id | uuid | component | link to job_queue.job_id when a step is queue-backed |
| input_ref / output_ref | text | component | provenance of consumed/produced artifacts |
| status | text | both | observed/started/step_started/step_completed/completed/failed/stuck/cancelled/orphan |
| started_at / ended_at | timestamptz | both | real wall-clock |
| error_ref | text | both | pointer to error/last_error/system_issue |
| evidence_ref | jsonb | both | structured proof (rows touched, hashes, gate snapshot) |
| source_system | text | both | dot_runtime / job_queue / workflow_engine |
| idempotency_key | text | both | dedupe; mirrors job_queue.idempotency_key |
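
To make the header/component split concrete, the identity fields can be sketched as two in-memory records. This is only an illustration under assumptions: the class names ProcessRun and ComponentRun, the default status, and the sample dot_code value "EXTRACT" are hypothetical, and the real storage shape is whatever doc 04 defines.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional

# Status vocabulary copied from the `status` row of the field list.
STATUSES = frozenset({
    "observed", "started", "step_started", "step_completed",
    "completed", "failed", "stuck", "cancelled", "orphan",
})


@dataclass
class _Shared:
    """Fields whose grain is 'both' (hypothetical base class)."""
    correlation_id: str
    source_system: str
    status: str = "observed"  # assumed default, not stated in the source
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
    error_ref: Optional[str] = None
    evidence_ref: dict[str, Any] = field(default_factory=dict)
    idempotency_key: Optional[str] = None

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


@dataclass
class ProcessRun(_Shared):
    """Header: one per process run."""
    process_run_id: uuid.UUID = field(default_factory=uuid.uuid4)
    parent_run_id: Optional[uuid.UUID] = None  # set only for nested runs
    process_candidate_code: Optional[str] = None  # pre-canon
    process_definition_code: Optional[str] = None  # post-canon


@dataclass
class ComponentRun(_Shared):
    """Detail: one per component execution."""
    component_run_id: uuid.UUID = field(default_factory=uuid.uuid4)
    process_run_id: Optional[uuid.UUID] = None  # None means explicitly standalone
    dot_code: Optional[str] = None
    event_code: Optional[str] = None
    queue_job_id: Optional[uuid.UUID] = None
    input_ref: Optional[str] = None
    output_ref: Optional[str] = None


run = ProcessRun(
    correlation_id="corr-demo-1",  # illustrative value
    source_system="dot_runtime",
    process_candidate_code="PROC-CAND:dot:kg",
)
step = ComponentRun(
    correlation_id=run.correlation_id,
    source_system="dot_runtime",
    process_run_id=run.process_run_id,
    dot_code="EXTRACT",  # hypothetical component code
)
```

The component copies the header's correlation_id rather than minting its own, which is what makes the two rows joinable on the wire key.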

2.2 Rules (the contract)

  1. One process_run_id per process run. Minted by whoever orchestrates the run.
  2. One component_run_id per component execution. Always references its process_run_id (or is explicitly standalone with process_run_id IS NULL).
  3. Every DOT execution belongs to a process_run_id or is explicitly standalone. No anonymous runs.
  4. Producer/verifier pairs share correlation_id. This is what lets EXTRACT→VALIDATE stitch into one process run — the single change that moves TYPE_1 DOT families from runtime_missing to verified_candidate.
  5. Nested runs use parent_run_id. A workflow that fans out to DOT sub-runs links children up.
  6. Events carry correlation_id (and process_run_id in safe_payload). event_outbox already has the column; emitters populate it.
  7. No-hardcode. All keys are generic (process_run_id, correlation_id, dot_code, run_id, job_kind, process_code). Nothing is KG-specific.
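
A few of these rules are mechanical enough to check in code. The sketch below covers rules 2, 4 and 5 over plain dictionaries; the function name contract_violations and the row shape are assumptions made for illustration, and it assumes the run header carries the run's correlation_id.

```python
from typing import Any

Row = dict[str, Any]


def contract_violations(headers: list[Row], components: list[Row]) -> list[str]:
    """Return readable violations of rules 2, 4 and 5 (empty list means clean)."""
    by_id = {h["process_run_id"]: h for h in headers}
    problems: list[str] = []

    for h in headers:
        parent = h.get("parent_run_id")
        # Rule 5: a nested run must point at a header that exists.
        if parent is not None and parent not in by_id:
            problems.append(f"{h['process_run_id']}: unknown parent_run_id {parent}")

    for c in components:
        run_id = c.get("process_run_id")
        if run_id is None:
            # Rule 2: standalone is allowed when process_run_id is explicitly NULL.
            continue
        header = by_id.get(run_id)
        if header is None:
            problems.append(f"{c['component_run_id']}: unknown process_run_id {run_id}")
        elif c.get("correlation_id") != header["correlation_id"]:
            # Rule 4: producer and verifier join through the shared key.
            problems.append(f"{c['component_run_id']}: correlation_id differs from its run")
    return problems


headers = [{"process_run_id": "run-1", "correlation_id": "corr-1", "parent_run_id": None}]
components = [
    {"component_run_id": "c-1", "process_run_id": "run-1", "correlation_id": "corr-1"},
    {"component_run_id": "c-2", "process_run_id": "run-1", "correlation_id": "corr-X"},
    {"component_run_id": "c-3", "process_run_id": None, "correlation_id": "corr-9"},
]
violations = contract_violations(headers, components)
```

Here only c-2 is reported: c-1 matches its run, and c-3 is standalone by explicit NULL.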

2.3 Where it lives — decision

Three options were evaluated:

  1. Extend dot_iu_command_run: ❌ wrong grain. That table is the IU-command governance ledger (plan/apply/verify of governed commands) and is keyed by command_name; it does not record DOT executions. Overloading it would conflate two concerns.
  2. Events-only (event_outbox/event_pending): ❌ insufficient alone. Events are notifications, transient and lossy; they carry correlation_id but do not form a queryable run ledger (no durable run header, no status lifecycle).
  3. New process_run_observation + process_component_observation ledger: ✅ chosen. A durable, queryable header+detail ledger that mirrors the proven job_queue shape (run_id, idempotency, status, timestamps, error) and reuses event_outbox.correlation_id as the wire key.

Recommended hybrid: observation ledger = source of truth for runs; event_outbox = the emission/notification lane (carrying the same correlation_id); job_queue.run_id continues to back queue pipelines and feeds the same correlation_id (already verified for job:cut). DDL in doc 04.
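
The division of labour in the hybrid can be sketched with two in-memory lists standing in for the ledger and the outbox. Everything here is an assumption for illustration: the function name record_and_emit, the dictionary shapes, and the event_type value, which is only a made-up name in the process.* namespace.

```python
import uuid
from typing import Any


def record_and_emit(
    ledger: list[dict[str, Any]],
    outbox: list[dict[str, Any]],
    correlation_id: str,
    status: str,
) -> uuid.UUID:
    """Append a run header to the ledger, then queue a matching event."""
    process_run_id = uuid.uuid4()
    # Ledger row first: it is the source of truth for the run.
    ledger.append({
        "process_run_id": process_run_id,
        "correlation_id": correlation_id,
        "status": status,
    })
    # The event is only a notification; it repeats the same join key
    # and carries process_run_id inside safe_payload.
    outbox.append({
        "event_type": f"process.run_{status}",  # hypothetical event name
        "correlation_id": correlation_id,
        "safe_payload": {"process_run_id": str(process_run_id)},
    })
    return process_run_id


ledger: list[dict[str, Any]] = []
outbox: list[dict[str, Any]] = []
run_id = record_and_emit(ledger, outbox, "corr-demo-1", "completed")
```

If the event is lost, the ledger row still answers "did this run happen?", which is the reason the ledger rather than the event lane is treated as authoritative.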

2.4 How verification flips on

A candidate becomes verified_candidate when, for one real run, the ledger shows: a process_run_id header with completed status, ≥2 component rows sharing that run's correlation_id (start + end / producer + verifier), and started_at/ended_at set. job:cut already meets the equivalent bar via job_queue.run_id — the model generalises that exact proof to the DOT layer.
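
The bar above can be written as a small predicate. This is a minimal sketch, assuming rows are plain dictionaries and that started_at/ended_at are checked on the header; the function name is_verified_candidate is hypothetical, and the dry-run exclusion from section 2.5 is deliberately left out here.

```python
from datetime import datetime, timezone
from typing import Any

Row = dict[str, Any]


def is_verified_candidate(header: Row, components: list[Row]) -> bool:
    """Apply the section 2.4 bar to one run header and its component rows."""
    if header.get("status") != "completed":
        return False
    if header.get("started_at") is None or header.get("ended_at") is None:
        return False
    # Count component rows that belong to this run and share its join key.
    matching = [
        c for c in components
        if c.get("process_run_id") == header["process_run_id"]
        and c.get("correlation_id") == header["correlation_id"]
    ]
    return len(matching) >= 2


# Illustrative data only; identifiers and timestamps are made up.
header = {
    "process_run_id": "run-1",
    "correlation_id": "corr-1",
    "status": "completed",
    "started_at": datetime(2026, 6, 4, 9, 0, tzinfo=timezone.utc),
    "ended_at": datetime(2026, 6, 4, 9, 5, tzinfo=timezone.utc),
}
components = [
    {"component_run_id": "c-1", "process_run_id": "run-1", "correlation_id": "corr-1"},
    {"component_run_id": "c-2", "process_run_id": "run-1", "correlation_id": "corr-1"},
]
verified = is_verified_candidate(header, components)
```

A producer row and a verifier row with the same key are enough to pass; a single row, a missing timestamp, or a non-completed header each fail the check.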

2.5 Anti-fake rule

A ledger row counts as a real run only if an actual execution wrote it. Sanctioned dry-runs may also write rows, but they must be flagged with source_system='dry_run' and evidence_ref.dry_run=true, and discovery scoring must exclude them from the verified_candidate bar: they prove the wiring, not the process. last_executed is never accepted as runtime evidence (it is a backfill).
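
A scoring pass could apply this rule as a filter before any verification check. The sketch below is an assumption-laden illustration: the helper names is_dry_run and scoreable_rows are hypothetical, and it conservatively treats a row as a dry-run if either flag is present, even though a sanctioned dry-run is expected to carry both.

```python
from typing import Any

Row = dict[str, Any]


def is_dry_run(row: Row) -> bool:
    """True if either dry-run marker is set on the row."""
    evidence = row.get("evidence_ref") or {}
    return row.get("source_system") == "dry_run" or evidence.get("dry_run") is True


def scoreable_rows(rows: list[Row]) -> list[Row]:
    """Keep only rows that may count toward the verified_candidate bar."""
    return [r for r in rows if not is_dry_run(r)]


rows = [
    {"process_run_id": "run-1", "source_system": "dot_runtime", "evidence_ref": {}},
    {"process_run_id": "run-2", "source_system": "dry_run", "evidence_ref": {"dry_run": True}},
    {"process_run_id": "run-3", "source_system": "job_queue", "evidence_ref": None},
]
kept = [r["process_run_id"] for r in scoreable_rows(rows)]
```

Filtering on either flag means a half-flagged dry-run is still excluded rather than silently scored as a real run.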

knowledge/dev/reports/architecture/process-discovery-correlation-runtime-inventory-fix-2026-06-04/02-runtime-correlation-model.md