KG/dot-kg Discovery — 06 Process Discovery Engine v1 Design
06 — Process Discovery Engine v1 Design (Workstream E)
Generic engine for discovering, scoring, and governing existing automated workflows from live evidence. dot-kg is the pilot, not the hardcode: the design and the live views already cover 17 candidates across 14 DOT families + job pipelines + workflows.
6.0 Principle
Owner-accepted lifecycle: Scan → Evidence graph → Orphan check → Score/verify → Label → Birth admission → Governance handoff → RP publish → Continuous drift. Engine proposes, never canonizes (golden rule, Đ39). Live-evidence-wins. No candidate is "verified" without runtime+correlation.
6.1 Stage 1 — Discovery scan
Sweep the sources that hold process evidence, generically:
- Components: dot_tools (Type-1), job_queue.job_kind (Type-2 steps), workflows (Type-3).
- Triggers: dot_tools.trigger_type + cron_schedule; PG triggers (pg_trigger); job_queue lease.
- Runtime: dot_iu_command_run, job_queue rows (picked/finished/attempts), dot_tools.last_executed.
- Events: event_type_registry.
- Content: information_unit (SOP/spec IUs), knowledge_documents.
- Governance: governance_registry, governance_object_ownership, approval_requests.
- Health/logs: kg_quality_log, last_error, docker logs (out-of-band).

Clustering keys (generic, not per-process): dots by split_part(code,'_',2); jobs by split_part(job_kind,'.',1); workflows by process_code.
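The clustering keys above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample codes (DOT_KG_EXTRACT, cut.split) are hypothetical, the lowercasing and the dot:/job:/wf: prefixes are inferred from the candidate names quoted elsewhere in this document, and the helper mimics PostgreSQL's split_part for positive field indexes only.

```python
from collections import defaultdict


def split_part(text, delimiter, n):
    """Mimic PostgreSQL split_part for positive n: 1-based, '' if out of range."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def candidate_code(source_table, row):
    """Derive a cluster key from generic columns only (no per-process branch)."""
    if source_table == "dot_tools":
        return "dot:" + split_part(row["code"], "_", 2).lower()
    if source_table == "job_queue":
        return "job:" + split_part(row["job_kind"], ".", 1).lower()
    if source_table == "workflows":
        return "wf:" + row["process_code"]
    raise ValueError(f"unknown source table: {source_table}")


def cluster(components):
    """Group (source_table, row) pairs by their derived candidate code."""
    groups = defaultdict(list)
    for source_table, row in components:
        groups[candidate_code(source_table, row)].append(row)
    return dict(groups)


# Hypothetical sample rows, not taken from the live tables.
sample = [
    ("dot_tools", {"code": "DOT_KG_EXTRACT"}),
    ("dot_tools", {"code": "DOT_KG_VALIDATE"}),
    ("job_queue", {"job_kind": "cut.split"}),
    ("workflows", {"process_code": "WF-001"}),
]
clusters = cluster(sample)
```

Because the key is derived from naming conventions alone, adding a new DOT family needs no code change, only rows that follow the same convention.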
6.2 Stage 2 — Evidence graph
One node per component with: component_ref, name, kind, candidate_code, sub_domain, trigger_type, paired_with, declared_engine, role_hint (producer/verifier/start/step/end/root/child), has_runtime_evidence, in_process_inventory, source_table. Live: v_process_discovery_evidence_graph (113 rows). Edges: pairing (paired_with), parent (parent_workflow_id), and—future—correlation (shared run id), sequence (job_kind order / cron order).
6.3 Stage 3 — Orphan / uncovered scan
Three orphan classes, generic:
- ORPHAN_PRODUCER_PAIR_COVERED: component absent from inventory while its pair is present (the dot-kg producer case, ×18).
- ORPHAN_UNCOUNTED_DOT / ORPHAN_UNCOUNTED_COMPONENT: component not in inventory at all.

Live: v_process_discovery_orphan_components (84 rows). Also: candidate-missing-components (a producer with no verifier, or vice versa) and component-orphan (a job step never reached). This stage exists because the existing v_axis_process_inventory silently drops on-demand DOTs; the engine's first job is to expose that blind spot.
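A minimal sketch of the orphan classification, assuming each component carries component_ref, kind, and paired_with as in the evidence graph. Splitting ORPHAN_UNCOUNTED_DOT from ORPHAN_UNCOUNTED_COMPONENT by a kind value of "dot" is an assumption made for illustration; the live view may distinguish them differently.

```python
from typing import Optional


def classify_orphan(component: dict, inventory: set) -> Optional[str]:
    """Return an orphan class, or None when the component is in the inventory."""
    if component["component_ref"] in inventory:
        return None
    pair = component.get("paired_with")
    if pair is not None and pair in inventory:
        # The pair is counted but this half is not.
        return "ORPHAN_PRODUCER_PAIR_COVERED"
    if component["kind"] == "dot":  # assumed kind value
        return "ORPHAN_UNCOUNTED_DOT"
    return "ORPHAN_UNCOUNTED_COMPONENT"


# Hypothetical inventory that only knows the verifier half of a pair.
inventory = {"DOT_KG_VALIDATE"}
producer = {"component_ref": "DOT_KG_EXTRACT", "kind": "dot",
            "paired_with": "DOT_KG_VALIDATE"}
print(classify_orphan(producer, inventory))
```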
6.4 Stage 4 — Quality scoring / verification
Per candidate, five independent evidence axes (weights):
| axis | signal | weight |
|---|---|---|
| start | has producer/start/root component | 20 |
| end | has verifier/end component | 20 |
| cross-component correlation | ≥2 components share a run id (e.g. job_queue.run_id) | 25 |
| runtime evidence | any component has executed | 25 |
| pairing/structure | any paired component | 10 |

confidence_score = sum of the weights of the satisfied axes (0–100). Live: v_process_discovery_quality_score. Correlation and runtime are weighted highest because they are exactly what distinguishes a real running process from a declared one, which is the gap the whole DOT layer currently sits in (0 executions). False-grouping risk is mitigated by clustering on stable naming keys and reporting members_orphaned so over/under-grouping is visible.
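The weighted sum can be illustrated as follows. The weights come from the table above; the short axis keys (start, end, correlation, runtime, pairing) are names chosen for this sketch, not column names from the live view.

```python
# Weights copied from the axis table; they total 100.
WEIGHTS = {
    "start": 20,
    "end": 20,
    "correlation": 25,
    "runtime": 25,
    "pairing": 10,
}


def confidence_score(axes: dict) -> int:
    """Sum the weights of every axis whose signal is present."""
    return sum(weight for axis, weight in WEIGHTS.items() if axes.get(axis, False))


# A structurally complete candidate that has never executed.
structural_only = {"start": True, "end": True, "pairing": True}
# A candidate with every signal present.
fully_evidenced = {axis: True for axis in WEIGHTS}

print(confidence_score(structural_only))   # 50
print(confidence_score(fully_evidenced))   # 100
```

A candidate with start, end, and pairing but no runtime or correlation tops out at 50, which is why structure alone cannot reach the verified band.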
6.5 Stage 5 — Label / classification
- Type: TYPE_1_DOT_CONTAINED · TYPE_2_AUTOMATED_MULTI · TYPE_3_HUMAN_IN_LOOP (from source_table + structure).
- Readiness band: not_a_process (≤1 member) · weak_candidate · strong_candidate_structural (start+end, no runtime) · verified_candidate (start+end+runtime+correlation). Live in quality_score. No candidate may skip to verified without runtime+correlation; this is enforced by the band logic, not by reviewer opinion.
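The band logic could look roughly like the sketch below. One case is not specified above: a candidate with start, end, and runtime but no correlation. This sketch assumes it stays at strong_candidate_structural; the live view may treat it differently.

```python
def readiness_band(member_count: int, has_start: bool, has_end: bool,
                   has_runtime: bool, has_correlation: bool) -> str:
    """Map evidence flags to a readiness band; verified needs all four signals."""
    if member_count <= 1:
        return "not_a_process"
    if has_start and has_end:
        if has_runtime and has_correlation:
            return "verified_candidate"
        # Assumption: missing runtime OR missing correlation keeps it structural.
        return "strong_candidate_structural"
    return "weak_candidate"


print(readiness_band(4, True, True, False, False))
```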
6.6 Stage 6 — Birth admission (proposal only)
Only verified_candidate → BIRTH_READY_PENDING_OWNER; everything else → a specific BLOCKED_* status with the missing evidence named. Live: v_process_discovery_birth_ready_queue. Birth itself (insert process definition / components / relations, reusing the MOW substrate per the AX-PROCESS pilot) is owner-gated and out of engine scope — the engine produces the queue, a human admits. No blind birth; births are unretirable, so the gate is advisory-strict.
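As a rough illustration of the gate, the function below maps a readiness band to a queue status. BIRTH_READY_PENDING_OWNER is the status named above; the BLOCKED_MISSING_* and BLOCKED_UNCLASSIFIED names are hypothetical stand-ins for the specific BLOCKED_* statuses, which this section does not enumerate.

```python
def birth_gate_status(band: str, axes: dict) -> str:
    """Propose a queue status; only a verified candidate is ever birth-ready."""
    if band == "verified_candidate":
        return "BIRTH_READY_PENDING_OWNER"
    # Name the first missing piece of evidence (hypothetical status names).
    for axis in ("start", "end", "runtime", "correlation"):
        if not axes.get(axis, False):
            return f"BLOCKED_MISSING_{axis.upper()}"
    return "BLOCKED_UNCLASSIFIED"


status = birth_gate_status(
    "strong_candidate_structural",
    {"start": True, "end": True, "runtime": False, "correlation": False},
)
print(status)
```

The function only returns a proposal string. It performs no insert, which matches the rule that the engine produces the queue and a human admits.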
6.7 Stage 7 — Governance handoff
For each birth-ready candidate, emit the governance packet fields: owner (governance_registry / governance_object_ownership), lifecycle policy (Đ34 draft→active→deprecated→retired), health policy (which verifier/HEALTH DOT, what SLA), escalation, change policy (workflow_change_requests / approval_requests with action='review' — never 'add', which auto-approves + births). Today every candidate is OWNER_MISSING (ownership table empty system-wide).
6.8 Stage 8 — Registries-Pivot publish
Separate candidate vs official surfaces. Candidates render from the discovery views under AX-PROCESS as report-only (PIV-3xx style), with warning flags (members_orphaned, birth_gate_status, no_runtime_evidence) visible. Official processes only after birth. Nuxt renders the contract; PG computes all counts/scores (no Nuxt math). See doc 09.
6.9 Stage 9 — Continuous drift update
Live: v_process_discovery_drift_signals computes per-candidate component_hash, trigger_hash, route_hash. Drift detection = compare current hashes to a stored baseline:
- new/removed component → component_hash changes;
- trigger reconfigured → trigger_hash changes;
- role/sequence reorder → route_hash changes.

On change, raise a system_issue / change_request (proposal). v1 computes the hashes; v2 adds a process_discovery_baseline table (the one new table the engine needs) to persist last-seen hashes and diff automatically.
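The hash-and-compare step can be sketched as below. The hash inputs and the use of SHA-256 are assumptions; the actual definition lives in v_process_discovery_drift_signals. In this sketch route_hash covers role changes only, since the sample rows carry no sequence column.

```python
import hashlib


def stable_hash(values) -> str:
    """Order-independent digest: sort first so row order never causes drift."""
    joined = "\n".join(sorted(values))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def candidate_hashes(components) -> dict:
    """Compute the three per-candidate hashes from evidence-graph style rows."""
    return {
        "component_hash": stable_hash(c["component_ref"] for c in components),
        "trigger_hash": stable_hash(
            f'{c["component_ref"]}={c["trigger_type"]}' for c in components),
        "route_hash": stable_hash(
            f'{c["component_ref"]}={c["role_hint"]}' for c in components),
    }


def drift_signals(current: dict, baseline: dict) -> list:
    """Names of the hashes that differ from the stored baseline."""
    return [name for name in current if current[name] != baseline.get(name)]


# Hypothetical candidate with a producer/verifier pair.
members = [
    {"component_ref": "a", "trigger_type": "cron", "role_hint": "producer"},
    {"component_ref": "b", "trigger_type": "on-demand", "role_hint": "verifier"},
]
baseline = candidate_hashes(members)

# Reconfigure one trigger and diff against the baseline.
reconfigured = [dict(members[0], trigger_type="on-demand"), members[1]]
print(drift_signals(candidate_hashes(reconfigured), baseline))
```

In v2 the baseline dictionary would be read from process_discovery_baseline instead of being computed in the same run.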
6.10 Required tables / views / functions
- Live now (read-only): the 6 v_process_discovery_* views (doc 07).
- Needed for full lifecycle (future, owner-gated):
  - process_discovery_baseline (persist hashes for drift): the only genuinely new table.
  - Reuse AX-PROCESS substrate for births (no island table): axis_registry, fn_process_node_substrate, v_axis_process_pivots.
  - A correlation column (see §6.11).
  - event_type_registry rows for KG (declare the producer events).
6.11 Correlation_id / process_run_id (the critical addition)
Today: dot_iu_command_run.run_id is per-command; job_queue.run_id correctly groups a pipeline run (why job:cut scores verified). Gap: DOT-contained processes have no cross-DOT correlation — you cannot stitch EXTRACT→VALIDATE into one process run. Proposal (design only, see next macro): add a nullable process_run_id (uuid) to dot_iu_command_run (and a correlation_id convention across emitted events), populated when a process orchestrates multiple DOTs. This is the single highest-leverage change to move TYPE_1 families from strong_candidate_structural to verified_candidate.
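To show what the proposed column would enable, the sketch below stitches command runs into process runs by process_run_id. The row shape is hypothetical: dot_code is an assumed column name, and the sample ids are made up. Rows with a null process_run_id are left unstitched, as they would be today.

```python
from collections import defaultdict


def stitch_runs(command_runs) -> dict:
    """Group DOT command runs by the proposed nullable process_run_id."""
    by_process = defaultdict(list)
    for run in command_runs:
        process_run_id = run.get("process_run_id")
        if process_run_id is not None:
            by_process[process_run_id].append(run["dot_code"])
    return dict(by_process)


def has_correlation(command_runs) -> bool:
    """True when at least two distinct DOTs share one process run."""
    return any(len(set(dots)) >= 2 for dots in stitch_runs(command_runs).values())


# Hypothetical rows: two DOTs orchestrated together, one standalone command.
runs = [
    {"dot_code": "DOT_KG_EXTRACT", "process_run_id": "p-1"},
    {"dot_code": "DOT_KG_VALIDATE", "process_run_id": "p-1"},
    {"dot_code": "DOT_KG_EXTRACT", "process_run_id": None},
]
print(stitch_runs(runs))
print(has_correlation(runs))
```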
6.12 Hermes / Agent double-check fit
The KG family already embeds double-check structurally: every producer (Cấp B) is paired with a verifier (Cấp A); "Cấp A IDLE = producer correct." Map this onto the engine: the verifier DOT is the in-process Hermes check; the discovery engine is the meta-Agent double-check that audits whether both halves are present, correlated, and running. Adversarial verification at the engine level = the orphan scan (does every covered verifier have its producer counted?) + the correlation check (do paired runs actually link?).
6.13 What can be automated vs must stay review
- Automate: stages 1–5 + 8–9 (scan, evidence graph, orphan scan, scoring, labelling, publish, drift) — all read-only, all live or apply-ready.
- Review/owner-gated: stage 6 birth admission, stage 7 governance handoff (ownership, lifecycle, change policy), and any write to event_type_registry / correlation columns. The engine never crosses from proposing to enacting.
6.14 No hardcoding
Every rule keys off generic columns (trigger_type, paired_dot, run_id, job_kind, process_code) and naming-convention splits, not literal kg/dot-kg strings. Proven live: the same views score dot:nrm, dot:doc, dot:kb, job:cut, wf:WF-001, etc., with no KG-specific branch.