KB-1C9F

Ref Grammar v0.1 — WF Procedure Index join key

Revision 1
Tags: workflow-manage, procedure-index, ref-grammar, design, join-key, v0.1 · 2026-06-23


Path: knowledge/dev/laws-new/workflow-manage/design/ref-grammar-v0.1.md
Status: DESIGN · v0.1 · Non-authorizing · no PG object created
Date: 2026-06-23

Why this is the most important design artifact: Readiness View joins Ingredient Map.ingredient_ref against Inventory View.object_ref. If the two sides spell a reference differently, the join finds no match and PG reports a false missing. Ref grammar is the single normalization contract both sides MUST speak. It is a naming convention so PG can read PG, not a new registry.


1. Canonical shape

<kind>:<identifier>
  • <kind> — lowercase token from the closed kind list (§3). Always lowercased on write and on read.
  • : — single ASCII colon separator (first colon splits kind from identifier).
  • <identifier> — kind-specific sub-grammar. May itself contain dots (.) for schema-qualified kinds.

Stored once, normalized, in wf_procedure_ingredient.ingredient_ref and emitted identically by wf_inventory_current.object_ref.
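The split rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's resolver: the names `KINDS`, `Ref` and `split_ref` are hypothetical, and the kind set is copied from the CATALOG and CODE kind lists in this document.

```python
from typing import NamedTuple

# Closed kind list, copied from the CATALOG and CODE classes in this document.
KINDS = frozenset({
    "collection", "view", "field", "trigger", "function",
    "dot", "approval", "label", "procedure", "event",
    "io", "checker", "template",
})


class Ref(NamedTuple):
    kind: str
    identifier: str


def split_ref(raw: str) -> Ref:
    """Split on the FIRST colon only; lowercase the kind, keep the identifier as-is."""
    kind, sep, identifier = raw.partition(":")
    kind = kind.lower()
    if not sep or not identifier or kind not in KINDS:
        raise ValueError(f"not a <kind>:<identifier> ref: {raw!r}")
    return Ref(kind, identifier)
```

Because only the first colon is a separator, an identifier may itself contain colons or dots; how the identifier is then normalized depends on the kind.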


2. Two normalization classes (this is the crux)

PostgreSQL folds unquoted identifiers to lowercase, but governed CODES are case-bearing. So refs split into two classes with different rules:

Class CATALOG — identifier is a PG object path

Kinds: collection, view, field, trigger, function.

  • Lowercase the whole identifier (PG already folds unquoted names).
  • Schema required-or-defaulted: if omitted, default public. Store the resolved, schema-qualified form (collection:public.dot_tools, never collection:dot_tools).
  • Existence is proven against information_schema / pg_catalog — always real-time fresh.

Class CODE — identifier is a governed code

Kinds: dot, approval, label, procedure, event, io, checker, template.

  • Case-SENSITIVE: store exactly as the SSOT stores it. DOT codes are upper-case (DOT_KG_EXPLAIN, DOT-063), action codes are lower_snake (patch_ops_code), and taxonomy codes are mixed-case. A case mismatch is a real miss, by design (do not auto-fold).
  • No schema. Optional facet/namespace qualifier where the code is not globally unique (label:<facet>.<code>, event:<domain>.<type>).
  • Existence is proven against the registry table for that kind.

Rule: if a writer cannot confidently produce a normalized ref, store the raw string and set ref_status='UNNORMALIZED'. Readiness then yields INVALID_REF — never a silent join.
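A writer-side normalizer following the two classes might look like the sketch below. It is an illustration under assumptions: the function name `normalize`, the part counts per CATALOG kind (read off the canonical patterns in §3), and the status value 'NORMALIZED' are hypothetical; only 'UNNORMALIZED' is named by this document.

```python
from typing import Tuple

# Number of dot-separated parts in the schema-qualified form of each CATALOG kind.
CATALOG_PARTS = {"collection": 2, "view": 2, "field": 3, "trigger": 3, "function": 2}
CODE_KINDS = {"dot", "approval", "label", "procedure", "event",
              "io", "checker", "template"}


def normalize(raw: str) -> Tuple[str, str]:
    """Return (ref, ref_status). 'NORMALIZED' is an assumed status name."""
    kind, sep, ident = raw.partition(":")
    kind = kind.lower()
    if not sep or not ident:
        return raw, "UNNORMALIZED"
    if kind in CATALOG_PARTS:
        parts = ident.lower().split(".")        # CATALOG: fold to lowercase
        want = CATALOG_PARTS[kind]
        if len(parts) == want - 1:              # schema omitted: default public
            parts.insert(0, "public")
        if len(parts) != want or not all(parts):
            return raw, "UNNORMALIZED"          # keep the raw string, flag it
        return f"{kind}:{'.'.join(parts)}", "NORMALIZED"
    if kind in CODE_KINDS:
        return f"{kind}:{ident}", "NORMALIZED"  # CODE: case preserved, never folded
    return raw, "UNNORMALIZED"
```

Note that a two-part field ref is read here as table.column with the schema defaulted, so `field:dot_tools.code` becomes `field:public.dot_tools.code`, while `field:dot_tools` is flagged rather than guessed at.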


3. Per-kind grammar table (v0.1)

| kind | class | canonical pattern | PG existence probe (source confirmed live) | case | schema/ns | fallback if not found | valid examples | invalid examples |
| dot | CODE | dot:<CODE> | EXISTS(SELECT 1 FROM dot_tools WHERE code=$1) | sensitive | none | MISSING | dot:DOT_KG_EXPLAIN, dot:DOT-063 | dot:dot_kg_explain (wrong case→miss), DOT_KG_EXPLAIN (no prefix) |
| collection | CATALOG | collection:<schema>.<table> | information_schema.tables OR directus_collections | lower | default public | MISSING | collection:public.dot_tools, collection:dot_tools → public.dot_tools | collection:DotTools (will fold; OK but discouraged), table:... (wrong kind) |
| view | CATALOG | view:<schema>.<view> | information_schema.views (+ matview via pg_class relkind m) | lower | default public | MISSING | view:public.v_qt001_apply_readiness_dashboard_v8 | view:dot_tools (it's a table) |
| field | CATALOG | field:<schema>.<table>.<column> | information_schema.columns | lower | default public | MISSING | field:public.dot_tools.code | field:dot_tools (missing column part) |
| trigger | CATALOG | trigger:<schema>.<table>.<trigger_name> | information_schema.triggers / pg_trigger | lower | default public | MISSING | trigger:public.birth_registry.trg_birth_gate | trigger:trg_birth_gate (no table) |
| function | CATALOG | function:<schema>.<name> | pg_proc join pg_namespace (name only) | lower | default public | MISSING | function:public.fn_birth_gate | |
| approval | CODE | approval:<action_code> | EXISTS(SELECT 1 FROM apr_action_types WHERE action_code=$1) | sensitive | none | MISSING | approval:patch_ops_code, approval:authorize_build_step | approval:PatchOpsCode |
| event | CODE | event:<event_domain>.<event_type> | event_type_registry(event_domain,event_type) | sensitive | domain ns | MISSING | event:<domain>.<type> | event:sometype (no domain) |
| label | CODE | label:<code> or label:<facet>.<code> | taxonomy.code (facet-scoped) | sensitive | optional facet | AMBIGUOUS→metadata, else MISSING | label:<code> | |
| procedure | CODE | procedure:<code> | wf_procedure.procedure_code first, then workflows.process_code | sensitive | none | MISSING | procedure:PROC_CREATE_NEW_DOT, procedure:WF-001 | |
| io | CODE | io:<code> | dot_agent_api_contract.dot_code only (narrow) | sensitive | none | UNKNOWN_SOURCE (no general IO model) | io:DOT_KG_EXPLAIN | |
| checker | CODE | checker:<code> | optional probe universal_rule_registry/checkpoint_types | sensitive | none | UNKNOWN_SOURCE (no SSOT) | checker:<code> | |
| template | CODE | template:<code> | unconfirmed source | sensitive | none | UNKNOWN_SOURCE (NEEDS_OWNER_DECISION) | template:dot_spec_template | |

Kinds whose source is unknown in v0.1 (io, checker, template, plus report) always resolve to UNKNOWN_SOURCE, never MISSING. That distinction is load-bearing: MISSING means "we know where to look and it isn't there → go create it"; UNKNOWN_SOURCE means "we don't yet know how to check → triage the source, do not fabricate."
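The MISSING versus UNKNOWN_SOURCE split can be made concrete by separating "is there a probe for this kind?" from "did the probe find a row?". The sketch below covers only four kinds whose probe is a single lookup. The SQL strings paraphrase the probes named in §3 using standard information_schema columns; `plan_probe`, `status` and the `run` callable (something that executes the SQL and returns a boolean) are hypothetical names.

```python
from typing import Callable, Optional, Tuple

# Probe SQL for a subset of kinds; other kinds are left out of this sketch.
PROBE_SQL = {
    "dot": "SELECT EXISTS (SELECT 1 FROM dot_tools WHERE code = $1)",
    "approval": "SELECT EXISTS (SELECT 1 FROM apr_action_types WHERE action_code = $1)",
    "field": ("SELECT EXISTS (SELECT 1 FROM information_schema.columns "
              "WHERE table_schema = $1 AND table_name = $2 AND column_name = $3)"),
    "trigger": ("SELECT EXISTS (SELECT 1 FROM information_schema.triggers "
                "WHERE event_object_schema = $1 AND event_object_table = $2 "
                "AND trigger_name = $3)"),
}
DOTTED_KINDS = {"field", "trigger"}            # identifier splits into 3 params
UNKNOWN_SOURCE_KINDS = {"io", "checker", "template", "report"}

Plan = Tuple[str, Tuple[str, ...]]


def plan_probe(kind: str, identifier: str) -> Optional[Plan]:
    """Return (sql, params) for a normalized ref, or None when no source is known."""
    if kind in UNKNOWN_SOURCE_KINDS:
        return None
    sql = PROBE_SQL[kind]  # KeyError for kinds this sketch does not cover
    params = tuple(identifier.split(".")) if kind in DOTTED_KINDS else (identifier,)
    return sql, params


def status(kind: str, identifier: str,
           run: Callable[[str, Tuple[str, ...]], bool]) -> str:
    plan = plan_probe(kind, identifier)
    if plan is None:
        return "UNKNOWN_SOURCE"   # we do not know how to check: triage the source
    sql, params = plan
    return "EXISTS" if run(sql, params) else "MISSING"
```

With this shape, an unknown-source kind can never come back as MISSING, because no probe is ever run for it.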


4. collection vs view vs physical table

  • collection: covers logical Directus collection AND physical PG table — probe both; record in metadata_jsonb which surface matched ({"logical":true,"physical":true}). A Directus collection without a physical table, or vice-versa, is still EXISTS but flagged in metadata.
  • view: is reserved for relkind in ('v','m'). Asking collection: for a view (or view: for a table) is a soft mismatch → resolve to the surface that exists, set notes='kind_surface_mismatch'.
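One possible reading of these two rules, written as a pure function over three already-probed facts. Everything here is an assumption for illustration: the name `resolve_surface`, the three boolean inputs, and the interpretation that a soft mismatch still yields EXISTS with the note set.

```python
from typing import Dict, Optional, Tuple

Result = Tuple[str, Dict[str, bool], Optional[str]]  # (status, metadata, notes)


def resolve_surface(kind: str, logical: bool, physical: bool, view: bool) -> Result:
    """logical: row in directus_collections; physical: PG table; view: relkind v or m."""
    surfaces = {"logical": logical, "physical": physical}
    if kind == "collection":
        if logical or physical:
            return "EXISTS", surfaces, None          # either surface counts
        if view:
            return "EXISTS", surfaces, "kind_surface_mismatch"
        return "MISSING", surfaces, None
    if kind == "view":
        if view:
            return "EXISTS", {}, None
        if logical or physical:
            return "EXISTS", surfaces, "kind_surface_mismatch"
        return "MISSING", {}, None
    raise ValueError(f"kind {kind!r} is not a collection or view")
```

For example, `view:dot_tools` where dot_tools is a physical table would resolve to EXISTS with notes='kind_surface_mismatch' rather than a false MISSING.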

5. Known weak spots (must be designed around, not ignored)

  1. function: overloading. 618 functions exist and names can be overloaded. In v0.1, existence means at least one proc of that name in the schema; do NOT resolve the argument signature. Set metadata_jsonb.overloaded=true when count>1. A precise signature grammar (function:schema.name(argtypes)) is deferred to v0.2.
  2. label: facet ambiguity. taxonomy.code is scoped by facet_id; a bare code may match multiple facets. Prefer label:<facet>.<code>; if bare and multi-match → status AMBIGUOUS surfaced in metadata, treated as EXISTS-with-warning (not MISSING).
  3. procedure: dual namespace. Self-codes (PROC_*, future) vs legacy business workflows (WF-*). Probe order: wf_procedure then workflows. Never merge the two code spaces.
  4. io/checker/template/report have no clean PG SSOT today — they are deliberately UNKNOWN_SOURCE. Do not invent a source to make readiness look green.
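The first two weak spots reduce to simple rules over match counts, sketched below. The function names and the metadata keys `warning` and `facets` are hypothetical; only `overloaded` is named in this document.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, Dict[str, object]]  # (status, metadata)


def function_status(proc_count: int) -> Result:
    """proc_count: number of procs with that name in the schema (signature ignored)."""
    if proc_count == 0:
        return "MISSING", {}
    return "EXISTS", ({"overloaded": True} if proc_count > 1 else {})


def label_status(matched_facets: List[str], facet_given: bool) -> Result:
    """matched_facets: facets in which the code was found by the taxonomy probe."""
    if not matched_facets:
        return "MISSING", {}
    if not facet_given and len(matched_facets) > 1:
        # Bare code matching several facets: EXISTS-with-warning, never MISSING.
        return "EXISTS", {"warning": "AMBIGUOUS", "facets": sorted(matched_facets)}
    return "EXISTS", {}
```

Both functions only annotate an existing match; neither one turns an ambiguous or overloaded hit into a miss.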

6. Reference resolver contract (for Inventory/Readiness)

A single SQL helper (function or inline CASE) maps (kind, identifier) → exists_bool using ONLY the probes in §3. Properties:

  • Deterministic & side-effect-free (pure SELECT).
  • Bounded: probes one ref at a time; the hot path resolves only the handful of refs belonging to ONE queried procedure (see performance model).
  • Honest tri-state: returns EXISTS / MISSING / UNKNOWN_SOURCE (plus INVALID_REF when ref_status='UNNORMALIZED', READ_BLOCKED when the source schema is read-denied).
  • Grammar-locked: both Ingredient Map and Inventory View import this same resolver/grammar; neither side hand-rolls a different parse.
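The status precedence implied by these properties can be sketched as follows. The contract itself calls for a SQL helper; this Python version only illustrates the ordering of the outcomes. The `probes` mapping and the use of `PermissionError` as a stand-in for a read-denied source are assumptions.

```python
from typing import Callable, Dict

Probe = Callable[[str], bool]  # identifier -> found?


def resolve(kind: str, identifier: str, ref_status: str,
            probes: Dict[str, Probe]) -> str:
    """Map one ref to a status; probes one ref at a time and writes nothing."""
    if ref_status == "UNNORMALIZED":
        return "INVALID_REF"          # never attempt a join on a raw string
    probe = probes.get(kind)
    if probe is None:
        return "UNKNOWN_SOURCE"       # no confirmed source for this kind
    try:
        found = probe(identifier)
    except PermissionError:           # stand-in for a read-denied schema
        return "READ_BLOCKED"
    return "EXISTS" if found else "MISSING"
```

A caller would build `probes` once from the §3 table and pass the same mapping to both the Ingredient Map and the Inventory View paths, so that neither side parses or probes differently.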

7. v0.1 scope decision

  • SAFE to enforce now: dot, collection, view, field, trigger, function, approval, event, procedure.
  • Carry as UNKNOWN_SOURCE: io, checker, template, report.
  • label: enforce with the facet caveat (PARTIAL).
  • This grammar is frozen for v0.1 prototyping; function signatures, a real io/checker SSOT, and a confirmed template source are the first v0.2 follow-ups.