KB-3AE4

PIDX Build Design v0.1 — PG-native Procedure Index (construction-ready)

21 min read Revision 1
workflow-manageprocedure-indexpidxbuild-designdesign2T2Vinventory-viewreadiness-viewref-grammarperformancerag-boundaryv0.12026-06-23codex-review-ready

PIDX Build Design v0.1 — PG-native Procedure Index

Path: knowledge/dev/laws-new/workflow-manage/design/pidx-build-design-v0.1.md Status: BUILD DESIGN · v0.1 · DESIGN ONLY · NON-AUTHORIZING · 0 DDL executed · 0 PG objects created · 0 PG/Directus/DOT mutation Date: 2026-06-23 Macro: T2 — Hardening + Build Design for PG-native Procedure Index v0.1 Inputs hardened: de-bai-quan-ly-quy-trinh.md (rev21/v0.8), reports/wf-procedure-index-readonly-survey-2026-06-23.md, design/ref-grammar-v0.1.md, design/minimal-technical-design-v0.1.md, design/seed-procedure-candidates-v0.1.md, reports/wf-procedure-index-go-no-go-v0.1.md. Companions (this package): pidx-ddl-candidate-v0.1.sql.md, pidx-readiness-logic-v0.1.md, pidx-seed-slice-v0.1.md, pidx-test-plan-v0.1.md, pidx-codex-review-packet-v0.1.md, reports/pidx-build-design-go-no-go-v0.1.md. Principle locked: PG reads PG. PG is the truth. The Procedure Index is the eye, not the hand. The Ingredient Map is lazy declaration/cache only. Readiness is computed from PG/SQL facts. Vector/RAG/fuzzy may suggest candidates only; SQL confirms truth.


0. What this macro is and is NOT

This is a build-design hardening of the accepted GO-CONDITIONAL result. It converts the T3 survey + minimal design into a construction-ready package Codex can review before Owner approves any build.

  • NOT a build macro. No DDL/DML executed. No tables/views/functions/triggers created. No migration, no Directus collection, no DOT registration, no scheduler/bus, no PG/Directus/DOT/governance/birth/KG/production mutation.
  • All SQL in this package is candidate text only. The only writes performed by this macro are KB design/report documents (this file and its companions).

1. Task 1 — Final object naming & scope (decision)

1.1 Final names (frozen for v0.1)

Role Final v0.1 name Kind T3 working name (superseded)
Procedure memory pidx_procedure table wf_procedure
Ingredient declaration (lazy cache) pidx_procedure_ingredient table wf_procedure_ingredient
Inventory "pane of glass" over PG v_pidx_inventory_current view wf_inventory_current
Computed readiness rollup v_pidx_procedure_readiness view wf_procedure_readiness
Procedure↔procedure links pidx_procedure_link table DEFERRED — not in v0.1

Decision: use the pidx_* prefix, not wf_*. Justification (this resolves the GO/NO-GO §4 open risk):

  • The live DB already has a populated wf_* family that is a different thingwf_census, wf_orphan*, wf_process_candidate (19), wf_fs_dot_bin_snapshot (289), wf_host_crontab_snapshot (54), workflow* (BPMN). Naming the Procedure Index wf_procedure* invites conceptual collision with that scanner/snapshot family even though the exact names are free.
  • pidx_ = Procedure Index. It reads as "an index/con mắt," not "a workflow engine," which is exactly the boundary the problem statement enforces (eye, not hand).
  • Views keep the substrate's own idiom: a v_ prefix (the DB already carries 49 v_*readiness* views + 685 views total). So v_pidx_inventory_current / v_pidx_procedure_readiness.
  • No strong justification exists to keep wf_*, therefore the default pidx_* stands. (The macro requires wf_* be kept only with explicit justification; none is warranted.)

1.2 Scope clarifications (locked)

  • pidx = Procedure Index — an index / con mắt. It is not a workflow engine, governance layer, registry mesh, KG, or DOT runtime.
  • workflow-manage is the folder/domain name only. It does not imply a workflow engine.
  • The existing workflows (2 rows, BPMN, process_code = WF-001/WF-002), workflow_steps (70), workflow_categories (3), wf_process_candidate (19) are source inventory, indexed as the object kind procedure:WF-*. They are not replaced, migrated, or extended by pidx_*.

1.3 Build scope: 2T2V (corrects the T3 "3T2V" label)

The v0.1 prototype is 2 tables + 2 views — "2T2V", not "3T2V".

This corrects the GO/NO-GO verdict-code string ..._GO_CONDITIONAL_OWNER_APPROVES_3T2V, which mislabeled the Option B-minus shape. B-minus is:

2 tables:   pidx_procedure, pidx_procedure_ingredient
2 views:    v_pidx_inventory_current, v_pidx_procedure_readiness

pidx_procedure_link is deferred out of v0.1 (procedure links live, when needed, in a links_jsonb note column on pidx_procedure; promote to a real pidx_procedure_link table only when M2M links are queried enough to need indexing). The "3 tables + a few views" line in the problem statement (§9) is an upper ceiling, not a build target — §9 explicitly says "if fewer than 3 tables works, choose fewer." 2T2V honors that.


2. Architecture at a glance

                 ┌──────────────────────────────────────────────────────────┐
   DECLARED      │  pidx_procedure          one row = one known procedure     │  (lazy human/Agent
   (hints,       │  pidx_procedure_ingredient  one row = one declared need    │   cache; NEVER truth)
    cache)       └───────────────┬──────────────────────────────────────────┘
                                 │  ingredient_ref  (ref grammar v0.1 = the join key)
                                 ▼
   COMPUTED      ┌──────────────────────────────────────────────────────────┐
   (PG soi PG,   │  v_pidx_procedure_readiness   per-ingredient + per-proc    │
    SQL truth)   │     ⇗ resolves each ref via narrow EXISTS probe on:        │
                 │  v_pidx_inventory_current     UNION over EXISTING sources  │
                 └───────────────┬──────────────────────────────────────────┘
                                 │  object_ref  (same ref grammar)
                                 ▼
   GROUND TRUTH  dot_tools · information_schema/pg_catalog · apr_action_types ·
                 event_type_registry · taxonomy · workflows · directus_collections …
  • Declared layer (pidx_procedure, pidx_procedure_ingredient): what a procedure says it is and needs. Stores only declared fields. Never stores computed existence/readiness.
  • Computed layer (v_pidx_* views): what PG can prove. v_pidx_inventory_current normalizes existing PG sources into a common contract; v_pidx_procedure_readiness joins declared ingredient refs to live existence and rolls up readiness.
  • The join key that fuses the two layers is the ref grammar (ingredient_refobject_ref). It is the single most load-bearing artifact; frozen in §3 here and in pidx-readiness-logic-v0.1.md.

3. Task 2 — Ref Grammar v0.1 freeze (build-facing summary)

Ref grammar v0.1 is FROZEN for the build. Canonical shape:

<kind>:<identifier>

3.1 Two normalization classes

  • Class CATALOG — identifier is a PG object path. Kinds: collection, view, field, trigger, function.
    • Lowercased whole (PG folds unquoted identifiers).
    • Schema required-or-defaulted to public; store the resolved schema-qualified form (collection:public.dot_tools, never collection:dot_tools).
    • Existence proven against information_schema/pg_catalog → always real-time fresh.
  • Class CODE — identifier is a governed code. Kinds: dot, approval, label, procedure, event, io, checker, template, report.
    • Case-SENSITIVE — stored exactly as the SSOT stores it (DOT_KG_EXPLAIN/DOT-063 upper; patch_ops_code lower_snake; taxonomy mixed). A case mismatch is a real miss by design — do NOT auto-fold.
    • No schema. Optional facet/namespace qualifier where the code is not globally unique (label:<facet>.<code>, event:<domain>.<type>).
    • Existence proven against the registry table for that kind.

3.2 The four resolution states (load-bearing distinction)

State Meaning
MISSING Source is known, object not found → "go create it."
UNKNOWN_SOURCE No clean PG source/probe exists for the kind → "triage the source; do NOT fabricate."
INVALID_REF Ref cannot be parsed / not normalized (ref_status='UNNORMALIZED' or structurally malformed).
READ_BLOCKED Source exists but the schema is read-denied (cutter_governance.*, sandbox_tac.*).

No silent fuzzy matching is allowed in readiness. If a ref cannot be confidently normalized, store it raw with ref_status='UNNORMALIZED' → Readiness yields INVALID_REF, never a guessed join.

3.3 Per-kind freeze table (v0.1)

kind class canonical pattern PG probe (live source) case schema/ns not-found fallback v0.1 verdict
dot CODE dot:<CODE> dot_tools.code (309) sensitive none MISSING ENFORCE
collection CATALOG collection:<schema>.<table> information_schema.tablesdirectus_collections lower default public MISSING ENFORCE
view CATALOG view:<schema>.<view> information_schema.views + pg_class relkind in ('v','m') lower default public MISSING ENFORCE
field CATALOG field:<schema>.<table>.<column> information_schema.columns lower default public MISSING ENFORCE
trigger CATALOG trigger:<schema>.<table>.<name> information_schema.triggers (410) lower default public MISSING ENFORCE
function CATALOG function:<schema>.<name> pg_procpg_namespace (name-only) (618) lower default public MISSING ENFORCE (overload caveat)
approval CODE approval:<action_code> apr_action_types.action_code (14) sensitive none MISSING ENFORCE
event CODE event:<domain>.<type> event_type_registry(event_domain,event_type) (40) sensitive domain ns MISSING ENFORCE
procedure CODE procedure:<code> pidx_procedure.procedure_code then workflows.process_code sensitive none MISSING ENFORCE
label CODE label:<code> or label:<facet>.<code> taxonomy.code (facet-scoped) (55) sensitive optional facet AMBIGUOUS→meta, else MISSING PARTIAL (facet caveat)
io CODE io:<code> only dot_agent_api_contract.dot_code (narrow, 2) sensitive none UNKNOWN_SOURCE UNKNOWN_SOURCE
checker CODE checker:<code> scattered, no SSOT sensitive none UNKNOWN_SOURCE UNKNOWN_SOURCE
template CODE template:<code> unconfirmed (NEEDS_OWNER_DECISION) sensitive none UNKNOWN_SOURCE UNKNOWN_SOURCE
report CODE report:<code> KB/agent-data (external to PG) sensitive none UNKNOWN_SOURCE UNKNOWN_SOURCE

Frozen v0.1 split: ENFORCE = dot, collection, view, field, trigger, function, approval, event, procedure; PARTIAL = label; UNKNOWN_SOURCE (always, never MISSING) = io, checker, template, report. Full per-kind normalization/probe detail lives in pidx-readiness-logic-v0.1.md §2.


4. Task 4 — Inventory View contract (v_pidx_inventory_current)

A VIEW = a UNION ALL of per-kind sub-selects over existing sources. It never copies the object universe; each branch reads its own catalog/registry and emits a normalized row. (Full candidate SQL in pidx-ddl-candidate-v0.1.sql.md §3.)

4.1 Common output contract (columns)

object_kind      text      -- ref-grammar kind
object_ref       text      -- canonical normalized ref (the join key; grammar §3)
object_code      text      -- raw code/name as stored in source
object_name      text      -- human label if available
source_schema    text      -- e.g. public
source_object    text      -- source table/catalog (dot_tools, information_schema.tables, …)
source_pk        text      -- source row identity (code / schema.table / oid)
exists_bool      boolean   -- a materialized source row = it exists (always true within the view)
authority_level  text      -- authoritative_catalog | authoritative_registry | registry_cache | declared | unknown
metadata_jsonb   jsonb     -- kind-specific extras (risk_level, handler_ref, overloaded, logical/physical, facet…)
freshness_status text      -- fresh | stale_but_acceptable | refresh_required | unknown_source
last_seen_at     timestamptz
notes            text
warning_flags    text[]    -- e.g. {OVERLOADED_FUNCTION}, {APPROVAL_HANDLER_UNIMPLEMENTED}, {AMBIGUOUS_LABEL}, {LOGICAL_PHYSICAL_MISMATCH}

4.2 v0.1 UNION branches (9 SAFE + 1 PARTIAL), mapped to T3 source map

# object_kind source(s) authority_level last_seen_at freshness warning flags emitted
1 dot dot_tools (309) authoritative_registry date_updated fresh
2 collection information_schema.tables (346) LEFT JOIN directus_collections authoritative_catalog now() fresh LOGICAL_PHYSICAL_MISMATCH (logical XOR physical)
3 view information_schema.views + pg_class relkind m (685+1) authoritative_catalog now() fresh
4 field information_schema.columns authoritative_catalog now() fresh (heavy; see perf §6 — exposed, not eagerly scanned)
5 trigger information_schema.triggers (410) authoritative_catalog now() fresh
6 function pg_procpg_namespace (618) authoritative_catalog now() fresh OVERLOADED_FUNCTION (name count > 1)
7 approval apr_action_types (14) authoritative_registry created_at fresh APPROVAL_HANDLER_UNIMPLEMENTED (handler_ref null/'unimplemented')
8 event event_type_registry (40) authoritative_registry created_at fresh
9 procedure pidx_procedure (self) ∪ workflows.process_code (2) authoritative_registry updated_at/date_updated fresh
10 label (PARTIAL) taxonomy (55) authoritative_registry date_created fresh AMBIGUOUS_LABEL (bare code, multi-facet)

Deliberately NOT branches in v0.1 (resolve to UNKNOWN_SOURCE in readiness, emit no inventory rows): io, checker, template, report.

4.3 Anti-monster guard

v_pidx_inventory_current is for browsing / treeview / debugging, not the readiness hot path. It must never be SELECT *-scanned per Agent question. The hot path (§6) uses correlated EXISTS probes keyed by a single procedure's refs. The field branch over information_schema.columns is the only heavy one and is exposed, not eagerly materialized.


5. Task 5 — Readiness View (v_pidx_procedure_readiness) — design pointer

v_pidx_procedure_readiness joins declared pidx_procedure_ingredient rows to live existence (resolved via the §6 narrow probe) and rolls up per procedure. Full precedence logic, the complete status set, warning flags, and the anti-false-green rules are specified in pidx-readiness-logic-v0.1.md. Candidate SQL in pidx-ddl-candidate-v0.1.sql.md §4. Summary of the truth rule:

Only PG/SQL-derived existence may set READY. manifest_jsonb, declared_maturity, notes, seed status, and any future RAG output may route and suggest but can never set READY. If all required ingredients exist but warnings remain, the result is READY_WITH_WARNINGS (or READY + warning_flags), never a silent READY. A zero-ingredient procedure computes UNMAPPED, never READY.


6. Task 6 — Hot-path performance & refresh model (operating model)

Locked rule: Simple view first. Materialize only if measured slow. Event-driven only if the source already emits a change signal. No new scheduler/bus in v0.1.

6.1 The narrow hot path (this is what keeps "PG reads PG" cheap)

Agent asks readiness for ONE procedure_code
  → SELECT the procedure row (PK lookup on pidx_procedure)
  → SELECT its few ingredient rows  (index pidx_procedure_ingredient(procedure_code); tens at most)
  → for each ingredient_ref: ONE narrow correlated EXISTS / point-probe on the right catalog/registry
  → roll up → return readiness
Cost = O(#ingredients of that procedure), NOT O(catalog).
  • Never build the full v_pidx_inventory_current to answer one readiness question.
  • Catalog-derived kinds (dot/collection/view/field/trigger/function) resolve from information_schema/pg_class/dot_tools point lookups → inherently fresh; a pure view is always current, so force_refresh is a no-op in v0.1.
  • Registry-cache kinds (trigger_registry, collection_registry) can trail DDL → when used, mark authority_level='registry_cache' and derive freshness from their date_updated; prefer the catalog over the cache when both exist.

6.2 Freshness vocabulary

fresh | stale_but_acceptable | refresh_required | unknown_source
  • stale_but_acceptable → return with last_seen_at.
  • Sensitive ops demand fresh / force_refresh.
  • unknown_source → never triggers a full catalog sweep.

6.3 What v0.1 must NEVER do

  • Scan all 346 tables / 685 views / 618 functions / entity_labels (852k) for every Agent question.
  • Build a new scheduler, event bus, or refresh engine.
  • Add a materialized view unless measured slow — then materialize only the heavy branch (e.g. field) with last_seen_at + refresh_mode; keep cheap branches live.
  • Ride an existing change signal only if needed later (registry_changelog ~87k, directus_revisions ~343k, _recon_dot_fs_inventory, event_outbox) — v0.1 does not need this.

7. Task 7 (RAG/vector/fuzzy) — DEFER all of it in v0.1

Live fact (T3): installed extensions = btree_gist, pgcrypto, plpgsql, postgres_fdw. No pg_trgm. No vector.

  • Exact questions (exists? missing? ready?) → SQL / Readiness View only. Vector/RAG may never answer exists/missing/ready.
  • Discovery/fuzzy over 6–50 procedures → native ILIKE on pidx_procedure.title/short_note/intent/domain_group. No extension, no pipeline.
  • pg_trgm (trigram ranking) needs CREATE EXTENSION pg_trgmowner-gated, out of read-only scope; propose only if procedure count reaches the hundreds and ILIKE ranking is too blunt. Stays PG-native.
  • Vector/Qdrant is external (iu_qdrant_collection_registry is a PG pointer). Reserve for large scale, two-step only: vector suggests procedure_code → SQL/Readiness confirms exists/missing/route. Never authoritative. DEFER.

8. What v0.1 deliberately does NOT do (non-goals re-asserted)

No workflow engine; no governance registry / KG / birth integration as a prerequisite; no all_objects mega-registry; no auto-fix; no new scheduler/bus; no manifest/RAG/note as truth; no requirement that all 39 procedures or all ingredients be mapped first; no DDL executed by this macro; pidx_procedure_link deferred. The Procedure Index is the eye, not the hand.


9. Open decisions carried to Owner/Codex (none block Codex review)

  1. DDL authorizationpidx_procedure + pidx_procedure_ingredient (+2 views) are net-new objects → a governed/canonical create. Owner must approve via the governed path (patch_ops_code APR or a registered migration DOT). This design does NOT authorize it.
  2. template: sourceiu_collection_template_registry vs design_templates vs none → NEEDS_OWNER_DECISION; until decided, template: stays UNKNOWN_SOURCE.
  3. UNKNOWN_SOURCE kinds acceptable for v0.1 — Owner confirms io/checker/template/report stay UNKNOWN_SOURCE rather than blocking v0.1 to model them.
  4. workflows overlap — confirm procedure:WF-* is indexed (not migrated); the two stores (system procedures vs human BPMN) are intentionally distinct.
  5. pg_trgm — deferred; approve CREATE EXTENSION pg_trgm later as a separate small step only if fuzzy ranking is wanted.

Verdict for this macro: GO_FOR_CODEX_REVIEW (see reports/pidx-build-design-go-no-go-v0.1.md).