PIDX Build Design v0.1 — PG-native Procedure Index (construction-ready)
PIDX Build Design v0.1 — PG-native Procedure Index
Path:
knowledge/dev/laws-new/workflow-manage/design/pidx-build-design-v0.1.mdStatus: BUILD DESIGN · v0.1 · DESIGN ONLY · NON-AUTHORIZING · 0 DDL executed · 0 PG objects created · 0 PG/Directus/DOT mutation Date: 2026-06-23 Macro: T2 — Hardening + Build Design for PG-native Procedure Index v0.1 Inputs hardened:de-bai-quan-ly-quy-trinh.md(rev21/v0.8),reports/wf-procedure-index-readonly-survey-2026-06-23.md,design/ref-grammar-v0.1.md,design/minimal-technical-design-v0.1.md,design/seed-procedure-candidates-v0.1.md,reports/wf-procedure-index-go-no-go-v0.1.md. Companions (this package):pidx-ddl-candidate-v0.1.sql.md,pidx-readiness-logic-v0.1.md,pidx-seed-slice-v0.1.md,pidx-test-plan-v0.1.md,pidx-codex-review-packet-v0.1.md,reports/pidx-build-design-go-no-go-v0.1.md. Principle locked: PG reads PG. PG is the truth. The Procedure Index is the eye, not the hand. The Ingredient Map is lazy declaration/cache only. Readiness is computed from PG/SQL facts. Vector/RAG/fuzzy may suggest candidates only; SQL confirms truth.
0. What this macro is and is NOT
This is a build-design hardening of the accepted GO-CONDITIONAL result. It converts the T3 survey + minimal design into a construction-ready package Codex can review before Owner approves any build.
- NOT a build macro. No DDL/DML executed. No tables/views/functions/triggers created. No migration, no Directus collection, no DOT registration, no scheduler/bus, no PG/Directus/DOT/governance/birth/KG/production mutation.
- All SQL in this package is candidate text only. The only writes performed by this macro are KB design/report documents (this file and its companions).
1. Task 1 — Final object naming & scope (decision)
1.1 Final names (frozen for v0.1)
| Role | Final v0.1 name | Kind | T3 working name (superseded) |
|---|---|---|---|
| Procedure memory | pidx_procedure |
table | wf_procedure |
| Ingredient declaration (lazy cache) | pidx_procedure_ingredient |
table | wf_procedure_ingredient |
| Inventory "pane of glass" over PG | v_pidx_inventory_current |
view | wf_inventory_current |
| Computed readiness rollup | v_pidx_procedure_readiness |
view | wf_procedure_readiness |
| Procedure↔procedure links | pidx_procedure_link |
table | DEFERRED — not in v0.1 |
Decision: use the pidx_* prefix, not wf_*. Justification (this resolves the GO/NO-GO §4 open risk):
- The live DB already has a populated
wf_*family that is a different thing —wf_census,wf_orphan*,wf_process_candidate(19),wf_fs_dot_bin_snapshot(289),wf_host_crontab_snapshot(54),workflow*(BPMN). Naming the Procedure Indexwf_procedure*invites conceptual collision with that scanner/snapshot family even though the exact names are free. pidx_= Procedure Index. It reads as "an index/con mắt," not "a workflow engine," which is exactly the boundary the problem statement enforces (eye, not hand).- Views keep the substrate's own idiom: a
v_prefix (the DB already carries 49v_*readiness*views + 685 views total). Sov_pidx_inventory_current/v_pidx_procedure_readiness. - No strong justification exists to keep
wf_*, therefore the defaultpidx_*stands. (The macro requireswf_*be kept only with explicit justification; none is warranted.)
1.2 Scope clarifications (locked)
pidx= Procedure Index — an index / con mắt. It is not a workflow engine, governance layer, registry mesh, KG, or DOT runtime.workflow-manageis the folder/domain name only. It does not imply a workflow engine.- The existing
workflows(2 rows, BPMN,process_code=WF-001/WF-002),workflow_steps(70),workflow_categories(3),wf_process_candidate(19) are source inventory, indexed as the object kindprocedure:WF-*. They are not replaced, migrated, or extended bypidx_*.
1.3 Build scope: 2T2V (corrects the T3 "3T2V" label)
The v0.1 prototype is 2 tables + 2 views — "2T2V", not "3T2V".
This corrects the GO/NO-GO verdict-code string ..._GO_CONDITIONAL_OWNER_APPROVES_3T2V, which mislabeled the Option B-minus shape. B-minus is:
2 tables: pidx_procedure, pidx_procedure_ingredient
2 views: v_pidx_inventory_current, v_pidx_procedure_readiness
pidx_procedure_link is deferred out of v0.1 (procedure links live, when needed, in a links_jsonb note column on pidx_procedure; promote to a real pidx_procedure_link table only when M2M links are queried enough to need indexing). The "3 tables + a few views" line in the problem statement (§9) is an upper ceiling, not a build target — §9 explicitly says "if fewer than 3 tables works, choose fewer." 2T2V honors that.
2. Architecture at a glance
┌──────────────────────────────────────────────────────────┐
DECLARED │ pidx_procedure one row = one known procedure │ (lazy human/Agent
(hints, │ pidx_procedure_ingredient one row = one declared need │ cache; NEVER truth)
cache) └───────────────┬──────────────────────────────────────────┘
│ ingredient_ref (ref grammar v0.1 = the join key)
▼
COMPUTED ┌──────────────────────────────────────────────────────────┐
(PG soi PG, │ v_pidx_procedure_readiness per-ingredient + per-proc │
SQL truth) │ ⇗ resolves each ref via narrow EXISTS probe on: │
│ v_pidx_inventory_current UNION over EXISTING sources │
└───────────────┬──────────────────────────────────────────┘
│ object_ref (same ref grammar)
▼
GROUND TRUTH dot_tools · information_schema/pg_catalog · apr_action_types ·
event_type_registry · taxonomy · workflows · directus_collections …
- Declared layer (
pidx_procedure,pidx_procedure_ingredient): what a procedure says it is and needs. Stores only declared fields. Never stores computed existence/readiness. - Computed layer (
v_pidx_*views): what PG can prove.v_pidx_inventory_currentnormalizes existing PG sources into a common contract;v_pidx_procedure_readinessjoins declared ingredient refs to live existence and rolls up readiness. - The join key that fuses the two layers is the ref grammar (
ingredient_ref≡object_ref). It is the single most load-bearing artifact; frozen in §3 here and inpidx-readiness-logic-v0.1.md.
3. Task 2 — Ref Grammar v0.1 freeze (build-facing summary)
Ref grammar v0.1 is FROZEN for the build. Canonical shape:
<kind>:<identifier>
3.1 Two normalization classes
- Class CATALOG — identifier is a PG object path. Kinds:
collection,view,field,trigger,function.- Lowercased whole (PG folds unquoted identifiers).
- Schema required-or-defaulted to
public; store the resolved schema-qualified form (collection:public.dot_tools, nevercollection:dot_tools). - Existence proven against
information_schema/pg_catalog→ always real-timefresh.
- Class CODE — identifier is a governed code. Kinds:
dot,approval,label,procedure,event,io,checker,template,report.- Case-SENSITIVE — stored exactly as the SSOT stores it (
DOT_KG_EXPLAIN/DOT-063upper;patch_ops_codelower_snake; taxonomy mixed). A case mismatch is a real miss by design — do NOT auto-fold. - No schema. Optional facet/namespace qualifier where the code is not globally unique (
label:<facet>.<code>,event:<domain>.<type>). - Existence proven against the registry table for that kind.
- Case-SENSITIVE — stored exactly as the SSOT stores it (
3.2 The four resolution states (load-bearing distinction)
| State | Meaning |
|---|---|
| MISSING | Source is known, object not found → "go create it." |
| UNKNOWN_SOURCE | No clean PG source/probe exists for the kind → "triage the source; do NOT fabricate." |
| INVALID_REF | Ref cannot be parsed / not normalized (ref_status='UNNORMALIZED' or structurally malformed). |
| READ_BLOCKED | Source exists but the schema is read-denied (cutter_governance.*, sandbox_tac.*). |
No silent fuzzy matching is allowed in readiness. If a ref cannot be confidently normalized, store it raw with
ref_status='UNNORMALIZED'→ Readiness yieldsINVALID_REF, never a guessed join.
3.3 Per-kind freeze table (v0.1)
| kind | class | canonical pattern | PG probe (live source) | case | schema/ns | not-found fallback | v0.1 verdict |
|---|---|---|---|---|---|---|---|
dot |
CODE | dot:<CODE> |
dot_tools.code (309) |
sensitive | none | MISSING |
ENFORCE |
collection |
CATALOG | collection:<schema>.<table> |
information_schema.tables ∪ directus_collections |
lower | default public |
MISSING |
ENFORCE |
view |
CATALOG | view:<schema>.<view> |
information_schema.views + pg_class relkind in ('v','m') |
lower | default public |
MISSING |
ENFORCE |
field |
CATALOG | field:<schema>.<table>.<column> |
information_schema.columns |
lower | default public |
MISSING |
ENFORCE |
trigger |
CATALOG | trigger:<schema>.<table>.<name> |
information_schema.triggers (410) |
lower | default public |
MISSING |
ENFORCE |
function |
CATALOG | function:<schema>.<name> |
pg_proc ⋈ pg_namespace (name-only) (618) |
lower | default public |
MISSING |
ENFORCE (overload caveat) |
approval |
CODE | approval:<action_code> |
apr_action_types.action_code (14) |
sensitive | none | MISSING |
ENFORCE |
event |
CODE | event:<domain>.<type> |
event_type_registry(event_domain,event_type) (40) |
sensitive | domain ns | MISSING |
ENFORCE |
procedure |
CODE | procedure:<code> |
pidx_procedure.procedure_code then workflows.process_code |
sensitive | none | MISSING |
ENFORCE |
label |
CODE | label:<code> or label:<facet>.<code> |
taxonomy.code (facet-scoped) (55) |
sensitive | optional facet | AMBIGUOUS→meta, else MISSING |
PARTIAL (facet caveat) |
io |
CODE | io:<code> |
only dot_agent_api_contract.dot_code (narrow, 2) |
sensitive | none | UNKNOWN_SOURCE |
UNKNOWN_SOURCE |
checker |
CODE | checker:<code> |
scattered, no SSOT | sensitive | none | UNKNOWN_SOURCE |
UNKNOWN_SOURCE |
template |
CODE | template:<code> |
unconfirmed (NEEDS_OWNER_DECISION) | sensitive | none | UNKNOWN_SOURCE |
UNKNOWN_SOURCE |
report |
CODE | report:<code> |
KB/agent-data (external to PG) | sensitive | none | UNKNOWN_SOURCE |
UNKNOWN_SOURCE |
Frozen v0.1 split: ENFORCE = dot, collection, view, field, trigger, function, approval, event, procedure; PARTIAL = label; UNKNOWN_SOURCE (always, never MISSING) = io, checker, template, report. Full per-kind normalization/probe detail lives in pidx-readiness-logic-v0.1.md §2.
4. Task 4 — Inventory View contract (v_pidx_inventory_current)
A VIEW = a UNION ALL of per-kind sub-selects over existing sources. It never copies the object universe; each branch reads its own catalog/registry and emits a normalized row. (Full candidate SQL in pidx-ddl-candidate-v0.1.sql.md §3.)
4.1 Common output contract (columns)
object_kind text -- ref-grammar kind
object_ref text -- canonical normalized ref (the join key; grammar §3)
object_code text -- raw code/name as stored in source
object_name text -- human label if available
source_schema text -- e.g. public
source_object text -- source table/catalog (dot_tools, information_schema.tables, …)
source_pk text -- source row identity (code / schema.table / oid)
exists_bool boolean -- a materialized source row = it exists (always true within the view)
authority_level text -- authoritative_catalog | authoritative_registry | registry_cache | declared | unknown
metadata_jsonb jsonb -- kind-specific extras (risk_level, handler_ref, overloaded, logical/physical, facet…)
freshness_status text -- fresh | stale_but_acceptable | refresh_required | unknown_source
last_seen_at timestamptz
notes text
warning_flags text[] -- e.g. {OVERLOADED_FUNCTION}, {APPROVAL_HANDLER_UNIMPLEMENTED}, {AMBIGUOUS_LABEL}, {LOGICAL_PHYSICAL_MISMATCH}
4.2 v0.1 UNION branches (9 SAFE + 1 PARTIAL), mapped to T3 source map
| # | object_kind | source(s) | authority_level | last_seen_at | freshness | warning flags emitted |
|---|---|---|---|---|---|---|
| 1 | dot |
dot_tools (309) |
authoritative_registry | date_updated |
fresh | — |
| 2 | collection |
information_schema.tables (346) LEFT JOIN directus_collections |
authoritative_catalog | now() | fresh | LOGICAL_PHYSICAL_MISMATCH (logical XOR physical) |
| 3 | view |
information_schema.views + pg_class relkind m (685+1) |
authoritative_catalog | now() | fresh | — |
| 4 | field |
information_schema.columns |
authoritative_catalog | now() | fresh | — (heavy; see perf §6 — exposed, not eagerly scanned) |
| 5 | trigger |
information_schema.triggers (410) |
authoritative_catalog | now() | fresh | — |
| 6 | function |
pg_proc ⋈ pg_namespace (618) |
authoritative_catalog | now() | fresh | OVERLOADED_FUNCTION (name count > 1) |
| 7 | approval |
apr_action_types (14) |
authoritative_registry | created_at |
fresh | APPROVAL_HANDLER_UNIMPLEMENTED (handler_ref null/'unimplemented') |
| 8 | event |
event_type_registry (40) |
authoritative_registry | created_at |
fresh | — |
| 9 | procedure |
pidx_procedure (self) ∪ workflows.process_code (2) |
authoritative_registry | updated_at/date_updated |
fresh | — |
| 10 | label (PARTIAL) |
taxonomy (55) |
authoritative_registry | date_created |
fresh | AMBIGUOUS_LABEL (bare code, multi-facet) |
Deliberately NOT branches in v0.1 (resolve to UNKNOWN_SOURCE in readiness, emit no inventory rows): io, checker, template, report.
4.3 Anti-monster guard
v_pidx_inventory_current is for browsing / treeview / debugging, not the readiness hot path. It must never be SELECT *-scanned per Agent question. The hot path (§6) uses correlated EXISTS probes keyed by a single procedure's refs. The field branch over information_schema.columns is the only heavy one and is exposed, not eagerly materialized.
5. Task 5 — Readiness View (v_pidx_procedure_readiness) — design pointer
v_pidx_procedure_readiness joins declared pidx_procedure_ingredient rows to live existence (resolved via the §6 narrow probe) and rolls up per procedure. Full precedence logic, the complete status set, warning flags, and the anti-false-green rules are specified in pidx-readiness-logic-v0.1.md. Candidate SQL in pidx-ddl-candidate-v0.1.sql.md §4. Summary of the truth rule:
Only PG/SQL-derived existence may set
READY.manifest_jsonb,declared_maturity, notes, seed status, and any future RAG output may route and suggest but can never setREADY. If all required ingredients exist but warnings remain, the result isREADY_WITH_WARNINGS(orREADY+warning_flags), never a silentREADY. A zero-ingredient procedure computesUNMAPPED, neverREADY.
6. Task 6 — Hot-path performance & refresh model (operating model)
Locked rule: Simple view first. Materialize only if measured slow. Event-driven only if the source already emits a change signal. No new scheduler/bus in v0.1.
6.1 The narrow hot path (this is what keeps "PG reads PG" cheap)
Agent asks readiness for ONE procedure_code
→ SELECT the procedure row (PK lookup on pidx_procedure)
→ SELECT its few ingredient rows (index pidx_procedure_ingredient(procedure_code); tens at most)
→ for each ingredient_ref: ONE narrow correlated EXISTS / point-probe on the right catalog/registry
→ roll up → return readiness
Cost = O(#ingredients of that procedure), NOT O(catalog).
- Never build the full
v_pidx_inventory_currentto answer one readiness question. - Catalog-derived kinds (
dot/collection/view/field/trigger/function) resolve frominformation_schema/pg_class/dot_toolspoint lookups → inherentlyfresh; a pure view is always current, soforce_refreshis a no-op in v0.1. - Registry-cache kinds (
trigger_registry,collection_registry) can trail DDL → when used, markauthority_level='registry_cache'and derive freshness from theirdate_updated; prefer the catalog over the cache when both exist.
6.2 Freshness vocabulary
fresh | stale_but_acceptable | refresh_required | unknown_source
stale_but_acceptable→ return withlast_seen_at.- Sensitive ops demand
fresh/force_refresh. unknown_source→ never triggers a full catalog sweep.
6.3 What v0.1 must NEVER do
- Scan all 346 tables / 685 views / 618 functions /
entity_labels(852k) for every Agent question. - Build a new scheduler, event bus, or refresh engine.
- Add a materialized view unless measured slow — then materialize only the heavy branch (e.g.
field) withlast_seen_at+refresh_mode; keep cheap branches live. - Ride an existing change signal only if needed later (
registry_changelog~87k,directus_revisions~343k,_recon_dot_fs_inventory,event_outbox) — v0.1 does not need this.
7. Task 7 (RAG/vector/fuzzy) — DEFER all of it in v0.1
Live fact (T3): installed extensions = btree_gist, pgcrypto, plpgsql, postgres_fdw. No pg_trgm. No vector.
- Exact questions (exists? missing? ready?) → SQL / Readiness View only. Vector/RAG may never answer exists/missing/ready.
- Discovery/fuzzy over 6–50 procedures → native
ILIKEonpidx_procedure.title/short_note/intent/domain_group. No extension, no pipeline. pg_trgm(trigram ranking) needsCREATE EXTENSION pg_trgm— owner-gated, out of read-only scope; propose only if procedure count reaches the hundreds and ILIKE ranking is too blunt. Stays PG-native.- Vector/Qdrant is external (
iu_qdrant_collection_registryis a PG pointer). Reserve for large scale, two-step only: vector suggestsprocedure_code→ SQL/Readiness confirms exists/missing/route. Never authoritative. DEFER.
8. What v0.1 deliberately does NOT do (non-goals re-asserted)
No workflow engine; no governance registry / KG / birth integration as a prerequisite; no all_objects mega-registry; no auto-fix; no new scheduler/bus; no manifest/RAG/note as truth; no requirement that all 39 procedures or all ingredients be mapped first; no DDL executed by this macro; pidx_procedure_link deferred. The Procedure Index is the eye, not the hand.
9. Open decisions carried to Owner/Codex (none block Codex review)
- DDL authorization —
pidx_procedure+pidx_procedure_ingredient(+2 views) are net-new objects → a governed/canonical create. Owner must approve via the governed path (patch_ops_codeAPR or a registered migration DOT). This design does NOT authorize it. template:source —iu_collection_template_registryvsdesign_templatesvs none → NEEDS_OWNER_DECISION; until decided,template:staysUNKNOWN_SOURCE.UNKNOWN_SOURCEkinds acceptable for v0.1 — Owner confirmsio/checker/template/reportstayUNKNOWN_SOURCErather than blocking v0.1 to model them.workflowsoverlap — confirmprocedure:WF-*is indexed (not migrated); the two stores (system procedures vs human BPMN) are intentionally distinct.pg_trgm— deferred; approveCREATE EXTENSION pg_trgmlater as a separate small step only if fuzzy ranking is wanted.
Verdict for this macro: GO_FOR_CODEX_REVIEW (see reports/pidx-build-design-go-no-go-v0.1.md).