KB-93F3

dot-iu-cutter v0.5 — Pre-Scale Index Hot-Path Analysis (design only) (2026-05-17)

Revision 1

Tags: dot-iu-cutter, v0.5, pre-scale-foundation, hot-path-analysis, index, design-only, dieu44

Date: 2026-05-17
Status: DESIGN / READ-ONLY ANALYSIS ONLY (no index DDL, no write, nothing executed).
Parent phase: v0.5 pre-scale foundation (GPT PASS_WITH_BLOCKERS).
Grounding: read-only introspection of deployed cutter_governance columns, plus accepted cutter_agent/phases.py, db_adapter.py @ e93424b5ff7fa5e4b8406131977ce4339cd0856a.

RealPostgresAdapter.find(table, **eq) emits SELECT * FROM cutter_governance.<t> WHERE <col=val …>; the eq keys are the real deployed columns (post v0.4 schema-binding). The CAS update uses WHERE entry_id=%s AND status….
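For orientation, the query shape described above can be sketched as follows. This is a minimal illustration, not the accepted db_adapter.py code: `build_find_sql` is a hypothetical name, joining the predicates with AND is an assumption, and the status value `"pending"` is invented for the example.

```python
from typing import Any

SCHEMA = "cutter_governance"  # schema name taken from the text above


def build_find_sql(table: str, **eq: Any) -> tuple[str, list[Any]]:
    """Hypothetical stand-in for the SQL shape RealPostgresAdapter.find emits."""
    if not table.isidentifier() or not all(col.isidentifier() for col in eq):
        raise ValueError("table and column names must be plain identifiers")
    sql = f"SELECT * FROM {SCHEMA}.{table}"
    if eq:
        # One equality predicate per keyword (assumed to be joined with AND);
        # values stay as %s placeholders so the driver binds them.
        sql += " WHERE " + " AND ".join(f"{col} = %s" for col in eq)
    return sql, list(eq.values())


# "pending" is an illustrative status value, not one confirmed by the source.
sql, params = build_find_sql("decision_backlog_entry", status="pending")
print(sql)
print(params)
```

Because every call reduces to equality predicates on a single table, each hot path below maps directly onto one column (or column set) that either has a btree behind it or does not.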

1. Runtime hot paths (per-IU) → exact column, current index, scale risk

#	Hot path	Code site	Table.column(s) queried	Indexed today?	Scale behaviour
1	MARK idempotency lookup	`phases.mark` → `find(decision_backlog_entry, entry_id=)`	`decision_backlog_entry.entry_id` (uuid, PK)	Yes — PK btree	O(log n), fine at scale
2	SWEEP cursor	`phases.sweep` → `find(decision_backlog_entry, status=)` then sort by `(emitted_at, entry_id)`	`decision_backlog_entry.status` (text); order key `emitted_at,entry_id`	No	Seq scan + sort every sweep → O(n) each, O(n²) per document
3	Lineage lookup (envelope)	`phases._reviews_for_entry` → `find(manifest_envelope, source_doc_ref=)`	`manifest_envelope.source_doc_ref` (text, NN)	No	Seq scan per IU
4	Lineage lookup (review)	`phases._reviews_for_entry` → `find(review_decision, manifest_id=)`	`review_decision.manifest_id` (uuid, NN); live tail filters `superseded_by_review_decision_id IS NULL`	No	Seq scan per IU
5	Cut-once guard	`phases.cut` → `find(cut_change_set, decision_backlog_entry_id=)`	`cut_change_set.decision_backlog_entry_id` (uuid, nullable)	No	Seq scan per IU (G-CUT-ONCE)
6	Verify lookup	`phases.verify` → `find(verify_result, change_set_id=)` (+ prior `find(cut_change_set, decision_backlog_entry_id=)`)	`verify_result.change_set_id` (uuid)	No	Seq scan per IU
7	DOT signature lookup	xref/postcondition + future revoke/chain reads	`dot_pair_signature.cross_reference_change_set_id`, `…_verify_result_id` (uuid, deployed XOR — exactly one non-null); `prior_signature_id`	No	Seq scan when queried; XOR ⇒ partial-index candidate
8	Manifest / unit-block lookup	`phases.review` writes; reads by envelope	`manifest_unit_block (envelope_id, unit_local_id)` composite PK; `manifest_envelope.envelope_id` PK	Yes — composite PK / PK (envelope-prefix lookups covered)	OK; no new index needed
9	Dependency guard (per-IU)	`phases.cut` → `find(decision_backlog_dependency, from_entry_id=)`	`decision_backlog_dependency.from_entry_id` (uuid, NN)	No	Seq scan (0 rows today; additive-only later)
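The difference between an unindexed and an indexed lookup for row 2 (SWEEP) can be observed with a small self-contained experiment. The sketch below uses SQLite's in-memory engine as a stand-in, so the plan text is not what PostgreSQL would print. The reduced table shape, the `payload` column, the index name, the `"pending"` value, and pushing the sort into the query as ORDER BY are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical shape of decision_backlog_entry (SQLite types).
conn.execute(
    "CREATE TABLE decision_backlog_entry ("
    "entry_id TEXT PRIMARY KEY, status TEXT NOT NULL, "
    "emitted_at TEXT NOT NULL, payload TEXT)"
)

# Status filter plus the (emitted_at, entry_id) order key from row 2.
SWEEP_QUERY = (
    "SELECT * FROM decision_backlog_entry WHERE status = ? "
    "ORDER BY emitted_at, entry_id"
)


def plan(query: str) -> list[str]:
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("pending",)).fetchall()
    return [row[3] for row in rows]


plan_before = plan(SWEEP_QUERY)  # no index on status yet
conn.execute(
    "CREATE INDEX idx_dbe_status_keyset "
    "ON decision_backlog_entry (status, emitted_at, entry_id)"
)
plan_after = plan(SWEEP_QUERY)  # same query, index now available

print(plan_before)
print(plan_after)
```

Before the index exists, the plan reports a table scan; afterwards it names the new index. The same before/after comparison on the deployed schema would use PostgreSQL's EXPLAIN, which this analysis did not run.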

2. Why single-IU was fine but full-document is not

The single-IU trial ran against an empty family (baseline all 0), so every seq scan touched ≤1 row. A full document drives N independent per-IU pipelines while the tables grow to ~15·N rows. Each per-IU lookup in rows 2–7 re-scans the growing table, so the i-th IU scans on the order of 15·i rows, and the aggregate cost over N IUs is ≈ O(N²). At ~300–500 IUs (~5,000–7,500 governance rows) this is the dominant cost and the concrete reason GPT blocked full-document/bulk runs pending pre-scale index DDL.
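The quadratic growth can be made concrete with a deliberately simplified model: assume each IU performs one seq scan over all governance rows written by the IUs before it, at ~15 rows per IU. The function name and the one-scan-per-IU simplification are assumptions; the real pipeline does several lookups per IU across separate tables, so this only shows the order of growth.

```python
ROWS_PER_IU = 15  # the ~15·N growth figure from the paragraph above


def rows_scanned(n_ius: int, rows_per_iu: int = ROWS_PER_IU) -> int:
    """Rows touched when IU number i seq-scans the rows left by the i earlier IUs."""
    return sum(rows_per_iu * i for i in range(n_ius))


for n in (1, 300, 500):
    print(n, rows_scanned(n))
# 1 0
# 300 672750
# 500 1871250
```

Under this model the single-IU case scans nothing, while 500 IUs touch about 1.9 million rows for what would be a few hundred index probes if the columns were indexed.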

3. Indexed-vs-unindexed summary

Already adequate (no new index): #1 MARK (PK), #8 manifest/unit-block (composite PK / PK), cut_change_set unique(idempotency_key)/(rollback_key), all PKs.
Needs additive index before scale: #2 SWEEP (decision_backlog_entry.status + keyset), #3 (manifest_envelope.source_doc_ref), #4 (review_decision.manifest_id), #5 (cut_change_set.decision_backlog_entry_id), #6 (verify_result.change_set_id), #7 (dot_pair_signature xref cols, partial), #9 (decision_backlog_dependency.from_entry_id).

The concrete additive index proposals (CREATE INDEX CONCURRENTLY, no rewrite, no semantic/data migration) are specified — proposal only, not authorized — in …-index-only-ddl-design-for-runtime-paths-….
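To show the general shape such additive statements take, the sketch below only builds DDL text from the columns listed in the summary; it connects to nothing and executes nothing. It is not the proposal referenced above: the index names, the column order chosen for #2, and the partial predicate for #7 are assumptions, and the second #7 column is left out because its full name is elided in this document.

```python
from typing import Optional

SCHEMA = "cutter_governance"

# (hot-path row, table, columns, optional partial-index predicate)
CANDIDATES = [
    (2, "decision_backlog_entry", ("status", "emitted_at", "entry_id"), None),
    (3, "manifest_envelope", ("source_doc_ref",), None),
    (4, "review_decision", ("manifest_id",), None),
    (5, "cut_change_set", ("decision_backlog_entry_id",), None),
    (6, "verify_result", ("change_set_id",), None),
    # XOR column pair: index only the rows where this side is populated.
    (7, "dot_pair_signature", ("cross_reference_change_set_id",),
     "cross_reference_change_set_id IS NOT NULL"),
    (9, "decision_backlog_dependency", ("from_entry_id",), None),
]


def index_ddl(table: str, columns: tuple[str, ...],
              predicate: Optional[str] = None) -> str:
    """Build one CREATE INDEX CONCURRENTLY statement as text (hypothetical names)."""
    name = "ix_" + table + "_" + "_".join(columns)
    ddl = (f"CREATE INDEX CONCURRENTLY {name} "
           f"ON {SCHEMA}.{table} ({', '.join(columns)})")
    if predicate:
        ddl += f" WHERE {predicate}"
    return ddl + ";"


for row, table, columns, predicate in CANDIDATES:
    print(f"-- hot path #{row}")
    print(index_ddl(table, columns, predicate))
```

Each generated statement is additive and touches no existing column or constraint, which matches the "no rewrite, no semantic/data migration" boundary stated above.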

Boundaries / Git

Read-only analysis only; no index DDL, no write, no code/commit. Git: branch main · HEAD e93424b5ff7fa5e4b8406131977ce4339cd0856a · git status --short -- iu-cutter = clean (0 lines). No fixed IP/DSN/password/container/vector-collection; no runtime label/key hardcoding; no schema change; SQL = SSOT. Next = GPT review.