dot-iu-cutter v0.5 — Pre-Scale Index Hot-Path Analysis (design only) (2026-05-17)
Date: 2026-05-17 · Status: DESIGN / READ-ONLY ANALYSIS ONLY — no index DDL, no write, nothing executed. Parent phase: v0.5 pre-scale foundation (GPT PASS_WITH_BLOCKERS).
Grounding: read-only introspection of the deployed cutter_governance columns, plus the accepted cutter_agent/phases.py, db_adapter.py @ e93424b5ff7fa5e4b8406131977ce4339cd0856a. RealPostgresAdapter.find(table, **eq) emits SELECT * FROM cutter_governance.<t> WHERE <col=val …>; the eq keys are the real deployed columns (post v0.4 schema-binding). The CAS update uses WHERE entry_id=%s AND status….
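To make the query shape concrete, here is a minimal sketch of an equality-filter builder that produces the SELECT form described above. It is not the accepted db_adapter.py code: the function name build_find_query, the identifier check, and the example status value are assumptions made for illustration only.

```python
from typing import Any


def build_find_query(table: str, **eq: Any) -> tuple[str, list[Any]]:
    """Hypothetical stand-in for the query shape of find(table, **eq).

    Not the real adapter code; it only mirrors the documented SELECT form.
    """
    # Assumption: names are plain identifiers, since they are interpolated as text.
    if not table.isidentifier() or not all(col.isidentifier() for col in eq):
        raise ValueError("table and column names must be plain identifiers")
    sql = f"SELECT * FROM cutter_governance.{table}"
    if eq:
        # One equality predicate per keyword; values travel as bound parameters.
        sql += " WHERE " + " AND ".join(f"{col} = %s" for col in eq)
    return sql, list(eq.values())


# "pending" is a made-up status value used only for this example.
sql, params = build_find_query("decision_backlog_entry", status="pending")
print(sql)
print(params)
```

Because every hot path goes through this single equality shape, the columns named in the eq keys are exactly the columns whose index coverage matters.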
1. Runtime hot paths (per-IU) → exact column, current index, scale risk
| # | Hot path | Code site | Table.column(s) queried | Indexed today? | Scale behaviour |
|---|---|---|---|---|---|
| 1 | MARK idempotency lookup | phases.mark → find(decision_backlog_entry, entry_id=) | decision_backlog_entry.entry_id (uuid, PK) | Yes (PK btree) | O(log n), fine at scale |
| 2 | SWEEP cursor | phases.sweep → find(decision_backlog_entry, status=) then sort by (emitted_at, entry_id) | decision_backlog_entry.status (text); order key emitted_at,entry_id | No | Seq scan + sort every sweep → O(n) each, O(n²) per document |
| 3 | Lineage lookup (envelope) | phases._reviews_for_entry → find(manifest_envelope, source_doc_ref=) | manifest_envelope.source_doc_ref (text, NN) | No | Seq scan per IU |
| 4 | Lineage lookup (review) | phases._reviews_for_entry → find(review_decision, manifest_id=) | review_decision.manifest_id (uuid, NN); live tail filters superseded_by_review_decision_id IS NULL | No | Seq scan per IU |
| 5 | Cut-once guard | phases.cut → find(cut_change_set, decision_backlog_entry_id=) | cut_change_set.decision_backlog_entry_id (uuid, nullable) | No | Seq scan per IU (G-CUT-ONCE) |
| 6 | Verify lookup | phases.verify → find(verify_result, change_set_id=) (+ prior find(cut_change_set, decision_backlog_entry_id=)) | verify_result.change_set_id (uuid) | No | Seq scan per IU |
| 7 | DOT signature lookup | xref/postcondition + future revoke/chain reads | dot_pair_signature.cross_reference_change_set_id, …_verify_result_id (uuid, deployed XOR: exactly one non-null); prior_signature_id | No | Seq scan when queried; XOR ⇒ partial-index candidate |
| 8 | Manifest / unit-block lookup | phases.review writes; reads by envelope | manifest_unit_block (envelope_id, unit_local_id) composite PK; manifest_envelope.envelope_id PK | Yes: composite PK / PK (envelope-prefix lookups covered) | OK; no new index needed |
| 9 | Dependency guard (per-IU) | phases.cut → find(decision_backlog_dependency, from_entry_id=) | decision_backlog_dependency.from_entry_id (uuid, NN) | No | Seq scan (0 rows today; additive-only later) |
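
The SWEEP cursor in row 2 can be pictured with a small in-memory model: filter on status, then order by (emitted_at, entry_id). This is an illustrative sketch, not the phases.sweep implementation. The Entry type, the use of str for entry_id (uuid in the deployed schema), and the status values are assumptions.

```python
from datetime import datetime
from typing import Iterable, NamedTuple


class Entry(NamedTuple):
    # Hypothetical row shape; only the three columns named in row 2.
    entry_id: str
    status: str
    emitted_at: datetime


def sweep_order(entries: Iterable[Entry], status: str) -> list[Entry]:
    # Equality filter on status: without an index this visits every row.
    matching = [e for e in entries if e.status == status]
    # Keyset order: emitted_at first, entry_id breaks ties deterministically.
    return sorted(matching, key=lambda e: (e.emitted_at, e.entry_id))


t = datetime(2026, 5, 17)
rows = [Entry("b", "marked", t), Entry("a", "marked", t), Entry("c", "done", t)]
print([e.entry_id for e in sweep_order(rows, "marked")])
```

The tie-break on entry_id is what makes the order stable when two entries share an emitted_at value, which is why both columns appear in the order key.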
2. Why single-IU was fine but full-document is not
The single-IU trial ran against an empty table family (all baseline counts 0), so every seq scan touched ≤1 row. A full document drives N independent per-IU pipelines while the tables grow to ~15·N rows; each per-IU lookup in rows 2–7 re-scans its growing table, so aggregate cost is ≈ O(n²). At ~300–500 IUs (~5,000–7,500 governance rows) this is the dominant cost, and it is the concrete reason GPT blocked full-document/bulk runs pending pre-scale index DDL.
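The quadratic growth can be checked with a toy cost model. The sketch below is a deliberate simplification, not a measurement: it assumes six unindexed lookups per IU (rows 2–7) and that each scanned table holds one row per IU processed so far. Both parameters and the function name rows_scanned are assumptions.

```python
def rows_scanned(n_ius: int, scans_per_iu: int = 6, rows_per_iu_per_table: int = 1) -> int:
    """Total rows touched by sequential scans over one document (toy model)."""
    total = 0
    for i in range(1, n_ius + 1):
        # Size of each scanned table once the i-th IU has been written.
        table_rows = rows_per_iu_per_table * i
        # Every unindexed lookup for this IU re-reads the whole table.
        total += scans_per_iu * table_rows
    return total


for n in (1, 300, 500):
    print(n, rows_scanned(n))
```

Under these assumptions the total is 3·N·(N+1), so doubling the number of IUs roughly quadruples the rows scanned, whereas an indexed lookup would stay near O(log n) per IU.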
3. Indexed-vs-unindexed summary
- Already adequate (no new index): #1 MARK (PK), #8 manifest/unit-block (composite PK / PK), cut_change_set unique(idempotency_key)/(rollback_key), all PKs.
- Needs additive index before scale: #2 SWEEP (decision_backlog_entry.status + keyset), #3 (manifest_envelope.source_doc_ref), #4 (review_decision.manifest_id), #5 (cut_change_set.decision_backlog_entry_id), #6 (verify_result.change_set_id), #7 (dot_pair_signature xref cols, partial), #9 (decision_backlog_dependency.from_entry_id).
The concrete additive index proposals (CREATE INDEX CONCURRENTLY, no rewrite, no semantic/data migration) are specified — proposal only, not authorized — in …-index-only-ddl-design-for-runtime-paths-….
Boundaries / Git
Read-only analysis only; no index DDL, no write, no code/commit. Git: branch main · HEAD e93424b5ff7fa5e4b8406131977ce4339cd0856a · git status --short -- iu-cutter = clean (0 lines). No fixed IP/DSN/password/container/vector-collection; no runtime label/key hardcoding; no schema change; SQL = SSOT. Next = GPT review.