KB-5DD9

dot-iu-cutter v0.5 — Full-Document Dry-Run-at-Volume Plan (design only; not executed) (2026-05-17)

4 min read Revision 1
dot-iu-cutterv0.5pre-scale-foundationdry-run-at-volumedesign-onlydieu44

dot-iu-cutter v0.5 — Full-Document Dry-Run-at-Volume Plan

Date: 2026-05-17 · Status: DESIGN / PLAN ONLY — NOT executed. No production, no CUT/VERIFY, no provisioning. Parent: v0.5 pre-scale foundation.

1. Purpose & non-negotiable gate

Validate the per-IU pipeline at document volume in an isolated env before any bulk/production full-document cut. No bulk production cut until this dry-run-at-volume PASSes (and the index-DDL + tier-normalisation cycles are resolved).

2. Environment (proven pattern, reused)

Same as the RERUN#4/production-trial dry-run discipline: ephemeral, exact-name postgres:16 (e.g. pg-dryrun-v0.5-volume-2026-..), no published port, fresh read-only pg_dump of production restored in, dry-run-only cutter_exec/cutter_verify (dry-run passwords, not prod SCRAM), G-10 DR-sysid ≠ prod 7611578671664259111 hard abort, ephemeral psycopg3 harness mounting accepted e93424b5… iu-cutter read-only (no code change), env-destruction teardown, 3 protected prior dry-run envs untouched (exact-name, no prune). Production accessed read-only only.

3. Volume model

  • Target ≈ 300–500 leaf IUs (Hiến-pháp-scale worked example; exact = post-ingestion, OD-5). Source = a synthetic/sampled fixture set OR a restored copy's existing corpus replayed N× — OD-V1 (synthetic vs real-sample); no real Hiến pháp ingestion here.
  • Expected governance rows = 15 × IU (validated invariant): ~4,500–7,500 rows. Per-table expectation = the per-IU +15 matrix × N (cell-for-cell, like verification-plan r3 but ×N).

4. Batching + checkpoint/resume

  • Drive IUs in deterministic order (canonical_address / sort_order) in bounded batches (size config-driven, clamped — reuse DOT_CUTTER_SWEEP_BATCH-style knob, no hardcoded number).
  • Checkpoint after each batch: persisted progress = the set of completed entry_ids (deterministic uuid5 of the idempotency key).
  • Resume: re-running a batch re-MARKs; phases.mark idempotency resolves each already-done IU to its existing entry (PK lookup) → no duplicate, no double-cut (G-CUT-ONCE). Safe stop/resume by construction.
  • Per-IU failure → STOP that IU, preserve prior committed IUs (append-only), forward-compensate the failed IU, honest report; no document-wide rollback/delete.

5. Measurement (the point of "at volume")

  • With the proposed additive indexes applied in the dry-run env only: EXPLAIN the §hot-path predicates → assert index scan, not seq scan; capture per-batch wall time and growth curve (must be ~linear, not O(n²)).
  • Optionally one unindexed baseline batch to quantify the index benefit (OD-I3 / OD-V2).
  • Verdict = PASS iff: per-IU +15 invariant holds ×N cell-for-cell; lineage/sweep/G-CUT-ONCE correct at volume; timing scales ~linearly with indexes; negative/idempotency Δ=0 (replay no-dupe); no forbidden SQL; isolation + production-untouched + protected-envs-untouched; teardown net-zero.

6. Sequencing dependencies

Dry-run-at-volume is gated by: index-only-DDL design GPT-approved (so indexes can be applied in the dry-run env), volume-source decided (OD-V1), batch policy confirmed. It in turn gates any staged production full-document trial. None of this is authorized to execute now.

7. Open decisions for GPT

  • OD-V1 Volume source: synthetic fixtures vs replay of restored real corpus.
  • OD-V2 Measure unindexed baseline too (evidence) vs indexed-only.
  • OD-V3 Volume target N for the first dry-run (full ~500 vs a representative subset first).
  • OD-V4 Batch size policy/knob + checkpoint persistence location (dry-run-only file vs DR table).

Boundaries / Git

Plan only — nothing executed/provisioned; no production, CUT/VERIFY, code, commit. Git main · e93424b5ff7fa5e4b8406131977ce4339cd0856a · clean (0 lines). No fixed IP/DSN/password/container/vector-collection; no runtime label/key hardcoding; SQL = SSOT; vector/NoSQL projection/search only. Next = GPT review.

Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/v0.5-pre-scale-foundation-design/dot-iu-cutter-v0.5-full-document-dry-run-at-volume-plan-2026-05-17.md