KB-540B

dot-iu-cutter v0.5 — Scale-Index & Volume Execution Roadmap (DESIGN ONLY) (2026-05-17)

Revision 1

Tags: dot-iu-cutter, v0.5, scale-index, dry-run-at-volume, staged-rollout, checkpoint-resume, roadmap, design-only, dieu44

Date: 2026-05-17
Phase: v0_5_constitution_hardtest_and_information_unit_factory_master_plan
Nature: DESIGN ONLY. No index DDL executes. No volume run executes.

Parent: dot-iu-cutter-v0.5-constitution-hardtest-master-plan-2026-05-17.md

1. Where the index package sits in sequencing

The pre-scale index DDL is already authored and GPT-PASSed, with the D-2 partial/full ruling decided, and execution explicitly deferred to a separate sovereign cycle. It is the immediate prerequisite for any volume work.

index_package_status:
  authoring: COMPLETE + GPT-PASS
  execution: NOT AUTHORIZED (deferred)
  package: knowledge/dev/laws/dieu44-trien-khai/v0.5-pre-scale-index-ddl-authoring/
approved_index_set (7 hot paths):
  full_btree:
    - idx_dbe_status_emitted_keyset  decision_backlog_entry(status,emitted_at,entry_id)
    - idx_me_source_doc_ref          manifest_envelope(source_doc_ref)
    - idx_rd_manifest_id             review_decision(manifest_id)
    - idx_vr_change_set_id           verify_result(change_set_id)
  partial_btree (WHERE col IS NOT NULL):
    - idx_ccs_dbe_id                 cut_change_set(decision_backlog_entry_id)
    - idx_dps_xref_cs                dot_pair_signature(cross_reference_change_set_id)
    - idx_dps_xref_vr                dot_pair_signature(cross_reference_verify_result_id)

Rationale recap: a single-IU trial is safe without indexes, but a full-document or bulk run is O(n²) without them, because the SWEEP cursor, the lineage lookups, and the cut-once guard each scan tables that grow with the number of IUs processed.
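To make the shape of the approved set concrete, the sketch below renders the seven entries as CREATE INDEX statements. It only builds strings and executes nothing. The table, column, and index names come from the list above; the `IndexSpec` type and `render_ddl` helper are hypothetical, and the authored DDL in the package may differ in detail (schema qualification, IF NOT EXISTS, and so on).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexSpec:
    # Hypothetical container; not part of the authored package.
    name: str
    table: str
    columns: tuple
    not_null_col: Optional[str] = None  # set only for partial indexes


APPROVED = (
    IndexSpec("idx_dbe_status_emitted_keyset", "decision_backlog_entry",
              ("status", "emitted_at", "entry_id")),
    IndexSpec("idx_me_source_doc_ref", "manifest_envelope", ("source_doc_ref",)),
    IndexSpec("idx_rd_manifest_id", "review_decision", ("manifest_id",)),
    IndexSpec("idx_vr_change_set_id", "verify_result", ("change_set_id",)),
    IndexSpec("idx_ccs_dbe_id", "cut_change_set",
              ("decision_backlog_entry_id",), "decision_backlog_entry_id"),
    IndexSpec("idx_dps_xref_cs", "dot_pair_signature",
              ("cross_reference_change_set_id",), "cross_reference_change_set_id"),
    IndexSpec("idx_dps_xref_vr", "dot_pair_signature",
              ("cross_reference_verify_result_id",), "cross_reference_verify_result_id"),
)


def render_ddl(spec: IndexSpec, concurrently: bool = False) -> str:
    """Build one CREATE INDEX statement as text; nothing is sent to a database."""
    conc = " CONCURRENTLY" if concurrently else ""
    cols = ", ".join(spec.columns)
    where = f" WHERE {spec.not_null_col} IS NOT NULL" if spec.not_null_col else ""
    return f"CREATE INDEX{conc} {spec.name} ON {spec.table} USING btree ({cols}){where};"


if __name__ == "__main__":
    for spec in APPROVED:
        print(render_ddl(spec))
```

Passing `concurrently=True` gives the form the roadmap reserves for production; the isolated dry-run would not need it.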

2. Roadmap (gated, none authorized now)

roadmap:
  Q1_index_dry_run:
    do: execute the 7 indexes ONLY in an isolated restored-schema DB
    verify: catalog-structural assertions (NOT pg_get_indexdef string equality)
    gate: GPT review of dry-run
  Q2_index_command_review_then_production:
    do: command-review package; CREATE INDEX CONCURRENTLY for production
    gate: sovereign GPT/User approval; post-run structural verification + backup/restore
  Q3_dry_run_at_volume:
    fixture: existing 3-doc corpus and/or synthetic doc (OD-V1)
    measure: EXPLAIN/timing with vs without indexes; invariant; resume; no dup cut
    gate: GPT review
  Q4_tier_normalization_if_needed:
    scope: DIEU_32 / DIEU_35 blank-tier — separate read-review-write cycle
  Q5_label_metadata_registry_design_cycle: schema design only
  Q6_source_registry_ingestion_design_cycle: source authority + parser profiles
  Q7_grammar_profile_validation: incomex-architecture-constitution-v4 (no cut)
  Q8_hien_phap_dry_run_at_volume: full Constitution in isolated env
  Q9_hien_phap_staged_production_small_batch: bounded + checkpoint/resume + sovereign

Cutting the Constitution (cắt hiến pháp) becomes available only after Q9, and then only per batch.
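The Q1 verification step asks for catalog-structural assertions rather than string equality on pg_get_indexdef output, whose rendering can differ in details such as schema qualification and parenthesization. A minimal sketch of that idea follows. The catalog query is an untested illustration with a driver-style `%s` placeholder (an assumption), and `structural_mismatches` is a hypothetical helper that compares fields instead of DDL text.

```python
# Illustrative catalog query: one row per index with structural fields only.
# Assumption: run through a driver that binds a list of index names to %s.
CATALOG_QUERY = """
SELECT ic.relname AS index_name,
       tc.relname AS table_name,
       am.amname  AS method,
       i.indisvalid,
       (i.indpred IS NOT NULL) AS is_partial,
       ARRAY(
         SELECT a.attname
         FROM unnest(i.indkey) WITH ORDINALITY AS k(attnum, ord)
         JOIN pg_attribute a
           ON a.attrelid = i.indrelid AND a.attnum = k.attnum
         ORDER BY k.ord
       ) AS columns
FROM pg_index i
JOIN pg_class ic ON ic.oid = i.indexrelid
JOIN pg_class tc ON tc.oid = i.indrelid
JOIN pg_am am    ON am.oid = ic.relam
WHERE ic.relname = ANY(%s);
"""


def structural_mismatches(expected: dict, actual: dict) -> list:
    """Compare an expected index shape with a catalog row, field by field."""
    problems = []
    for key in ("table_name", "method", "columns", "is_partial"):
        if expected[key] != actual.get(key):
            problems.append(
                f"{key}: expected {expected[key]!r}, got {actual.get(key)!r}"
            )
    if not actual.get("indisvalid", False):
        problems.append("index is not marked valid")
    return problems


if __name__ == "__main__":
    expected = {
        "table_name": "cut_change_set",
        "method": "btree",
        "columns": ["decision_backlog_entry_id"],
        "is_partial": True,
    }
    # Stand-in for a fetched catalog row; a full (non-partial) index was built.
    actual = dict(expected, is_partial=False, indisvalid=True)
    print(structural_mismatches(expected, actual))
```

Comparing table, access method, ordered column list, partial flag, and validity keeps the check stable across cosmetic differences in how the server prints the definition.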

3. Dry-run-at-volume plan (design)

dry_run_at_volume:
  environment: isolated DB restored from production schema backup (NO prod touch)
  fixtures (OD-V1):
    - F1: replay existing DIEU_28/32/35 corpus shape at multiplied volume
    - F2: synthetic N-IU document (parameterized N = 1e2,1e3,1e4)
    - F3: real Constitution (only at Q8, after grammar profile validated)
  assertions:
    - row-delta invariant: +15 per IU (per-IU manifest, OD-M1) OR documented revision
    - rerun delta-0: replaying a completed batch creates zero new rows (idempotency)
    - checkpoint/resume: kill mid-batch at IU k; resume; final state == uninterrupted
    - no duplicate cut: cut-once guard holds under concurrency + resume
    - DOT lane separation: cutter_exec / cutter_verify lanes never overlap at volume
    - performance: EXPLAIN ANALYZE on the 7 hot paths; verify index usage; record
      p50/p95 per stage at each N; flag any seq-scan on hot path as FAIL
  exit: GPT review; no production write implied by any of this
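The performance assertion above can be sketched as two small checks: walk an EXPLAIN (FORMAT JSON) plan tree looking for sequential scans, and summarize stage timings as p50/p95. The function names and the PASS/FAIL wrapper are hypothetical; the plan keys used ("Node Type", "Relation Name", "Plans") are the ones PostgreSQL emits in JSON plans. Real thresholds remain open under OD-V3.

```python
from statistics import quantiles


def seq_scans(plan_node: dict) -> list:
    """Return the relation names of every Seq Scan node in a plan tree."""
    found = []
    if plan_node.get("Node Type") == "Seq Scan":
        found.append(plan_node.get("Relation Name", "?"))
    for child in plan_node.get("Plans", []):
        found.extend(seq_scans(child))
    return found


def hot_path_verdict(plan_node: dict) -> str:
    """Any sequential scan on a hot path counts as FAIL, per the plan above."""
    return "FAIL" if seq_scans(plan_node) else "PASS"


def p50_p95(samples_ms: list) -> tuple:
    """Median and 95th percentile of timings; needs at least two samples."""
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]


if __name__ == "__main__":
    # Hand-written plan fragment for illustration, not captured output.
    plan = {
        "Node Type": "Nested Loop",
        "Plans": [
            {"Node Type": "Index Scan", "Relation Name": "review_decision",
             "Index Name": "idx_rd_manifest_id"},
            {"Node Type": "Seq Scan", "Relation Name": "verify_result"},
        ],
    }
    print(hot_path_verdict(plan), seq_scans(plan))
    print(p50_p95([float(ms) for ms in range(1, 101)]))
```

Running the same checks at each N in the fixture ladder would give one verdict and one p50/p95 pair per stage per N.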

4. Checkpoint / resume model (design)

checkpoint:
  unit: per-IU (atomic per-phase txn already validated in v0.4 RERUN4)
  cursor: decision_backlog_entry(status, emitted_at, entry_id) keyset (indexed in Q2)
  resume: on restart, SWEEP continues from last committed entry; completed IUs
          re-evaluated as no-op (idempotent) -> delta-0
  batch_bound: max_iu_per_batch is a config parameter, NOT a hardcoded constant
  failure: a failing IU is parked (forward-compensation), batch continues per policy
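The model above can be exercised with a small in-memory simulation: an append-only ledger with a cut-once guard, a sweep in keyset order, a simulated kill at IU k, and a resume. This is a sketch under stated assumptions, not the cutter's implementation: `Ledger`, `sweep`, and `kill_at` are hypothetical names, the backlog is assumed already filtered to one status, and the 15 rows per IU mirror the row-delta invariant from the dry-run plan.

```python
ROWS_PER_IU = 15  # row-delta invariant (per-IU manifest, OD-M1)


class Ledger:
    """In-memory stand-in for the governance tables."""

    def __init__(self):
        self.rows = []        # append-only rows
        self.cut_ids = set()  # cut-once guard

    def cut_iu(self, entry_id):
        """Cut one IU as a single atomic step; a repeat is a no-op."""
        if entry_id in self.cut_ids:
            return 0
        self.rows.extend((entry_id, n) for n in range(ROWS_PER_IU))
        self.cut_ids.add(entry_id)
        return ROWS_PER_IU


def sweep(ledger, backlog, max_iu_per_batch, kill_at=None):
    """Walk (emitted_at, entry_id) in keyset order; return rows added."""
    new_cuts = 0
    added = 0
    for emitted_at, entry_id in sorted(backlog):
        if new_cuts == max_iu_per_batch:
            break
        if kill_at is not None and new_cuts == kill_at:
            raise RuntimeError(f"simulated kill before IU {entry_id}")
        delta = ledger.cut_iu(entry_id)
        if delta:
            new_cuts += 1
            added += delta
    return added


if __name__ == "__main__":
    backlog = [(t, f"e{t:02d}") for t in range(10)]
    reference = Ledger()
    sweep(reference, backlog, max_iu_per_batch=10)

    interrupted = Ledger()
    try:
        sweep(interrupted, backlog, max_iu_per_batch=10, kill_at=4)
    except RuntimeError:
        pass
    sweep(interrupted, backlog, max_iu_per_batch=10)          # resume
    print(interrupted.rows == reference.rows)                 # same final state
    print(sweep(interrupted, backlog, max_iu_per_batch=10))   # rerun delta
```

Because `max_iu_per_batch` is passed in rather than fixed, the same sweep serves both a small first batch and later, wider ones.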

5. Staged production rollout (design)

staged_rollout:
  precondition: Q1..Q3 PASS + Q8 PASS + sovereign approval
  batching:
    - explicit target range per batch (e.g. Điều (Article) 0..5), recorded
    - small first batch, widen only after each batch GPT-reviewed
  per_batch: backup -> bounded execute -> verify -> review -> next
  rollback_policy:
    - NO document-wide delete rollback
    - per-IU forward-compensation only (append corrective ledger rows)
    - a bad IU is compensated forward, never the whole document reverted
  abort: any invariant breach halts the batch; no auto-continue
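A minimal sketch of the rollback and abort rules above, assuming an append-only ledger in which the latest row for an IU is the effective one (that resolution rule is an assumption for illustration). `AppendOnlyLedger`, `run_batch`, and `compensate` are hypothetical names. The point it illustrates: a breach halts the batch with no auto-continue, and a bad IU is corrected by appending, never by deleting.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyLedger:
    rows: list = field(default_factory=list)

    def append(self, iu_id, kind, payload):
        self.rows.append((iu_id, kind, payload))

    def effective(self, iu_id):
        """Latest row for the IU wins (assumed rule); nothing is deleted."""
        for row in reversed(self.rows):
            if row[0] == iu_id:
                return row
        return None


def run_batch(ledger, iu_ids, execute, invariant_ok):
    """Bounded execute with a check after each IU; halt on the first breach."""
    for iu_id in iu_ids:
        ledger.append(iu_id, "cut", execute(iu_id))
        if not invariant_ok(ledger):
            return ("HALTED", iu_id)      # no auto-continue
    return ("AWAITING_REVIEW", None)      # next batch only after review


def compensate(ledger, iu_id, corrected):
    """Forward-compensation: append a corrective row for one IU."""
    ledger.append(iu_id, "compensation", corrected)


if __name__ == "__main__":
    ledger = AppendOnlyLedger()
    print(run_batch(ledger, [0, 1, 2], lambda i: f"v{i}", lambda l: True))
    compensate(ledger, 1, "v1-fixed")
    print(ledger.effective(1), len(ledger.rows))
```

The original row for IU 1 stays in the ledger after compensation, which is what keeps the history auditable without a document-wide revert.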

6. Volume estimate caveat

The handoff estimated 300–500 leaf IUs (5000–7500 governance rows), assuming a clause/point grammar. The fixture is not clause/point shaped (canonicalization doc §6), so the estimate is unreliable until OD-G2 (leaf granularity) is ruled. Do not size batches from the handoff number.

7. Open decisions

open_decisions:
  OD-V1: dry-run-at-volume fixture choice (F1/F2/F3 mix)
  OD-V2: target N ladder for synthetic scaling (1e2 -> 1e6)
  OD-V3: per-stage performance pass/fail thresholds
  OD-I1: index execution route (continue Q1 now vs hold for master-plan ratification)
  OD-M1: per-IU vs document-level manifest (drives invariant)

8. Do not run yet

No index DDL execution, no dry-run, no volume run, no production batch, no checkpoint write, no schema migration, no code change. Design only. The forbidden list is the one in master plan §10.

9. Git

git: { branch: main, HEAD: e93424b5ff7fa5e4b8406131977ce4339cd0856a,
       status_short_iu_cutter: clean, code_changed: false, commit_made: false }