dot-iu-cutter v0.4 — LedgerWriter Schema-Binding DESIGN Report (2026-05-17) [r2]
Date: 2026-05-17 · Phase: v0.4 LedgerWriter Schema-Binding DESIGN ONLY. Nothing executed, no code, no commit, no provisioning, no dry-run, no production connection beyond read-only catalog/schema inspection, no secret/.env read.
Revision 2 (2026-05-17). GPT review of the r1 package: schema-binding narrow blocker = PASS-candidate; code authoring NOT allowed yet; r2/addendum required because r1 solved schema binding but did not yet cover the User scale / automation / non-hardcode / SQL-NoSQL-hybrid mandate. r2 adds two docs (`…-scale-automation-nonhardcode-review-…`, `…-sql-nosql-hybrid-information-unit-strategy-…`) and revises the final verdict (§3/§5 below). No design decision from r1 is reversed; the binding is unchanged and now shown to be scale/automation forward-compatible with an expanded (still column-level) code scope and additive-index-only scale provisioning.
1. Deliverables (9 docs total, knowledge/dev/laws/dieu44-trien-khai/v0.4-schema-binding/)
| Doc | Rev | Content |
|---|---|---|
| `…-ledgerwriter-schema-gap-analysis-…` | r1 | Root cause; 12-writer MATCH/MISMATCH (3/9); representability ⇒ no migration; lineage gap |
| `…-ledgerwriter-per-writer-mapping-design-…` | r1 | Per-writer A/B reconciliation; constants + SB-DEC-1..6 |
| `…-state-history-and-sweep-mapping-design-…` | r1 | append_history/append_sweep_log/CAS deep-dive; count invariance |
| `…-mark-review-cut-verify-schema-binding-plan-…` | r1 | Per-phase FK order; matrix == r3 |
| `…-pg-backed-test-revision-plan-…` | r1 | Schema-contract tests; r3 preserved (no r4); optional G-26 |
| `…-schema-binding-risk-and-code-change-plan-…` | r1 | Bounded code surface; no migration; risk STANDARD |
| `…-scale-automation-nonhardcode-review-…` | r1 (new) | Scale A, non-hardcode B, hybrid C, IU-centric D, automation E |
| `…-sql-nosql-hybrid-information-unit-strategy-…` | r1 (new) | SQL SSOT, JSONB normalization queue, vector=acceleration, IU lifecycle map |
| `…-ledgerwriter-schema-binding-report-…` | r2 (this) | Verdict incl. scale/automation/non-hardcode/hybrid |
2. Coverage of GPT's required points (r1 + r2)
- A/B/C (r1): per-writer reconciliation (12 writers), code-change flags, SQL ops, principal, txn/rollback, invariants, no migration.
- D (r1): schema-contract tests; r3 row-count baseline preserved.
- Scale A · Non-hardcode B · Hybrid C · IU-centric D · Automation E (r2): in the two new docs: write amplification (~10+U+A rows/IU; 1M IUs ⇒ 15M–1B+), fastest growers (`manifest_unit_block`, `cut_change_set_affected_row`, `decision_backlog_history`), pre-scale index list, deterministic keyset sweep cursor, bounded retry/idempotency, archival boundary; every mapping constant classified (protocol / config / schema-contract value / derived) with reject-list cleared (no IP/DSN/password/batch/collection literals); SQL=SSOT for identity/lifecycle/governance/audit/review/cut/verify/idempotency; JSONB normalization rule + queue (idempotency key first); vector store = rebuildable acceleration only; IU-centric writer map; automation readiness (resumable status, CAS concurrency guard, no manual runtime SQL, redacted logs, deferred queue contract).
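The write-amplification range quoted above can be checked with simple arithmetic. The sketch below assumes that `U` and `A` are per-IU row counts added on top of a fixed overhead of about 10 rows; this report does not define them (the scale review does), and the specific `(U, A)` pairs are hypothetical values chosen only to reproduce the two ends of the quoted range.

```python
def rows_per_iu(u: int, a: int, fixed: int = 10) -> int:
    """Approximate ledger rows written per IU: fixed overhead plus U and A."""
    return fixed + u + a


def total_rows(iu_count: int, u: int, a: int) -> int:
    """Total rows for iu_count IUs at the given per-IU U and A."""
    return iu_count * rows_per_iu(u, a)


IU_COUNT = 1_000_000

# Hypothetical (U, A) pairs, not measured values.
low = total_rows(IU_COUNT, u=3, a=2)        # 15 rows per IU
high = total_rows(IU_COUNT, u=500, a=490)   # 1000 rows per IU

print(f"low estimate:  {low:,} rows")
print(f"high estimate: {high:,} rows")
```

Under these assumptions the lower bound corresponds to U + A of about 5 per IU and the upper bound to U + A of about 990 per IU, which is why the tables whose row counts scale with U and A are listed as the fastest growers.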
3. Final verdict & answers (revised by r2)
- Is the code patch still sufficient? Yes for correctness, but its scope EXPANDS (still column-level binding: no flow/principal/isolation/state-machine change, `db_adapter.py` untouched). The next code-authoring cycle must additionally: (1) centralize all binding vocabulary/sentinels in one module (non-hardcode); (2) make the `mark()` idempotency lookup server-side filtered (today it is an O(N) full scan, a scale blocker); (3) make the sweep cursor a config-driven deterministic keyset scan; (4) add schema-contract tests covering columns + vocabulary.
- Does schema migration remain unnecessary? For the PG-backed dry-run: YES, no structural migration. For production scale: an additive, index-only DDL cycle is required (FK/lookup indexes per scale-review A.3, plus one expression/generated index for the MARK idempotency key). That cycle is index-only, uses `CREATE INDEX CONCURRENTLY`, makes no column/constraint/structure change to existing `cutter_governance` semantics, and is not a prerequisite for the single-IU dry-run.
- Are indexes required before scale? YES (before scale, not before dry-run): `decision_backlog_history(entry_id[,changed_at])`, `cut_change_set(decision_backlog_entry_id)`, `cut_change_set_affected_row(change_set_id)`, `verify_result(change_set_id,prior_verify_result_id)`, `review_decision(manifest_id,prior_review_decision_id)`, `manifest_envelope(source_doc_ref)`, `dot_pair_signature(prior_signature_id)`, partial `decision_backlog_entry(status)`, and the idempotency-key index.
- Should JSONB stay JSONB or be normalized? Stay JSONB for v0.4/dry-run. Normalize on the documented rule (queried-at-scale / FK-needed / decision-driving / aggregated). Priority-1 graduation: `payload.idempotency_key` → indexed scalar before scale. Others remain JSONB until a query need is proven.
- Can the PG-backed dry-run resume after the code patch? YES: after the (expanded) code cycle PASSes, resume with command-review r1 + verification-plan r3 unchanged (single-IU canonical, count-invariant; no index/migration needed for the dry-run itself).
- Exact next code-authoring scope:
  - `cutter_agent/ledger.py`: rebind the 9 MISMATCH row-builders.
  - new `cutter_agent/schema_binding.py` (or equivalent): centralized vocabulary/sentinel constants + lane/kind maps + deterministic key builders, all config-/version-derived.
  - `cutter_agent/phases.py`: thread SB-DEC-5/6 args, replace InMemory `_source_entry` with the SB-DEC-1 real-schema lineage join, make the `mark()` idempotency lookup server-side filtered, config-driven keyset sweep cursor.
  - `tests/`: static schema-contract fixture + per-writer contract tests + vocabulary-registry test + targeted InMemory-fixture updates.
  - No change to `db_adapter.py`, state-machine, idempotency, signing, signal, cli. No `cutter_governance` structural migration. Indexes = separate later GPT-gated index-only DDL cycle (pre-scale, not pre-dry-run).
- Git SSOT proof: Branch `main`; `/opt/incomex/dot` HEAD = `56d3732cb74d07546c938242180a434ed1067a9a` (accepted, unchanged); `git status --short -- iu-cutter` = empty. No code change this phase ⇒ no commit needed. Tests 92/92 at `56d3732`.
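The difference between the current O(N) idempotency lookup and the server-side filtered lookup named in the scope above can be illustrated as follows. This is a minimal sketch, not the project's code: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the `ledger_row` table, its columns, the index name, and both helper functions are hypothetical names introduced for illustration only.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ledger_row ("
    " id INTEGER PRIMARY KEY,"
    " idempotency_key TEXT NOT NULL,"   # scalar copy of payload.idempotency_key
    " payload TEXT NOT NULL)"           # JSON document, stored as text here
)
# Hypothetical index name; the real index belongs to the later DDL cycle.
conn.execute("CREATE INDEX ledger_row_idem_idx ON ledger_row (idempotency_key)")

for n in range(1000):
    key = f"mark:{n}"
    conn.execute(
        "INSERT INTO ledger_row (idempotency_key, payload) VALUES (?, ?)",
        (key, json.dumps({"idempotency_key": key})),
    )


def find_full_scan(db, key):
    """O(N): fetch every row to the client and filter in Python."""
    for row_id, payload in db.execute("SELECT id, payload FROM ledger_row"):
        if json.loads(payload)["idempotency_key"] == key:
            return row_id
    return None


def find_server_side(db, key):
    """The database filters on the indexed scalar; at most one row returns."""
    row = db.execute(
        "SELECT id FROM ledger_row WHERE idempotency_key = ? LIMIT 1", (key,)
    ).fetchone()
    return row[0] if row else None


# Both strategies agree on the answer; only the cost differs.
assert find_full_scan(conn, "mark:742") == find_server_side(conn, "mark:742")
```

Both functions return the same row id, but the first transfers and parses every payload while the second lets the database resolve the key through the index. That is the motivation for graduating the idempotency key to an indexed scalar before scale.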
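A deterministic keyset sweep, as required for the sweep cursor in the scope above, pages on a total ordering instead of an offset. The sketch below is one possible shape under stated assumptions: `sqlite3` stands in for PostgreSQL, the `backlog` table and its `(changed_at, id)` ordering are hypothetical, and `PAGE_SIZE` is a literal here only for the demo (the design calls for it to come from configuration).

```python
import sqlite3
from typing import Iterator, List, Optional, Tuple

Row = Tuple[str, int]   # (changed_at, id)
PAGE_SIZE = 3           # demo value; would be config-driven in practice

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backlog (id INTEGER PRIMARY KEY, changed_at TEXT NOT NULL)")
stamps = ["2026-05-01", "2026-05-01", "2026-05-02", "2026-05-02",
          "2026-05-02", "2026-05-03", "2026-05-04"]
conn.executemany("INSERT INTO backlog (changed_at) VALUES (?)", [(s,) for s in stamps])


def sweep_pages(db, page_size: int, start: Optional[Row] = None) -> Iterator[List[Row]]:
    """Yield pages in (changed_at, id) order, resuming after `start` if given."""
    cursor = start
    while True:
        if cursor is None:
            rows = db.execute(
                "SELECT changed_at, id FROM backlog"
                " ORDER BY changed_at, id LIMIT ?",
                (page_size,),
            ).fetchall()
        else:
            # Strictly after the last row seen; the id breaks timestamp ties.
            rows = db.execute(
                "SELECT changed_at, id FROM backlog"
                " WHERE changed_at > ? OR (changed_at = ? AND id > ?)"
                " ORDER BY changed_at, id LIMIT ?",
                (cursor[0], cursor[0], cursor[1], page_size),
            ).fetchall()
        if not rows:
            return
        yield rows
        cursor = rows[-1]   # persisting this value makes the sweep resumable


pages = list(sweep_pages(conn, PAGE_SIZE))
seen = [row for page in pages for row in page]
```

Because the cursor is the last `(changed_at, id)` pair rather than an offset, rows that share a timestamp are neither skipped nor repeated across pages, and a sweep interrupted mid-way can restart from the stored pair.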
4. Boundaries honoured
No code change · no commit · no dry-run · no env provision · no production connection except read-only catalog/information_schema/PK-FK inspection · no production row read beyond schema metadata · no secret/.env read · no deploy · no self-advance. Read-only grounding: full 12-table DDL + NOT-NULL/defaults + PK/FK/UNIQUE; accepted code at 56d3732. PROD system_identifier 7611578671664259111 and prod container untouched; 3 protected prior dry-run envs untouched.
5. Next gate
GPT review of this 9-doc package (r1 + r2 addendum). On PASS, open the LedgerWriter schema-binding code-authoring cycle (separate, GPT-gated, scope = §3 "exact next code-authoring scope"). The pre-scale index-only DDL cycle and any JSONB normalization are further separate GPT-gated cycles, not prerequisites for the dry-run. PG-backed dry-run remains BLOCKED until the code cycle PASSes; then resume with command-review r1 + verification-plan r3 unchanged. No self-advance.