KB-709E

IU Core 1k+ — 02 Vector boundary enforcement + sync hardening

Revision 1
Tags: iu-core, dieu44, v0.6, 1k-plus, vector-boundary, qdrant, migration-020, sandbox-190
Date: 2026-05-23


1. The gap closed

1k built the vector-sync substrate but did not enforce the binding per-IU vector-boundary rule at the database level. This macro closes that gap with migration 020 (6 boundary columns, 1 CHECK constraint, 1 v2 record function) plus boundary helpers in code.

2. Migration 020 — boundary CHECK + v2 record function (durable, applied)

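On iu_vector_sync_point, the migration adds six columns: unit_id (required for source_kind='iu'), parent_piece_id, chunk_index and chunk_count (required for per-IU rows; 0 <= chunk_index < chunk_count), summary_marker (required to be true for collection/corpus rows), and axis_refs (jsonb). The CHECK constraint iu_vector_sync_point_boundary_chk is created NOT VALID, which grandfathers pre-020 rows: PostgreSQL does not validate rows that already exist, but it does enforce the constraint on later inserts and updates. Per-IU rows require unit_id and the chunk fields and must not carry summary_marker; summary rows require summary_marker=true and must not carry unit_id. The new fn_iu_vector_sync_record_v2(...) enforces the same rule at the function layer with clear errors (ERRCODE check_violation). V1 is preserved: sandbox/170 and the prior 1k tests pass unmodified.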
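The per-row rule above can be mirrored in a few lines of Python to make the two row shapes explicit. This is a minimal sketch, not the migration's actual CHECK expression or the body of fn_iu_vector_sync_record_v2: the function name boundary_violation, the error strings, and the literal source_kind values 'collection' and 'corpus' are assumptions for illustration, and legacy v1 rows are out of scope.

```python
from typing import Optional

# Assumed source_kind values; only 'iu' is confirmed by the text above.
PER_IU_KINDS = {"iu"}
SUMMARY_KINDS = {"collection", "corpus"}


def boundary_violation(row: dict) -> Optional[str]:
    """Return a reason string if the row breaks the boundary rule, else None."""
    kind = row.get("source_kind")
    unit_id = row.get("unit_id")
    marker = row.get("summary_marker")

    if kind in PER_IU_KINDS:
        # Per-IU rows: unit_id and chunk fields required, no summary marker.
        if unit_id is None:
            return "per-IU row requires unit_id"
        if marker:
            return "per-IU row must not carry summary_marker"
        idx, count = row.get("chunk_index"), row.get("chunk_count")
        if idx is None or count is None:
            return "per-IU row requires chunk_index and chunk_count"
        if not 0 <= idx < count:
            return "chunk_index out of range"
        return None

    if kind in SUMMARY_KINDS:
        # Summary rows: marker must be true, and no unit_id may be attached.
        if marker is not True:
            return "summary row requires summary_marker=true"
        if unit_id is not None:
            return "summary row must not carry unit_id"
        return None

    return f"unknown source_kind: {kind!r}"


ok_row = {"source_kind": "iu", "unit_id": "u1", "chunk_index": 0, "chunk_count": 2}
bad_row = {"source_kind": "collection", "unit_id": "u1", "summary_marker": True}
print(boundary_violation(ok_row))
print(boundary_violation(bad_row))
```

A predicate like this is only a readable restatement of the rule; the database-level CHECK and the v2 function remain the enforcement points.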

3. vector_sync.py boundary helpers (code-level)

chunk_iu_body(text, max_chars=1800) performs a newline-aware greedy fill inside one IU only; a chunk never spans two IUs. build_iu_point_set(unit_id, canonical_address, body, ...) is the only way to mint per-IU points: all chunks share the same unit_id, and each chunk's digest is the sha256 of its own chunk text, so a small edit reindexes one chunk. assert_boundary(points) runs the rule across a batch and rejects per-IU points missing unit_id, per-IU points marked as summary, summary points missing the marker, summary points with a unit_id, duplicate (unit_id, chunk_index) pairs, out-of-range chunk_index values, and disagreeing chunk_count values. record_plan routes per-IU and summary-marked rows through v2 and legacy rows through v1; the IU body never enters an SQL string.
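The helpers described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code in vector_sync.py: the function names and the max_chars default come from the text, but the dict layout of a point, the hard split of over-long lines, and the use of ValueError are assumptions, and this assert_boundary covers only the per-IU batch checks (range, duplicates, chunk_count agreement).

```python
import hashlib


def chunk_iu_body(text: str, max_chars: int = 1800) -> list[str]:
    """Greedy, newline-aware split of ONE IU body into chunks <= max_chars."""
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Assumption: a single line longer than the budget is hard-split.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when the next line would overflow the current one.
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def build_iu_point_set(unit_id: str, canonical_address: str, body: str,
                       max_chars: int = 1800) -> list[dict]:
    """Mint one point per chunk; every point shares the same unit_id."""
    chunks = chunk_iu_body(body, max_chars)
    return [
        {
            "source_kind": "iu",
            "unit_id": unit_id,
            "canonical_address": canonical_address,
            "chunk_index": i,
            "chunk_count": len(chunks),
            # Digest covers only this chunk's text, not the whole body.
            "digest": hashlib.sha256(chunk.encode("utf-8")).hexdigest(),
        }
        for i, chunk in enumerate(chunks)
    ]


def assert_boundary(points: list[dict]) -> None:
    """Batch-level checks for per-IU points; raises ValueError on violation."""
    seen: set[tuple[str, int]] = set()
    counts: dict[str, int] = {}
    for p in points:
        uid, idx, count = p["unit_id"], p["chunk_index"], p["chunk_count"]
        if not 0 <= idx < count:
            raise ValueError(f"chunk_index {idx} out of range for {uid}")
        if (uid, idx) in seen:
            raise ValueError(f"duplicate (unit_id, chunk_index): {(uid, idx)}")
        if counts.setdefault(uid, count) != count:
            raise ValueError(f"disagreeing chunk_count for {uid}")
        seen.add((uid, idx))


# Hypothetical unit_id and address; a tiny max_chars forces one line per chunk.
before = build_iu_point_set("u1", "example/address", "alpha\nbeta\ngamma\n", max_chars=6)
after = build_iu_point_set("u1", "example/address", "alpha\nbeTa\ngamma\n", max_chars=6)
assert_boundary(after)
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a["digest"] != b["digest"]]
print(changed)
```

In this sketch the one-chunk property holds when the edit leaves chunk boundaries where they were; an edit that changes line lengths enough to shift the greedy fill can change the digests of later chunks as well.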

4. sandbox/190 — vector boundary probe — 8/8 PASS

T1 v1 grandfather works (regress); T2 v2 per-IU success; T3 missing unit_id REFUSED; T4 per-IU summary_marker=true REFUSED; T5 collection without summary_marker REFUSED; T6 collection with unit_id REFUSED; T7 chunk_index out of range REFUSED; T8 unknown source_kind REFUSED. Every refusal (T3 through T8) raises check_violation at the DB layer.

5. sandbox/170 — vector-sync probe (regress) — 7/7 PASS

The 3 pre-020 dryrun rows in production (iu-tree._collections.document/file/_corpus) keep working as grandfathered rows; because the CHECK is NOT VALID, existing rows are unaffected.

6. Qdrant external apply — exact blocker recorded

The container is reachable at qdrant:6333. Secrets are present (QDRANT_LOCAL_API_KEY, OPENAI_API_KEY); no value was read or logged. Remaining external-only work: a dedicated IU Core collection name (separate from the shared production_documents), embedder wiring, and durable opening of iu_core.vector_sync_enabled for a bounded reindex.

7. Five-layer impact

PG: additive and reversible (6 cols + CHECK + 1 fn + 1 view). Directus: none (the view is Directus-ready but unsubscribed). Nuxt: none. AgentData: 7 reports. Qdrant: connector boundary tightened, no external write.

knowledge/dev/laws/dieu44-trien-khai/v0.6-iu-core-1k-plus-vector-ui-assembly-acceptance-open-goal/02-vector-boundary-and-sync.md