KB-6F7A

Branch B — Vector / No-Cross-IU-Vector Audit (2026-05-29)

Revision 1
Tags: iu, vector, qdrant, no-cross-iu, branch-b, dieu39, 2026-05-29

Doc 02 (2026-05-29)

Verdict: PARTIAL-PASS. The no-cross-IU-vector boundary is structurally proven at the PG control plane, and the policy gate is intact (iu_core.vector_sync_enabled = false, never-flip intact). However, the vector subsystem is not greenfield: a historical indexing campaign materialized 149 per-IU points, the PG ledger (149 indexed points) and the Qdrant collection registry note ("0 points") disagree on the actual point count, and the reindex did not run through a DOT command, which the design requires. Net: isolation is safe; the gaps are observability/reconciliation and DOT coverage.

1. Policy gate (PASS)

  • iu_core.vector_sync_enabled = false — declared NEVER-FLIP; fn_iu_gate_verify_closed() returns never_flip_intact=true, is_safe=true.
  • fn_iu_vector_sync_enabled() is a STABLE reader of that dot_config key (returns false).
  • Only ONE vector key in dot_config (iu_core.vector_sync_enabled=false); no hidden enable flag.
  • Design corroboration: REQ §4 #12, DESIGN §14, REV2-03 §3.4, REV2-00 inv.20 all mandate "1 IU = ≥1 point, never cross-IU merge" and "design assumes off until governance flips."
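
The "only one vector key, and it is false" check above can be sketched outside the database as follows. This is a minimal illustration, not the audited implementation: it assumes dot_config has been exported to a plain key/value mapping with string values, and the function name gate_is_closed is hypothetical. The key pattern is the one used in the verification query in section 5.

```python
import re

# Same pattern as the dot_config verification query (case-insensitive).
VECTOR_KEY_PATTERN = re.compile(r"vector|qdrant|embed|sync", re.IGNORECASE)

# The single key the audit expects to find (value stored as text: an assumption).
EXPECTED = {"iu_core.vector_sync_enabled": "false"}


def gate_is_closed(dot_config: dict[str, str]) -> bool:
    """Return True only if exactly one vector-related key exists and it is 'false'.

    Any extra matching key (a hidden enable flag) or a flipped value fails the check.
    """
    matches = {k: v for k, v in dot_config.items() if VECTOR_KEY_PATTERN.search(k)}
    return matches == EXPECTED
```

A second matching key, even one set to false, makes the check fail, which mirrors the "no hidden enable flag" requirement.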

2. Vector registry surface (machinery EXISTS — not empty)

| Object | Type | Rows | Note |
|---|---|---|---|
| iu_vector_sync_point | table | 152 | per-IU sync ledger (point_key, unit_id, parent_piece_id, chunk_index, chunk_count, content_digest, indexed_digest, sync_status, point_count, axis_refs) |
| iu_qdrant_collection_registry | table | 1 | iu_core_iu_chunks, status=active, dim=1536 Cosine, openai:text-embedding-3-small, purpose=iu_core_per_iu_chunks, notes "0 points" (2026-05-23) |
| collection_registry_vector_policy | table | 2 | per-collection eligibility flags |
| v_iu_vector_sync_status | view | 152 | status projection |
| v_iu_qdrant_collection_active | view | 1 | the active collection |
| v_collection_vector_eligibility | view | 168 | = collection_registry count |

Functions: fn_iu_vector_sync_enabled, fn_iu_vector_sync_record, fn_iu_vector_sync_record_v2, fn_iu_qdrant_collection_register, fn_iu_qdrant_collection_retire.

3. No-cross-IU-pollution — PROVEN (control plane)

Three independent structural proofs:

  1. Write-time guard (defence in depth): fn_iu_vector_sync_record_v2 raises check_violation if source_kind='iu' and any of unit_id / chunk_index / chunk_count is NULL — "per-IU vector point requires unit_id + chunk_index + chunk_count". A backing CHECK constraint mirrors this. ⇒ An iu point cannot exist without exactly one owning unit_id.
  2. Distinctness query (live): count(point_key)=152, count(DISTINCT point_key)=152, and point_keys_spanning_multi_iu = 0 — no point_key maps to more than one unit_id.
  3. Per-IU chunk model: 149 iu-kind rows over 141 distinct units (multi-chunk IUs allowed), chunk_index < chunk_count holds for all, unit_id IS NULL count = 0 for iu rows. The 3 non-IU rows are dryrun (collection×2, corpus×1, all unit_id NULL, never indexed).

The "1 IU = ≥1 point, no mix" law (Đ39 / DESIGN §14) is enforced both by code and by schema. Cut/reconstruct/compose cannot cross IU vector boundaries because the sync function refuses any iu point without a single unit_id, and there is no code path that writes a multi-IU point.
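
The write-time guard and the distinctness proof can be mirrored in a few lines of Python for readers who want to replay the logic on an exported ledger. This is a sketch under stated assumptions: SyncPoint is a hypothetical stand-in for an iu_vector_sync_point row, the point_key strings are invented for illustration, and the 0-based lower bound on chunk_index is an assumption (the audit only states chunk_index < chunk_count).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyncPoint:
    """Hypothetical in-memory stand-in for one iu_vector_sync_point row."""
    point_key: str
    source_kind: str                  # 'iu', 'collection', or 'corpus'
    unit_id: Optional[str] = None
    chunk_index: Optional[int] = None
    chunk_count: Optional[int] = None


def check_write_guard(p: SyncPoint) -> None:
    """Mimic the guard: an 'iu' point must carry unit_id, chunk_index, chunk_count."""
    if p.source_kind != "iu":
        return  # non-IU rows (dryrun collection/corpus) are not subject to the guard
    if p.unit_id is None or p.chunk_index is None or p.chunk_count is None:
        raise ValueError(
            "per-IU vector point requires unit_id + chunk_index + chunk_count"
        )
    # Assumption: chunk indices are 0-based.
    if not 0 <= p.chunk_index < p.chunk_count:
        raise ValueError("chunk_index must satisfy 0 <= chunk_index < chunk_count")


def point_keys_spanning_multi_iu(points: list[SyncPoint]) -> list[str]:
    """Return point_keys that map to more than one unit_id (expected: empty)."""
    owners: dict[str, set[str]] = defaultdict(set)
    for p in points:
        if p.unit_id is not None:
            owners[p.point_key].add(p.unit_id)
    return sorted(key for key, units in owners.items() if len(units) > 1)


# Illustrative ledger: one two-chunk IU plus one dryrun collection row.
ledger = [
    SyncPoint("iu:u1:0", "iu", "u1", 0, 2),
    SyncPoint("iu:u1:1", "iu", "u1", 1, 2),
    SyncPoint("dry:c1", "collection"),
]
for row in ledger:
    check_write_guard(row)
print(point_keys_spanning_multi_iu(ledger))  # prints []
```

An empty result corresponds to point_keys_spanning_multi_iu = 0 in the live query.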

4. The gaps (why PARTIAL, not full PASS)

| # | Finding | Evidence | Impact |
|---|---|---|---|
| B-G1 | Not greenfield: 149 points historically "indexed" | sync_status='indexed' on 149 rows, indexed_digest set, total_points=149; window 2026-05-22→24; actors iu_core_2400x_full_reindex, iu-core-9000x/qdrant_onboarding, vector_sync_cli | A prior reindex campaign ran before the pilot froze vector sync. The "vector is untouched" mental model is wrong; the gate froze a system that had already been indexed once. |
| B-G2 | PG↔Qdrant point-count drift, unverifiable read-only | PG ledger: 149 indexed points. Qdrant registry note: "0 points (status=green)" dated 2026-05-23 (one day before last PG index 2026-05-24). MCP exposes no Qdrant query tool. | Cannot confirm whether Qdrant actually holds 149 points, 0, or some other number. Đ31 ("every divergence is an error") is unsatisfiable here without a reconcile path. |
| B-G3 | Reindex bypassed DOT | DESIGN §14: "reindex via DOT command." Live: indexing actors are CLI (vector_sync_cli) and campaign tags, not a dot_iu_* command (no vector command in dot_iu_command_catalog). | Reindex is not audited through the DOT run-log (dot_iu_command_run); violates NT3/Đ35 single-gate principle for a mutating-ish op. |
| B-G4 | No reconcile DOT / health signal | No function matching reconcile touches vector; fn_reconcile_* cover labels/FK/rules only. | No Tier-A check continuously proves vector_sync_enabled ⟺ Qdrant state ⟺ PG ledger. |
| B-G5 | fn_iu_vector_sync_record_v2 is not gate-guarded | Function head validates inputs + per-IU boundary but does not call fn_iu_vector_sync_enabled(); it records a ledger row regardless of the gate (the gate governs the external CLI push, not the PG record). | A ledger write can occur while the gate is "off"; acceptable for a registry, but should be explicit so the gate's meaning is unambiguous. |
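
Finding B-G5 leaves two possible gate semantics for ledger writes: record with a warning, or refuse. The sketch below contrasts them so the choice can be discussed concretely. It is illustrative only: GatePolicy and record_sync_row are hypothetical names, the ledger is a plain list, and nothing here reflects how fn_iu_vector_sync_record_v2 is actually written.

```python
import warnings
from enum import Enum


class GatePolicy(Enum):
    """The two candidate behaviours when the gate is off (hypothetical names)."""
    RECORD_WITH_WARNING = "record_with_warning"
    REFUSE = "refuse"


def record_sync_row(ledger: list[dict], row: dict, *,
                    gate_enabled: bool, policy: GatePolicy) -> bool:
    """Append row to the ledger according to the chosen gate policy.

    Returns True if the row was written, False if it was refused.
    """
    if not gate_enabled:
        if policy is GatePolicy.REFUSE:
            return False  # gate off + refuse: ledger stays untouched
        # gate off + record-with-warning: write, but make the event visible
        warnings.warn("ledger write while vector_sync_enabled=false", stacklevel=2)
    ledger.append(row)
    return True
```

Under RECORD_WITH_WARNING the ledger remains a pure registry and the gate governs only the external push; under REFUSE the gate also freezes the PG record.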

5. Exact verification queries (reproduce)

-- gate intact
SELECT * FROM fn_iu_gate_verify_closed();          -- never_flip_intact=true, vector_sync_enabled=false
SELECT key,value FROM dot_config WHERE key ~* 'vector|qdrant|embed|sync';  -- only iu_core.vector_sync_enabled=false

-- no cross-IU pollution
SELECT count(*) AS total, count(DISTINCT point_key) AS keys,
  (SELECT count(*) FROM (SELECT point_key FROM iu_vector_sync_point
     WHERE unit_id IS NOT NULL GROUP BY point_key HAVING count(DISTINCT unit_id)>1) x) AS multi_iu
FROM iu_vector_sync_point;                          -- 152 / 152 / 0

-- status + historical index
SELECT sync_status, count(*), count(DISTINCT unit_id), sum(point_count),
       min(updated_at), max(updated_at), string_agg(DISTINCT last_actor,', ')
FROM iu_vector_sync_point GROUP BY sync_status;     -- indexed 149 (141 units, 2026-05-22..24), dryrun 3

-- the active collection
SELECT collection_name, status, vector_dim, distance_metric, notes
FROM iu_qdrant_collection_registry;                 -- iu_core_iu_chunks / active / 1536 / Cosine / "0 points"

6. Required fixes (priority)

  1. [P1] DOT-ify vector reindex — add dot_iu_vector_reindex (Tier-B) + dot_iu_vector_verify (Tier-A) so any future indexing is gated by iu_core.vector_sync_enabled and audited in dot_iu_command_run.
  2. [P1] PG↔Qdrant reconcile — add fn_iu_vector_reconcile() + DOT comparing iu_vector_sync_point (status=indexed, sum point_count) against the live Qdrant collection count; surface divergence to system_issues (Đ31). Needs a read path to Qdrant (apply-channel or a read-only Qdrant tool) — cannot be closed from context_pack_readonly alone.
  3. [P2] Document the 149 historical points in the vector boundary KB so future audits don't misread the state as greenfield.
  4. [P2] Make the gate-check explicit in fn_iu_vector_sync_record_v2 (record-with-warning vs refuse) so the semantics of vector_sync_enabled vs ledger writes are unambiguous.
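
The comparison that fix 2 asks for can be sketched as below. This is an assumption-laden outline, not a design for fn_iu_vector_reconcile(): the ledger is passed in as exported rows, the Qdrant count is supplied by the caller (None when no read path exists), and ReconcileResult and the status strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReconcileResult:
    pg_indexed_points: int
    qdrant_points: Optional[int]   # None when Qdrant cannot be read
    status: str                    # 'match', 'divergent', or 'unverifiable'


def reconcile_point_counts(ledger_rows: list[dict],
                           qdrant_points: Optional[int]) -> ReconcileResult:
    """Compare sum(point_count) over indexed ledger rows with a Qdrant count."""
    pg_total = sum(
        r["point_count"] for r in ledger_rows if r["sync_status"] == "indexed"
    )
    if qdrant_points is None:
        status = "unverifiable"    # today's situation from a read-only context
    elif qdrant_points == pg_total:
        status = "match"
    else:
        status = "divergent"       # would be surfaced to system_issues
    return ReconcileResult(pg_total, qdrant_points, status)
```

With a ledger whose indexed rows sum to 149 points, a Qdrant count of 0 yields 'divergent' and a missing count yields 'unverifiable', which is the distinction B-G2 cannot currently make.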

7. Bottom line

No cross-IU vector pollution exists, and the current code paths cannot introduce any: the guarantee rests on three layers (the function guard, the CHECK constraint, and the distinct-key proof). The pilot's vector freeze is intact. The remaining work is observability and DOT governance of the vector plane, not a safety breach.

knowledge/dev/reports/architecture/iu-design-live-gap-dot-ops-workflow-design-registry-audit-2026-05-29/02-vector-no-cross-iu-vector-audit.md