KB-3C7A

IU Core 2000x — Qdrant bounded live reindex (6 points, 5 IUs)

Revision 1
Tags: iu-core, 2000x, qdrant, vector-sync, bounded-reindex, openai-embeddings

02 — Qdrant bounded live reindex (6 points, 5 IUs)

1. Scope — discovery-driven, 5 IUs

The candidate set was chosen by a live query against v_ui_iu_three_axis_envelope joined with unit_version (enacted versions only), ordered by body length. The goal was to cover the range of body lengths and to force one chunk split.

canonical_address    unit_id (prefix)   body chars   chunks
ICX-CONST/DIEU-18    5317a766…21ab1              7        1
ICX-CONST/DIEU-43    fb6f2197…1ecb             856        1
ICX-CONST/KT-A       c53efee1…0b69            1518        1
ICX-CONST/KT-B       d3ad5874…8278            1995        2
ICX-CONST/NT-13      a4c05403…fd92             347        1
total                                         4723        6

KT-B (1995 chars) exceeds DEFAULT_IU_CHUNK_CHARS = 1800, so it exercises the chunker's per-IU boundary: KT-B -> iu.ICX-CONST/KT-B#chunk-0000 + iu.ICX-CONST/KT-B#chunk-0001. Both points share the same unit_id, carry distinct chunk_index values, and report chunk_count=2.
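A minimal sketch of such a split, assuming a plain fixed-width character split (the actual chunker may prefer paragraph or sentence boundaries) and the zero-padded four-digit chunk suffix seen in the keys above. The function name chunk_iu_body and the tuple layout are hypothetical, not the project's API.

```python
DEFAULT_IU_CHUNK_CHARS = 1800  # value quoted in this document


def chunk_iu_body(canonical_address, body, max_chars=DEFAULT_IU_CHUNK_CHARS):
    """Hypothetical helper: split one IU body into per-chunk tuples.

    Returns a list of (point_key, chunk_index, chunk_count, text).
    """
    # Assumption: fixed-width split; an empty body still yields one empty chunk.
    pieces = [body[i:i + max_chars] for i in range(0, len(body), max_chars)] or [""]
    count = len(pieces)
    return [
        (f"iu.{canonical_address}#chunk-{idx:04d}", idx, count, text)
        for idx, text in enumerate(pieces)
    ]
```

Under this assumption a 1995-character body yields chunks of 1800 and 195 characters, while a 1518-character body stays in a single chunk.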

2. Driver layout — boundary rule preserved end-to-end

The 2000x bounded reindex driver ran inside incomex-agent-data (where the keys live). Within one bounded apply it performs the following steps, in order:

  1. Discover the Qdrant collection plan from v_iu_qdrant_collection_active (DB SSOT). Found iu_core_iu_chunks / dim 1536 / Cosine / openai:text-embedding-3-small / status=active.
  2. ensure_collection — already exists, no create.
  3. Fetch IU bodies via unit_version (lifecycle_status='enacted') JOINed to the envelope (canonical filter on the 5 addresses above).
  4. Build VectorPoint set via the canonical build_iu_point_set(unit_id, canonical_address, body, ...).
  5. assert_boundary(points) — application layer.
  6. Open gate: UPDATE dot_config SET value='true' WHERE key='iu_core.vector_sync_enabled'.
  7. apply_iu_set(plan, points, bodies, OpenAIEmbedder(model='text-embedding-3-small', dim=1536), connector, executor, actor, record_status='indexed') — embeds + upserts + records.
  8. Close gate (in finally) — even on error.
  9. Verify Qdrant via GET /collections/iu_core_iu_chunks (points_count) and POST /collections/.../points/scroll (payload shape).
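The open/close discipline in steps 6 to 8 can be sketched as a context manager. This is an illustration only: the executor interface (an execute(sql, params) method), the %s placeholder style, the 'false' literal used to close the gate, and the RecordingExecutor stand-in are all assumptions rather than the driver's actual code.

```python
from contextlib import contextmanager

GATE_KEY = "iu_core.vector_sync_enabled"
SET_GATE_SQL = "UPDATE dot_config SET value = %s WHERE key = %s"


@contextmanager
def vector_sync_gate(executor):
    """Open the gate for the duration of the block and always close it again."""
    executor.execute(SET_GATE_SQL, ("true", GATE_KEY))
    try:
        yield
    finally:
        # Runs on success and on error, mirroring step 8.
        executor.execute(SET_GATE_SQL, ("false", GATE_KEY))


class RecordingExecutor:
    """Hypothetical stand-in that records statements instead of running them."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params):
        self.calls.append((sql, params))
```

With this shape, the apply call would sit inside "with vector_sync_gate(executor):", so an exception raised by the embedder or by the Qdrant upsert cannot leave the gate open.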

3. Live evidence — indexed apply

gate_opened:        value=True
apply_iu_set:       planned=6 upserted=6 recorded=6 (boundary_passed=True) 0.37s
                    embedder=openai:text-embedding-3-small
gate_closed:        value=False
qdrant_collection_info: http=200 points_count=6
boundary_qdrant_payload: ok=True unique_unit_ids=5
registry_rows:      6 rows, sync_status='indexed' for all

4. Qdrant payload shape — boundary rule confirmed

Sample (UUIDv5-derived ids; human key in payload):

Qdrant id (uuid)       unit_id (prefix)   chunk_index   chunk_count   source_kind
0d8b19b2-7af2-5fcc…    5317a766                     0             1   iu
62701f81-92ed-59e8…    fb6f2197                     0             1   iu
7ff3e5c2-7234-5a95…    c53efee1                     0             1   iu
ecb92b89-239f-5009…    d3ad5874                     0             2   iu
08c6793e-3564-57c1…    d3ad5874                     1             2   iu
95290a62-44f2-56c5…    a4c05403                     0             1   iu

Asserts (machine-checked): 5 unique unit_id across 6 points; every point source_kind='iu', summary_marker=false, has_axis_refs=true; 0 <= chunk_index < chunk_count; no payload field contains body text, secret, or another IU's content.
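A sketch of how checks of this kind might be expressed over the scrolled payloads. The function name check_payload_boundary, the payload dict layout, and the substring test used to detect leaked body text are assumptions for illustration; the project's assert_boundary may work differently.

```python
def check_payload_boundary(payloads, bodies=()):
    """Hypothetical boundary check over Qdrant payload dicts.

    Raises ValueError on the first violation and returns the number of
    distinct unit_id values otherwise.
    """
    unit_ids = set()
    for p in payloads:
        ok = (
            p["source_kind"] == "iu"
            and p["summary_marker"] is False
            and p["has_axis_refs"] is True
            and 0 <= p["chunk_index"] < p["chunk_count"]
        )
        # Assumption: leakage is detected by substring match against known bodies.
        leaked = any(
            isinstance(v, str) and any(b and b in v for b in bodies)
            for v in p.values()
        )
        if not ok or leaked:
            raise ValueError(f"boundary violation in point {p.get('point_key')!r}")
        unit_ids.add(p["unit_id"])
    return len(unit_ids)
```

For the six points listed above, a check of this shape would be expected to return 5, because the two KT-B chunks share one unit_id.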

5. Defect-fixes discovered + closed in this macro

5.1 Qdrant point id must be UUID — HTTP 400 from live Qdrant

The 1500x QdrantConnector.upsert_points sent pt.point_key as the id field. Qdrant rejected: "Format error in JSON body: value iu.ICX-CONST/DIEU-18 is not a valid point ID, valid values are either an unsigned integer or a UUID".

Fix: new QDRANT_POINT_ID_NAMESPACE = "iu-core.qdrant.point-id.v1" constant + point_id_for(point_key) -> str helper (UUIDv5). upsert_points now sends point_id_for(pt.point_key) as the id; payload.point_key carries the human-readable key.
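A minimal sketch of such a helper using Python's standard uuid module. How the string constant is turned into a namespace UUID is not stated here, so the uuid5-over-NAMESPACE_DNS step below is an assumption; ids produced by this sketch will not necessarily match the ids in the table in section 4.

```python
import uuid

QDRANT_POINT_ID_NAMESPACE = "iu-core.qdrant.point-id.v1"

# Assumption: the string constant is first mapped to a namespace UUID;
# the real derivation is not shown in this document.
_NAMESPACE_UUID = uuid.uuid5(uuid.NAMESPACE_DNS, QDRANT_POINT_ID_NAMESPACE)


def point_id_for(point_key: str) -> str:
    """Map a human-readable point key to a deterministic UUIDv5 string."""
    return str(uuid.uuid5(_NAMESPACE_UUID, point_key))
```

Because UUIDv5 is a pure function of namespace and name, re-running the reindex for the same point_key addresses the same Qdrant point, so an upsert overwrites it instead of creating a duplicate.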

5.2 record_plan accepts status='indexed'

The 1500x record_plan refused status='indexed' even though apply_iu_set always calls record_plan. Fix: record_plan accepts the full label set. DB-side fn_iu_vector_sync_record_v2 remains the authoritative gate.

6. Five-layer impact

layer    impact
PG       6 new rows in iu_vector_sync_point (status='indexed'); no DDL; gate toggled and closed
Qdrant   iu_core_iu_chunks 0 -> 6 points (1536-dim, Cosine); production_documents untouched

7. Reversibility

  • Qdrant: POST /collections/iu_core_iu_chunks/points/delete with the six point UUIDs in the request body, or DELETE /collections/iu_core_iu_chunks for the whole collection.
  • PG: DELETE FROM iu_vector_sync_point WHERE sync_status='indexed' AND last_actor='iu_core_2000x_bounded_reindex'.
  • Gate: already closed.
  • Code: git revert 900d4c3 (re-introduces both defects).