9000x-onboarding · 02 — Driver build (run_onboarding.py + qdrant_onboarding_driver.py) + 20 tests
9000x — Driver build + tests
Module layout
| File | Role |
|---|---|
| `cutter_agent/iu_core/qdrant_onboarding_driver.py` | runtime driver (importable) |
| `ops/qdrant-onboarding-package-8000x/run_onboarding.py` | Mac CLI (psql-over-ssh) |
| `ops/qdrant-onboarding-package-8000x/in_container_run.py` | agent-data CLI (psycopg2) |
| `ops/qdrant-onboarding-package-8000x/__init__.py` | package marker |
| `tests/test_iu_core_9000x_qdrant_onboarding_driver.py` | 20 tests |
This closes the 8000x EXACT_GAP: the `ops/qdrant-onboarding-package-8000x/`
directory previously contained only a `README.md` that referenced a non-existent
`run_onboarding.py`. This macro ships the driver and rewrites the README
to match the actual files.
CLI shape
Three mutually exclusive modes:
- `--dry-run`: preflight + plan; never opens the gate, never embeds, never upserts. Pure read.
- `--apply`: bounded apply. Opens the `iu_core.vector_sync_enabled` gate, embeds + upserts + records under `record_status='indexed'`, then closes the gate. The open/close pair is wrapped in `try`/`finally` so a crash mid-flight still closes the gate.
- `--rollback`: queries `iu_vector_sync_point` for `last_actor=<actor>`, derives deterministic Qdrant point ids via `vector_sync.point_id_for(point_key)`, deletes by id, and truncates the matching PG rows. Refuses an empty actor (which would otherwise wipe the collection).
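
The gate handling described for `--apply` can be sketched as a context manager. This is a minimal illustration, not the driver's actual code: `set_gate` and `do_work` are hypothetical callables standing in for whatever persists `iu_core.vector_sync_enabled` and performs the embed/upsert/record step.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def open_gate(set_gate: Callable[[bool], None]) -> Iterator[None]:
    """Open the sync gate for the duration of the block, then always close it.

    `set_gate` is a hypothetical callable that persists the value of
    iu_core.vector_sync_enabled; the real driver's mechanism may differ.
    """
    set_gate(True)
    try:
        yield
    finally:
        # Runs on normal exit and on any exception raised inside the block.
        set_gate(False)


def bounded_apply(set_gate: Callable[[bool], None], do_work: Callable[[], int]) -> int:
    """Run `do_work` (embed + upsert + record) with the gate open."""
    with open_gate(set_gate):
        return do_work()
```

If `do_work` raises (for example because Qdrant is unreachable), the exception still propagates to the caller, but only after the gate has been set back to closed.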
Flags: --docs DIEU-28,DIEU-32,DIEU-35, --actor <tag>,
--collection iu_core_iu_chunks, --limit <n>,
--empty-body-policy skip|index (default skip),
--no-verify-boundary (debug), --emit-json.
Exit codes: 0 ok, 2 usage, 3 refused (production_documents, draft IU, boundary violation, empty rollback actor), 4 runtime IO failure.
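
The CLI surface above could be wired with `argparse` roughly as follows. This is a sketch under assumptions, not the shipped parser: it assumes a mode is required, that `iu_core_iu_chunks` is the default collection, and that the refused collection is literally named `production_documents`. The helper `refusal_code` is hypothetical and covers only the refusals that can be decided from flags alone. `argparse` itself exits with status 2 on usage errors, which lines up with the "2 usage" code.

```python
import argparse

EXIT_OK, EXIT_USAGE, EXIT_REFUSED, EXIT_IO = 0, 2, 3, 4


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="run_onboarding.py")
    # Exactly one mode; argparse rejects combinations with exit status 2.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--dry-run", action="store_true")
    mode.add_argument("--apply", action="store_true")
    mode.add_argument("--rollback", action="store_true")
    # Comma-separated doc codes, e.g. DIEU-28,DIEU-32,DIEU-35.
    parser.add_argument("--docs", type=lambda s: [d for d in s.split(",") if d], default=[])
    parser.add_argument("--actor", default="")
    parser.add_argument("--collection", default="iu_core_iu_chunks")  # assumed default
    parser.add_argument("--limit", type=int, default=None)
    parser.add_argument("--empty-body-policy", choices=("skip", "index"), default="skip")
    parser.add_argument("--no-verify-boundary", action="store_true")
    parser.add_argument("--emit-json", action="store_true")
    return parser


def refusal_code(args: argparse.Namespace) -> int:
    """Return EXIT_REFUSED for refusals decidable from flags alone (hypothetical helper)."""
    if args.collection == "production_documents":
        return EXIT_REFUSED
    if args.rollback and not args.actor.strip():
        return EXIT_REFUSED
    return EXIT_OK
```

Refusals that depend on database state (draft IU, boundary violation) cannot be decided at parse time and are left out of this sketch.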
Tests (20/20 PASS)
| Group | Test |
|---|---|
| TestSelectAndPlan | test_select_requires_doc_codes |
| TestSelectAndPlan | test_build_candidate_plan_skips_empty_bodies |
| TestSelectAndPlan | test_build_candidate_plan_keeps_empty_under_index_policy |
| TestSelectAndPlan | test_build_candidate_plan_rejects_unknown_policy |
| TestSelectAndPlan | test_long_body_splits_into_multi_chunk_same_iu |
| TestSelectAndPlan | test_axis_refs_populated |
| TestDryRun | test_dry_run_does_not_touch_qdrant_and_keeps_gate_shut |
| TestDryRun | test_dry_run_refuses_draft_iu |
| TestDryRun | test_dry_run_refuses_production_documents_collection |
| TestApply | test_apply_opens_and_closes_gate |
| TestApply | test_apply_closes_gate_even_when_qdrant_fails |
| TestApply | test_apply_refuses_production_documents |
| TestApply | test_apply_requires_actor |
| TestRollback | test_rollback_refuses_empty_actor |
| TestRollback | test_rollback_refuses_production_documents |
| TestRollback | test_rollback_deletes_by_point_id |
| TestRollback | test_rollback_noop_when_no_rows |
| TestReadmePackage | test_run_onboarding_file_exists |
| TestReadmePackage | test_driver_module_importable |
| TestReadmePackage | test_readme_mentions_actual_driver_filename |
Full suite: 1212 → 1232 PASS (+20).
Discovered/repaired defects mid-build
- psql line-per-row parser corruption on multi-line body text. The `psql -At -F` text format splits rows on `\n`, so any IU body containing newlines collapsed into single-token tuples and was dropped by the `if len(r) < 6: continue` guard. Fixed by base64-encoding `body` in the SQL projection (`'b64:'||replace(encode(convert_to(body,'UTF8'), 'base64'), E'\n', '')`) and decoding in the driver (`_decode_body_field`). Detected because a Mac-side dry-run reported 25 candidates instead of 86; caught BEFORE any apply.
- Rollback by payload-filter `actor` would have been a no-op. The first design called `POST /points/delete` with `filter.must.actor=<actor>`, but `VectorPoint.payload()` does NOT include an `actor` field. A live rollback would have returned 200 with `points_deleted=0` while leaving the points in place. Fixed by point-id deletion derived from `iu_vector_sync_point.point_key` + `vector_sync.point_id_for`. Detected via a payload-shape smoke scroll after live apply.
- `cutter_agent/iu_core/__init__.py` auto-imports every submodule. Shipping only the three submodules our driver needs into the `incomex-agent-data` container triggered a partial-init ImportError. Fixed by bundling a minimal shim `__init__.py` (no auto-imports) in the in-container deployment tarball; the repo's `__init__.py` is untouched.
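
The base64 fix for the first defect can be illustrated with a small parser sketch. It is not the driver's `_decode_body_field`; the function names, the `|` field separator, the six-column row shape, and the body being the last column are all assumptions made for the example.

```python
import base64

B64_PREFIX = "b64:"


def decode_body_field(field: str) -> str:
    """Decode a body column emitted as 'b64:' + single-line base64 of UTF-8 text.

    Fields without the prefix are returned unchanged (an assumption of this
    sketch, so unencoded rows still parse).
    """
    if not field.startswith(B64_PREFIX):
        return field
    return base64.b64decode(field[len(B64_PREFIX):]).decode("utf-8")


def parse_rows(psql_output: str, sep: str = "|", min_fields: int = 6) -> list[list[str]]:
    """Parse line-per-row psql output; the body is assumed to be the last column."""
    rows = []
    for line in psql_output.split("\n"):
        fields = line.split(sep)
        if len(fields) < min_fields:
            # Same kind of guard as in the original parser: skip short rows.
            continue
        # The base64 alphabet contains neither newlines nor '|', so each
        # encoded body stays on one line and inside one field.
        fields[-1] = decode_body_field(fields[-1])
        rows.append(fields)
    return rows
```

Because the encoded body can no longer introduce a row break, the short-row guard only ever drops rows that are genuinely malformed.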