GPT Review — INV Search/Vector Hygiene Context-Pack Prompt
Date: 2026-05-05
Reviewer: GPT-5.5 Thinking / Incomex AI Council (Hội đồng AI)
Reviewed: knowledge/dev/laws/dieu44-trien-khai/prompts/inv-search-vector-hygiene-context-pack-prompt.md rev1
Verdict
Direction: PASS. Safe to dispatch after the small patches listed under "Required small patches before dispatch".
The prompt is correctly read-only and asks the right core questions:
- count context-pack footprint;
- measure search pollution;
- inspect metadata/filterability;
- read Đ43 lifecycle;
- compare industry-aligned options;
- recommend staged handling.
This is the correct next step. Do not clean, delete, or deindex anything before the evidence is collected.
Required small patches before dispatch
P1 — Add KB/history search for prior search/vector design
Before treating this as a new problem, the Agent should search and read existing docs for:
- vector
- Qdrant
- search
- retrieval
- context-pack
- embedding
- hot/cold
- TTL
- dedup
- rerank
Add a section:
Search KB for existing search/vector/retrieval design docs and list which are canonical, draft, or absent.
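As a rough illustration of the P1 keyword sweep, the sketch below maps each keyword to the docs that mention it. It assumes the KB has already been read into a plain `{path: text}` dict; `keyword_hits` and that dict shape are hypothetical, not part of any existing tool. It only locates candidates: deciding whether a doc is canonical, draft, or absent still needs the Agent to read it.

```python
# Keywords taken from the P1 list, lowercased for case-insensitive matching.
KEYWORDS = ("vector", "qdrant", "search", "retrieval", "context-pack",
            "embedding", "hot/cold", "ttl", "dedup", "rerank")


def keyword_hits(docs):
    """Map each keyword to the sorted doc paths whose text contains it.

    `docs` is a hypothetical {path: text} mapping. Matching is by plain
    substring, so "search" also matches "research"; treat hits as leads only.
    """
    hits = {kw: [] for kw in KEYWORDS}
    for path, text in docs.items():
        lowered = text.lower()
        for kw in KEYWORDS:
            if kw in lowered:
                hits[kw].append(path)
    return {kw: sorted(paths) for kw, paths in hits.items()}
```

This is read-only by construction: it inspects text already in memory and writes nothing back.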
P2 — Distinguish four storage/index layers
The prompt already covers KB, vector, filesystem, and PG, which is good. The report should explicitly separate:
- Source-of-truth docs — canonical laws/design/process/reports.
- Generated snapshots — context-pack builds.
- Search index / vector index — retrieval layer.
- Runtime context cache — files used by agents/tools.
This matters because the right answer may be “keep files, deindex from hot vector,” not “delete docs.”
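One way the report could keep the four layers apart is to tag every inventoried record with a layer before counting. The sketch below is an assumption-heavy illustration: the `store` labels ("kb", "vector", "filesystem") and the rule that any path containing `context-pack/` is a generated snapshot are hypothetical conventions, not confirmed facts about the system.

```python
from collections import Counter
from enum import Enum


class Layer(Enum):
    SOURCE_OF_TRUTH = "source-of-truth docs"
    GENERATED_SNAPSHOT = "generated snapshots"
    SEARCH_INDEX = "search/vector index"
    RUNTIME_CACHE = "runtime context cache"


def classify(store, path):
    """Assign one record to a layer (store names are hypothetical labels)."""
    if store == "vector":
        return Layer.SEARCH_INDEX
    if store == "filesystem":
        return Layer.RUNTIME_CACHE
    # Remaining records are KB docs: split snapshots from canonical docs.
    if "context-pack/" in path:
        return Layer.GENERATED_SNAPSHOT
    return Layer.SOURCE_OF_TRUTH


def footprint(records):
    """Count records per layer; `records` is an iterable of (store, path)."""
    return Counter(classify(store, path) for store, path in records)
```

Counting the same document once as a KB snapshot and once as a vector entry is intentional here, since the two copies can be handled differently (kept in the KB, dropped from the hot index).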
P3 — Add retrieval policy recommendation, not just storage recommendation
Ask Agent to propose default retrieval policy:
- default include/exclude prefixes;
- when to include context-pack;
- canonical-first vs snapshot-first;
- metadata filters to apply;
- rerank/dedup step if available.
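The policy items above could be expressed as a small filter over search hits. This is a minimal sketch, not the system's actual search layer: `RetrievalPolicy`, `apply_policy`, the default prefixes, and the `content_hash` field are all hypothetical names chosen for illustration, and hits are assumed to arrive already sorted by descending score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalPolicy:
    # Hypothetical defaults; the real prefixes are for the Agent to propose.
    include_prefixes: tuple = ("knowledge/",)
    snapshot_prefixes: tuple = ("knowledge/current-state/context-pack/",)


def apply_policy(hits, policy, context_mode=False):
    """Filter hits by prefix, order canonical-first, then dedup by content."""
    def is_snapshot(hit):
        return hit["path"].startswith(policy.snapshot_prefixes)

    # Stable sort: canonical docs (False) come before snapshots (True),
    # and the incoming score order is preserved within each group.
    ordered = sorted(hits, key=is_snapshot)
    kept, seen = [], set()
    for hit in ordered:
        if not hit["path"].startswith(policy.include_prefixes):
            continue
        if is_snapshot(hit) and not context_mode:
            continue  # snapshots only appear when explicitly requested
        key = hit.get("content_hash", hit["path"])
        if key in seen:
            continue  # duplicate content; the canonical copy was seen first
        seen.add(key)
        kept.append(hit)
    return kept
```

Sorting before deduplicating is a deliberate choice in this sketch: when a snapshot and a canonical doc carry the same content, the canonical copy is the one that survives.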
P4 — Add “latest-only” detection
Ask Agent to identify latest context-pack build_id and whether knowledge/current-state/context-pack/ is only a README placeholder or a real live pointer.
Q:
- Is there a stable live/latest path distinct from historical context-pack/<build_id>/?
- If not, what metadata can identify the latest build?
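A possible shape for the latest-only check is sketched below. It assumes each inventoried doc carries a `created_at` timestamp in a uniform ISO-8601 format (so string comparison orders them correctly); that field, the function name, and the build IDs used in the usage note are hypothetical. Where historical builds actually live is not stated in the prompt, so the pattern simply matches any path segment of the form `context-pack/<build_id>/`.

```python
import re

LIVE_ROOT = "knowledge/current-state/context-pack/"
# Matches ".../context-pack/<build_id>/..." anywhere in a path.
BUILD_RE = re.compile(r"(?:^|/)context-pack/([^/]+)/")


def summarize_builds(docs):
    """Report build IDs, the newest build, and whether LIVE_ROOT is README-only.

    `docs` is a list of {"path": str, "created_at": str} dicts (assumed shape).
    """
    newest_per_build = {}
    live_root_files = []
    for doc in docs:
        path = doc["path"]
        match = BUILD_RE.search(path)
        if match:
            build_id = match.group(1)
            previous = newest_per_build.get(build_id, "")
            newest_per_build[build_id] = max(previous, doc["created_at"])
        elif path.startswith(LIVE_ROOT):
            # A file sitting directly in the live root, outside any build dir.
            live_root_files.append(path[len(LIVE_ROOT):])
    latest = max(newest_per_build, key=newest_per_build.get) if newest_per_build else None
    return {
        "build_ids": sorted(newest_per_build),
        "latest_build_id": latest,
        "live_root_readme_only": [f.lower() for f in live_root_files] == ["readme.md"],
    }
```

If no timestamp metadata exists, this approach does not apply as written, and that absence is itself a finding worth reporting.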
P5 — Add risk note for deleting KB docs
Prompt already forbids delete, but report should explicitly assess risk:
- deleting context-pack docs may break audit/history;
- deindexing from vector/search is safer than deleting source docs;
- cold/archive storage may be enough.
P6 — Add success metric
Ask Agent to propose measurable success criteria for any future fix:
- context-pack share in top-20 for canonical queries drops below X%;
- canonical docs appear in top-5 for known queries;
- latest context-pack still retrievable when explicitly requested.
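The first two criteria can be computed directly from ranked result lists. The sketch below assumes results are available as ordered lists of doc paths and that a path containing `context-pack/` marks a snapshot; both are illustrative assumptions, and the threshold X is left for the investigation to propose.

```python
def is_context_pack(path):
    """Heuristic (assumed): any path containing 'context-pack/' is a snapshot."""
    return "context-pack/" in path


def context_pack_share(ranked_paths, k=20):
    """Fraction of the top-k results that are context-pack snapshots."""
    top = ranked_paths[:k]
    if not top:
        return 0.0
    return sum(is_context_pack(p) for p in top) / len(top)


def canonical_in_top_k(ranked_paths, expected_path, k=5):
    """True if the expected canonical doc appears in the top-k results."""
    return expected_path in ranked_paths[:k]
```

Running both functions over a fixed set of known canonical queries before and after any future change would give a comparable baseline.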
Dispatch decision
After applying P1–P6, dispatch Agent.
If Opus wants to keep the prompt short, P1, P2, P3, and P6 are mandatory; P4 and P5 are recommended.
Boundaries
Remain strict:
- read-only;
- no delete;
- no deindex;
- no DOT patch;
- no Đ43 patch;
- no vector config mutation;
- no cleanup.
Strategic position
The likely long-term architecture is not “remove context-pack,” but tiering:
- canonical docs in hot/default retrieval;
- latest context-pack in hot-lite or explicit context mode;
- historical context-pack in cold/archive, excluded from default vector retrieval;
- metadata filter + dedup/rerank in search layer;
- TTL/retention for generated snapshots.
The investigation should confirm or refute this with evidence.
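To make the tiering hypothesis concrete enough to test, the rule could be written as a pure function from path to tier. This is only a sketch of the hypothesis, not a decided design: the tier names mirror the list above, while `assign_tier`, the path pattern, and the build IDs in the usage note are hypothetical.

```python
import re

# Matches ".../context-pack/<build_id>/..." anywhere in a path (assumed layout).
BUILD_RE = re.compile(r"(?:^|/)context-pack/([^/]+)/")


def assign_tier(path, latest_build_id):
    """Return 'hot', 'hot-lite', or 'cold' for a doc path.

    Docs outside any context-pack build are treated as canonical ('hot');
    the latest build is 'hot-lite'; every other build is 'cold'.
    """
    match = BUILD_RE.search(path)
    if match is None:
        return "hot"
    if match.group(1) == latest_build_id:
        return "hot-lite"
    return "cold"
```

For example, with `latest_build_id="b-002"`, a doc under `context-pack/b-001/` would land in the cold tier. Applying this to the inventory in read-only mode would show how many docs each tier would hold before anything is moved.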