KB-4738

20B-P2 — Context-Pack KB Delete Manifest Dry-Run Report



Date: 2026-05-05
Status: PASS (manifest built, NO deletes executed)
Prior: 20B-P1 PASS (faucet closed; KB_MIRROR_ENABLED=false default)
Scope: Inventory all context-pack/* KB docs, verify FS+PG recovery basis, propose delete options.


§1. Summary

| Metric | Value |
| --- | --- |
| Total KB docs under context-pack/ | 1175 |
| Drift from 1174 (20A/20B-P0 baseline) | +1 doc, explained by one additional 1-file build (PROJECT_MAP.md only) ingested between the inspection report and the 20B-P1 patch. P1 closed the faucet at 13:17 UTC; the latest 1-file build is 20260505-100008-4277ba (≈ 10:00 UTC), so it slipped in. |
| Total builds | 142 |
| Builds with fs+pg recovery | 131 / 142 (92.3%) |
| Builds with fs_only (no PG manifest) | 10 |
| Builds with pg_only | 0 |
| Builds with missing_recovery | 1: 20260418-023221-0bfb97 (8 docs, no FS, no PG) |
| Latest complete build (max sections) | 20260504-040018-372b48: 9 files, publish_status=live, kb_mirror_status=live, health_status=fail ⚠️ |
| Non-context-pack docs in list | 0 (clean: all 1175 docs match context-pack/<build_id>/<filename>) |

Doc-level coverage:

| Recovery class | Doc count |
| --- | --- |
| fs+pg | 1157 |
| fs_only | 10 |
| pg_only | 0 |
| missing_recovery | 8 |
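The four recovery classes follow from two booleans per build: whether an FS staging directory exists and whether a PG manifest row exists. A minimal sketch of that mapping and of the doc-level roll-up is below; the function names and the tuple layout are hypothetical, not taken from the tooling that produced this report.

```python
from collections import Counter


def classify_recovery(fs_exists: bool, pg_manifest: bool) -> str:
    """Map the two recovery signals onto the report's four recovery classes."""
    if fs_exists and pg_manifest:
        return "fs+pg"
    if fs_exists:
        return "fs_only"
    if pg_manifest:
        return "pg_only"
    return "missing_recovery"


def doc_coverage(builds):
    """Sum KB doc counts per recovery class.

    `builds` is an iterable of (kb_docs, fs_exists, pg_manifest) tuples,
    a hypothetical layout mirroring the columns of the build summary table.
    """
    counts = Counter()
    for kb_docs, fs_exists, pg_manifest in builds:
        counts[classify_recovery(fs_exists, pg_manifest)] += kb_docs
    return counts


if __name__ == "__main__":
    # Three rows copied from the build summary table in section 3.
    sample = [
        (8, False, False),  # 20260418-023221-0bfb97
        (9, True, True),    # 20260504-040018-372b48
        (1, True, False),   # 20260505-100008-4277ba
    ]
    print(dict(doc_coverage(sample)))
```

Run over all 142 builds, the same roll-up should reproduce the 1157 / 10 / 0 / 8 split shown above.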

§2. Recovery basis evidence

2.1 Filesystem (/opt/incomex/context-pack-staging/)

$ ls -d /opt/incomex/context-pack-staging/*/ | wc -l
141

First 3 / last 5:

20260418-023410-514793   ← oldest FS
20260418-023547-5e561f
20260418-024529-396098
…
20260504-220014-23d358
20260505-010009-231788
20260505-040008-32d221
20260505-070008-4860b7
20260505-100008-4277ba   ← newest FS (1-file build, no PG manifest)

2.2 PG manifest (context_pack_manifest joined to context_pack_sections)

Note: context_pack_manifest has no build_id column. Build ID is derived from context_pack_sections.file_path via regexp_match(file_path, 'context-pack-staging/([^/]+)/').
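For cross-checking outside the database, the same derivation can be sketched in Python. The pattern is the one quoted in the note; the helper name and the sample path are assumptions (the report does not show a full file_path value).

```python
import re

# Same pattern as the regexp_match call in the note above.
BUILD_ID_RE = re.compile(r"context-pack-staging/([^/]+)/")


def build_id_from_path(file_path: str):
    """Return the build ID embedded in a staging path, or None if absent.

    re.search is used because regexp_match also matches anywhere in the string.
    """
    match = BUILD_ID_RE.search(file_path)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical path shape: staging root from section 2.1 plus a filename.
    path = "/opt/incomex/context-pack-staging/20260504-040018-372b48/PROJECT_MAP.md"
    print(build_id_from_path(path))
```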

131 rows (= 131 distinct manifest_id with ≥1 sections row)

All 131 PG-tracked builds have publish_status='live', has_checksum=true, and a non-zero section count (8 for early builds, 9 since OPS_CODE_INVENTORY was introduced).

2.3 Cross-reference

| Slice | Count | Note |
| --- | --- | --- |
| KB ∩ FS ∩ PG | 131 | full recovery |
| KB ∩ FS only | 10 | partial 1-file builds, 2026-05-04 → 2026-05-05 (PROJECT_MAP only; 7e uploaded 1 section, then 7f never INSERTed) |
| KB ∩ PG only | 0 | |
| KB \ (FS ∪ PG) | 1 | 20260418-023221-0bfb97 (FS purged, never made PG) |
| FS \ KB | 0 | |
| PG \ KB | 0 | |
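The slices above are plain set operations over three collections of build IDs. A minimal sketch, assuming the IDs have already been loaded into Python (for example from the build lists named under Artifacts); the function and key names are hypothetical.

```python
def cross_reference(kb, fs, pg):
    """Split build IDs into the six slices used in the cross-reference table."""
    kb, fs, pg = set(kb), set(fs), set(pg)
    return {
        "kb_fs_pg": kb & fs & pg,           # KB ∩ FS ∩ PG: full recovery
        "kb_fs_only": (kb & fs) - pg,       # KB ∩ FS only
        "kb_pg_only": (kb & pg) - fs,       # KB ∩ PG only
        "kb_no_recovery": kb - (fs | pg),   # KB \ (FS ∪ PG)
        "fs_not_kb": fs - kb,               # FS \ KB
        "pg_not_kb": pg - kb,               # PG \ KB
    }


if __name__ == "__main__":
    # Three build IDs from this report, one per non-empty slice.
    kb = {"20260418-023221-0bfb97", "20260418-023410-514793", "20260505-100008-4277ba"}
    fs = {"20260418-023410-514793", "20260505-100008-4277ba"}
    pg = {"20260418-023410-514793"}
    for name, ids in cross_reference(kb, fs, pg).items():
        print(name, len(ids))
```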

§3. Build summary table — boundaries

First 5 builds (oldest):

| build_id | kb_docs | fs_exists | pg_manifest | pg_sections | recovery_status | is_latest_complete |
| --- | --- | --- | --- | --- | --- | --- |
| 20260418-023221-0bfb97 | 8 | false | false | 0 | missing_recovery | false |
| 20260418-023410-514793 | 8 | true | true | 8 | fs+pg | false |
| 20260418-023547-5e561f | 8 | true | true | 8 | fs+pg | false |
| 20260418-024529-396098 | 8 | true | true | 8 | fs+pg | false |
| 20260418-035044-63bfa0 | 8 | true | true | 8 | fs+pg | false |

Last 15 builds (most recent — shows the partial-build tail post 7e/7f split):

| build_id | kb_docs | fs_exists | pg_manifest | pg_sections | recovery_status | is_latest_complete |
| --- | --- | --- | --- | --- | --- | --- |
| 20260503-160011-441bfa | 9 | true | true | 9 | fs+pg | false |
| 20260503-190015-0370f0 | 9 | true | true | 9 | fs+pg | false |
| 20260503-220012-7cf6d3 | 9 | true | true | 9 | fs+pg | false |
| 20260504-010008-76424e | 9 | true | true | 9 | fs+pg | false |
| 20260504-040018-372b48 | 9 | true | true | 9 | fs+pg | true |
| 20260504-070013-022f97 | 1 | true | false | 0 | fs_only | false |
| 20260504-100013-746ffe | 1 | true | false | 0 | fs_only | false |
| 20260504-130008-0bc5fd | 1 | true | false | 0 | fs_only | false |
| 20260504-160010-68f078 | 1 | true | false | 0 | fs_only | false |
| 20260504-190009-67ddf6 | 1 | true | false | 0 | fs_only | false |
| 20260504-220014-23d358 | 1 | true | false | 0 | fs_only | false |
| 20260505-010009-231788 | 1 | true | false | 0 | fs_only | false |
| 20260505-040008-32d221 | 1 | true | false | 0 | fs_only | false |
| 20260505-070008-4860b7 | 1 | true | false | 0 | fs_only | false |
| 20260505-100008-4277ba | 1 | true | false | 0 | fs_only | false |

(Middle 122 builds omitted — all fs+pg, full section count.)

3.1 Anomaly: 1-file builds 2026-05-04 → 2026-05-05

Every build from 20260504-070013-022f97 onward has exactly 1 KB doc (PROJECT_MAP.md) and no PG manifest row. Pattern: 7e succeeded for the first section, then aborted before uploading the remaining 8 sections (a complete build has 9 since OPS_CODE_INVENTORY was introduced) and before 7f wrote context_pack_manifest. This corroborates the 20A/20B-P0 finding that publish was failing post-rev11; 20B-P1 has now sealed the upload step entirely, so this pattern will not recur.
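One way to flag this pattern mechanically is sketched below. It is a heuristic under stated assumptions, not the check used to produce this report: a build is treated as partial when it left exactly one KB doc and has no PG manifest row. The BuildRow type and function name are hypothetical; the fields mirror the build summary table columns.

```python
from typing import NamedTuple


class BuildRow(NamedTuple):
    build_id: str
    kb_docs: int
    fs_exists: bool
    pg_manifest: bool
    pg_sections: int


def partial_builds(rows):
    """Return build IDs matching the 1-file pattern: one KB doc, no PG manifest."""
    return [r.build_id for r in rows if r.kb_docs == 1 and not r.pg_manifest]


if __name__ == "__main__":
    # Rows copied from the build summary table above.
    rows = [
        BuildRow("20260418-023221-0bfb97", 8, False, False, 0),
        BuildRow("20260504-040018-372b48", 9, True, True, 9),
        BuildRow("20260504-070013-022f97", 1, True, False, 0),
        BuildRow("20260505-100008-4277ba", 1, True, False, 0),
    ]
    print(partial_builds(rows))
```

Note that the missing_recovery build is deliberately not matched: it has 8 KB docs, so it is an orphan rather than a 7e/7f partial.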

3.2 Anomaly: latest complete build health=fail

20260504-040018-372b48 is the latest build with all 9 sections and a PG manifest row, but its health_status='fail'. The KB mirror itself is live, so the build's content was published but its post-publish health checks failed. This matters for Option B below: keeping a known-fail build as the recovery anchor is suboptimal, so the recommendation is to keep the most recent complete build with health healthy or warn instead.

Most recent complete build with health ∈ (healthy, warn): 20260504-010008-76424e (warn).
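The anchor selection can be sketched as a filter plus a max over build IDs, which sort chronologically because they begin with a YYYYMMDD-HHMMSS timestamp. The dict layout and function name are hypothetical; only the two health values stated in this report are used, and the health of the fs_only build is left as None because the report does not give it.

```python
def latest_complete(builds, full_sections=9, allowed_health=None):
    """Pick the newest build with a PG manifest and a full section count.

    If `allowed_health` is given, only builds whose health_status is in that
    set are considered. Returns None when nothing qualifies.
    """
    candidates = [
        b for b in builds
        if b["pg_manifest"]
        and b["pg_sections"] == full_sections
        and (allowed_health is None or b["health_status"] in allowed_health)
    ]
    # Build IDs start with a timestamp, so string order is chronological order.
    return max(candidates, key=lambda b: b["build_id"], default=None)


if __name__ == "__main__":
    builds = [
        {"build_id": "20260504-010008-76424e", "pg_manifest": True,
         "pg_sections": 9, "health_status": "warn"},
        {"build_id": "20260504-040018-372b48", "pg_manifest": True,
         "pg_sections": 9, "health_status": "fail"},
        {"build_id": "20260504-070013-022f97", "pg_manifest": False,
         "pg_sections": 0, "health_status": None},  # health not stated in report
    ]
    print(latest_complete(builds)["build_id"])
    print(latest_complete(builds, allowed_health={"healthy", "warn"})["build_id"])
```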


§4. Document manifest — top 20 + bottom 20

(Full 1175-row TSV staged on VPS at /opt/incomex/backups/20b-p2-full-manifest.tsv — header document_id\tbuild_id\tfilename\tdelete_candidate\treason\trecovery_status\twould_keep_if_latest_only.)
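A reviewer who wants to sanity-check the staged TSV before 20B-P3 could summarize it with a few lines of Python. This is a sketch under assumptions: it expects the seven-column header listed above and the literal strings true/false in the boolean columns, as in the rows shown in this section; the function name is hypothetical.

```python
import csv
import io

HEADER = [
    "document_id", "build_id", "filename", "delete_candidate",
    "reason", "recovery_status", "would_keep_if_latest_only",
]


def summarize_manifest(tsv_text: str) -> dict:
    """Count rows, delete candidates, and would-keep rows in a manifest TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    if reader.fieldnames != HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    summary = {"rows": 0, "delete_candidates": 0, "would_keep": 0}
    for row in reader:
        summary["rows"] += 1
        summary["delete_candidates"] += row["delete_candidate"] == "true"
        summary["would_keep"] += row["would_keep_if_latest_only"] == "true"
    return summary


if __name__ == "__main__":
    reason = "KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT"
    lines = [
        "\t".join(HEADER),
        "\t".join(["context-pack/20260418-023410-514793/DB_MAP.md",
                   "20260418-023410-514793", "DB_MAP.md",
                   "true", reason, "fs+pg", "false"]),
        "\t".join(["context-pack/20260504-040018-372b48/DB_MAP.md",
                   "20260504-040018-372b48", "DB_MAP.md",
                   "true", reason, "fs+pg", "true"]),
    ]
    print(summarize_manifest("\n".join(lines) + "\n"))
```

Against the full file, the expected totals per this report are 1175 rows, 1175 delete candidates, and 9 would-keep rows.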

Top 20 (oldest)

| document_id | build_id | filename | delete_candidate | reason | recovery_status | would_keep_if_latest_only |
| --- | --- | --- | --- | --- | --- | --- |
| context-pack/20260418-023221-0bfb97/ARCHITECTURE.mmd | 20260418-023221-0bfb97 | ARCHITECTURE.mmd | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/DB_MAP.md | 20260418-023221-0bfb97 | DB_MAP.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/DOT_REGISTRY.md | 20260418-023221-0bfb97 | DOT_REGISTRY.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/ENTITIES_OVERVIEW.md | 20260418-023221-0bfb97 | ENTITIES_OVERVIEW.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/LAWS_INDEX.md | 20260418-023221-0bfb97 | LAWS_INDEX.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/PROJECT_MAP.md | 20260418-023221-0bfb97 | PROJECT_MAP.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/RED_ZONES.md | 20260418-023221-0bfb97 | RED_ZONES.md | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023221-0bfb97/project-map.json | 20260418-023221-0bfb97 | project-map.json | true | KB mirror reader removed; NO recovery basis (FS+PG both missing) — flagged | missing_recovery | false |
| context-pack/20260418-023410-514793/ARCHITECTURE.mmd | 20260418-023410-514793 | ARCHITECTURE.mmd | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/DB_MAP.md | 20260418-023410-514793 | DB_MAP.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/DOT_REGISTRY.md | 20260418-023410-514793 | DOT_REGISTRY.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/ENTITIES_OVERVIEW.md | 20260418-023410-514793 | ENTITIES_OVERVIEW.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/LAWS_INDEX.md | 20260418-023410-514793 | LAWS_INDEX.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/PROJECT_MAP.md | 20260418-023410-514793 | PROJECT_MAP.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/RED_ZONES.md | 20260418-023410-514793 | RED_ZONES.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023410-514793/project-map.json | 20260418-023410-514793 | project-map.json | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023547-5e561f/ARCHITECTURE.mmd | 20260418-023547-5e561f | ARCHITECTURE.mmd | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023547-5e561f/DB_MAP.md | 20260418-023547-5e561f | DB_MAP.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023547-5e561f/DOT_REGISTRY.md | 20260418-023547-5e561f | DOT_REGISTRY.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |
| context-pack/20260418-023547-5e561f/ENTITIES_OVERVIEW.md | 20260418-023547-5e561f | ENTITIES_OVERVIEW.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | false |

Latest complete build (would-keep set under Option B)

| document_id | build_id | filename | delete_candidate | reason | recovery_status | would_keep_if_latest_only |
| --- | --- | --- | --- | --- | --- | --- |
| context-pack/20260504-040018-372b48/ARCHITECTURE.mmd | 20260504-040018-372b48 | ARCHITECTURE.mmd | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/DB_MAP.md | 20260504-040018-372b48 | DB_MAP.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/DOT_REGISTRY.md | 20260504-040018-372b48 | DOT_REGISTRY.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/ENTITIES_OVERVIEW.md | 20260504-040018-372b48 | ENTITIES_OVERVIEW.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/LAWS_INDEX.md | 20260504-040018-372b48 | LAWS_INDEX.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/OPS_CODE_INVENTORY.md | 20260504-040018-372b48 | OPS_CODE_INVENTORY.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/PROJECT_MAP.md | 20260504-040018-372b48 | PROJECT_MAP.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/RED_ZONES.md | 20260504-040018-372b48 | RED_ZONES.md | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |
| context-pack/20260504-040018-372b48/project-map.json | 20260504-040018-372b48 | project-map.json | true | KB mirror reader removed by 20B-P1; superseded by FS+PG SSOT | fs+pg | true |

Bottom 20 (latest — including the 1-file partial tail)

(Abbreviated here; see the VPS TSV for the full rows. All rows have delete_candidate=true, and would_keep_if_latest_only=false except for the 9 rows of the latest complete build listed above.)

context-pack/20260504-010008-76424e/project-map.json                       fs+pg
context-pack/20260504-040018-372b48/{9 files}                              fs+pg   ← latest complete
context-pack/20260504-070013-022f97/PROJECT_MAP.md                         fs_only
context-pack/20260504-100013-746ffe/PROJECT_MAP.md                         fs_only
context-pack/20260504-130008-0bc5fd/PROJECT_MAP.md                         fs_only
context-pack/20260504-160010-68f078/PROJECT_MAP.md                         fs_only
context-pack/20260504-190009-67ddf6/PROJECT_MAP.md                         fs_only
context-pack/20260504-220014-23d358/PROJECT_MAP.md                         fs_only
context-pack/20260505-010009-231788/PROJECT_MAP.md                         fs_only
context-pack/20260505-040008-32d221/PROJECT_MAP.md                         fs_only
context-pack/20260505-070008-4860b7/PROJECT_MAP.md                         fs_only
context-pack/20260505-100008-4277ba/PROJECT_MAP.md                         fs_only   ← newest doc, post-baseline drift +1

§5. Validation results

| Check | Result |
| --- | --- |
| All document_id start with context-pack/ | PASS (1175/1175) |
| Count vs 1174 baseline | +1 drift explained (1 new 1-file build between baseline and P1 patch) |
| Recovery coverage by build (fs+pg) | 131 / 142 = 92.3% |
| Recovery coverage by doc (fs+pg) | 1157 / 1175 = 98.5% |
| missing_recovery builds | 1 (20260418-023221-0bfb97, 8 docs), flagged |
| fs_only builds | 10 (1 doc each, partial-build pattern), recoverable from FS alone |
| Non-context-pack docs in list | 0 |
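The ID-shape check and the two coverage percentages can be reproduced with a short sketch. The build ID pattern (8 digits, 6 digits, 6 lowercase hex characters) is an assumption inferred from the IDs shown in this report, not a documented format, and the function names are hypothetical.

```python
import re

# Assumed shape: context-pack/<YYYYMMDD-HHMMSS-6 hex>/<filename>
DOC_ID_RE = re.compile(r"^context-pack/(\d{8}-\d{6}-[0-9a-f]{6})/([^/]+)$")


def invalid_doc_ids(doc_ids):
    """Return the IDs that do not match the expected shape (empty list = PASS)."""
    return [d for d in doc_ids if not DOC_ID_RE.match(d)]


def coverage_pct(covered: int, total: int) -> float:
    """Coverage as a percentage rounded to one decimal place."""
    return round(100.0 * covered / total, 1)


if __name__ == "__main__":
    ids = [
        "context-pack/20260418-023221-0bfb97/DB_MAP.md",
        "context-pack/20260505-100008-4277ba/PROJECT_MAP.md",
    ]
    print(invalid_doc_ids(ids))     # an empty list means every ID matched
    print(coverage_pct(131, 142))   # by build
    print(coverage_pct(1157, 1175)) # by doc
```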

§6. Decision — options for GPT/User

Option A — Delete ALL 1175 docs

  • Coverage: 1157/1175 docs have fs+pg recovery; 10 have fs_only (still recoverable); 8 have missing_recovery.
  • Risk: the 8 missing_recovery docs (20260418-023221-0bfb97) are not recoverable if deleted. These belong to one ~17-day-old build whose FS staging has been purged and which never got a PG manifest row.
  • Mitigation if Option A chosen: before delete, snapshot the 8 missing_recovery docs to a frozen archive path (e.g. archive/context-pack-orphans/20260418-023221-0bfb97/*) so content is preserved out-of-band of the search index.
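The archive mitigation can be planned without touching any data. The sketch below only computes source-to-destination pairs; it copies nothing and deletes nothing. The archive root is the example path from the bullet above, and the function name is hypothetical.

```python
from pathlib import PurePosixPath


def archive_plan(doc_ids, archive_root="archive/context-pack-orphans"):
    """Map each KB document_id to its proposed archive path (planning only)."""
    plan = []
    for doc_id in doc_ids:
        # document_id shape: context-pack/<build_id>/<filename>
        _, build_id, filename = doc_id.split("/", 2)
        target = PurePosixPath(archive_root) / build_id / filename
        plan.append((doc_id, str(target)))
    return plan


if __name__ == "__main__":
    orphans = [
        "context-pack/20260418-023221-0bfb97/DB_MAP.md",
        "context-pack/20260418-023221-0bfb97/project-map.json",
    ]
    for source, target in archive_plan(orphans):
        print(source, "->", target)
```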

Option B — Delete all except latest complete build (1166 docs deleted, 9 kept)

  • Keep set: the 9 docs of 20260504-040018-372b48 (max sections = 9, latest among full builds).
  • Caveat: that build has health_status='fail'. If a healthy anchor is preferred, switch the keep-set to the latest complete build with health ∈ (healthy, warn): 20260504-010008-76424e (warn). Same scope (9 docs), better health.
  • Same missing_recovery mitigation as Option A still applies (those 8 are not in the keep-set).
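Option B amounts to partitioning the manifest by a build ID prefix. A dry-run sketch follows, assuming the document IDs are available as a list of strings; the function name is hypothetical and nothing is deleted.

```python
def partition_for_option_b(doc_ids, anchor_build_id):
    """Split document IDs into (keep, delete) lists for a chosen anchor build."""
    prefix = f"context-pack/{anchor_build_id}/"
    keep = [d for d in doc_ids if d.startswith(prefix)]
    delete = [d for d in doc_ids if not d.startswith(prefix)]
    return keep, delete


if __name__ == "__main__":
    doc_ids = [
        "context-pack/20260504-010008-76424e/PROJECT_MAP.md",
        "context-pack/20260504-040018-372b48/PROJECT_MAP.md",
        "context-pack/20260504-070013-022f97/PROJECT_MAP.md",
    ]
    keep, delete = partition_for_option_b(doc_ids, "20260504-010008-76424e")
    print(len(keep), "kept;", len(delete), "proposed for delete")
```

On the full 1175-row manifest, either anchor discussed above should yield 9 kept and 1166 proposed for delete.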

Recommendation

Option B with keep-set switched to 20260504-010008-76424e (latest complete build with health ∈ (healthy, warn)), plus a one-time archive snapshot of the 8 missing_recovery docs from 20260418-023221-0bfb97 before any delete.

Rationale:

  • Keeping a single well-known anchor build in KB lets a future operator validate "rebuild produces same content as KB anchor" without re-enabling 7e.
  • Anchoring to a fail-health build is a footgun; warn is the highest health currently available among complete builds.
  • Archiving the 8 orphan-build docs eliminates the only data-loss risk in this whole exercise at trivial cost.

If GPT/User prefers a clean cut, Option A + archive of the 8 orphans is also acceptable. The 10 fs_only 1-file docs can be deleted unconditionally — FS staging is the SSOT for those.


§7. Artifacts

| Artifact | Location |
| --- | --- |
| Full delete manifest (1175 rows TSV) | vps:/opt/incomex/backups/20b-p2-full-manifest.tsv |
| FS build list | vps:/tmp/fs_builds.txt (141 lines) |
| PG build snapshot (TSV) | vps:/tmp/pg_builds.tsv (131 lines) |

§8. Hard boundaries observed

  • No deleteDocument called.
  • No deindex.
  • No DOT/Đ43 patch.
  • No cron / no service restart.
  • No filesystem cleanup.
  • No build run.

§9. Next

  • 20B-P3 — execute deletes (separate prompt, separate approval). Inputs needed: chosen option (A vs B), chosen anchor build_id (if B), pre-delete archive policy for the 8 missing_recovery docs.
  • Đ43/20C — patch kb_mirror_status semantics + investigate the 1-file partial-build pattern (7e succeeded once then 7f never INSERTed for 10 consecutive builds across May 4–5; root cause likely tied to whatever published the lone PROJECT_MAP.md before the rest of the pipeline aborted).

20B-P2 Report | 2026-05-05 | Dry-run only. No deletes. Awaiting 20B-P3 dispatch.
