KB-5340
Codex audit — Question Catalog + Dependency Map + Operational Risk Gates — 2026-06-15
Revision 1
Codex Audit — Question Catalog + Dependency Map + Operational Risk Gates
Date: 2026-06-15
Scope: Read-only audit of knowledge/dev/laws-new/cau-hoi-khi-tai-cau-truc.md and related drafts.
Status: HOLD
Mutation boundary: No production/DB mutation; no checker/pilot/registry/table/collection/DOT/index/worker creation; no edit to knowledge/dev/laws/ or knowledge/dev/laws-new/.
Progress
- Step 0: Read `.claude/skills/incomex-rules.md`.
- Step 1: Read task and mandatory KB laws in main process.
- Step 2: Read Question Catalog and related laws-new drafts from Agent Data KB.
- Step 3: Compare against current checkout evidence and identify dependency/gate risks.
- Step 4: Complete coverage, reuse, modularity, implementation, operation, and gate audits.
- Step 5: Verify report evidence and finalize.
Mandatory Sources Read
- `.claude/skills/incomex-rules.md`: 36 rules / 8 steps.
- `knowledge/dev/ssot/operating-rules.md`: v7.58, via `search_knowledge("operating rules SSOT")`.
- `knowledge/dev/laws/constitution.md`: current KB result v4.6.3, via `search_knowledge("hiến pháp v4.0 constitution")`.
- `knowledge/dev/laws-new/cau-hoi-khi-tai-cau-truc.md`: KB revision 59.
- `knowledge/dev/laws-new/de-bai-cai-tien.md`: KB revision 33.
- `knowledge/dev/laws-new/required-stamps.v0.1.json`: KB revision 6.
- `knowledge/dev/laws-new/promote-checker-v0.1-spec.md`: KB revision 11.
- `knowledge/dev/laws-new/matrix-stamp-governance-addendum.md`: KB revision 14.
- `knowledge/dev/laws-new/matrix-refactor-quick-rules.md`: KB revision 8.
- `knowledge/dev/laws-new/matrix-refactor-implementation-plan.md`: KB revision 5.
- `knowledge/dev/laws-new/roadmap-cai-tien.md`: KB revision 1.
Evidence Snapshot
- Current checkout does not contain the referenced `knowledge/dev/laws-new/*` files. Attempts to `wc`/`rg` those paths returned `No such file or directory`; the drafts were read from Agent Data KB.
- Current checkout contains substantial reusable substrate evidence: `meta_catalog`, `collection_registry`, `dot_tools`, `birth_registry`, `universal_edges`, `system_issues`, counting/integrity scanners, Directus snapshot, and smoke tests.
- The draft set contains internal inconsistencies that the survey must resolve before design:
- Quick Rules S3 says every stamp is an `inspect_*` column on `birth_registry`, while Addendum §2b and required-stamps split pre-promote stamps into staging metadata.
- Quick Rules rule 17 includes DOT check inside IO Contract, while `de-bai-cai-tien.md` defines the minimal five-field IO boundary and explicitly keeps DOT/evidence outside when they cause bloat.
- Drafts use a six-level assembly model while the mandatory skill/constitution baseline describes seven composition levels; the catalog recognizes this as a conflict but does not make document-authority resolution an explicit pre-survey gate.
Three Declarations
- Permanent: The audit recommends fixing decision/gate wording, source authority, and per-slice dependencies so future surveys cannot recreate a parallel architecture.
- Cannot be mistaken: New-creation and phase-transition decisions must be blocked by explicit evidence plus Owner decision records, not inferred from `ANSWERED` status alone.
- 100% automatic: Not applicable to this read-only audit. For future operation, the catalog must require machine-detectable freshness, liveness, concurrency, and cleanup evidence before pilot.
STATUS: HOLD
EXECUTIVE SUMMARY
- The Question Catalog has strong coverage and the right stated philosophy: reuse-first, workspace-first, delete-fast, minimal IO boundary, checker verdict-only, atomic promote, scanner/report separated from auto-fix, and explicit operational risks.
- It is not ready for Owner approval yet because its current gate model can itself create the next monster: approximately 318 questions and about 47 blockers are treated too globally. L0 can block all later work rather than only blocking new design/creation for the selected pilot slice.
- It is not ready for an uncontrolled next survey. A tightly scoped read-only survey can start only after five wording-level fixes: source authority/version pin, vertical-slice gate scope, explicit Owner decision checkpoints, earlier placement of operational gates, and runtime liveness/config-delivery questions.
- The most serious implementation risk is not missing architecture. It is agents building from a draft revision that differs from the Owner-approved revision. The referenced `laws-new` files exist in Agent Data KB but are absent from this checkout.
- The most serious operational risk is not merely a long transaction. It is a system that produces a valid-looking stale verdict while the config source, scheduler, scanner, or cleanup path is unavailable or drifted.
- Existing assets should be treated as the default path. Checkout evidence shows mature reusable patterns around `meta_catalog`, `collection_registry`, `dot_tools`, `birth_registry`, `universal_edges`, and `system_issues`. The survey must prove the exact live capability before proposing anything new.
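The "valid-looking stale verdict" risk can be made concrete with a small sketch. It assumes a verdict carries the bindings the audit mentions elsewhere (`packet_hash`, `config_hash`, `rule_version`) plus an issue time; the `Verdict` class, the `verdict_is_usable` function, and the `max_age_s` parameter are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    # Hypothetical shape: only packet_hash, config_hash and rule_version
    # are terms taken from the audit; the rest is illustrative.
    status: str
    packet_hash: str
    config_hash: str
    rule_version: str
    issued_at: float  # seconds since epoch


def verdict_is_usable(verdict, *, packet_hash, config_hash, rule_version,
                      now, max_age_s):
    """Return (usable, reason); any mismatch or expiry fails closed."""
    if verdict.status != "PROMOTE_OK":
        return False, "verdict is not PROMOTE_OK"
    if verdict.packet_hash != packet_hash:
        return False, "packet changed since verdict"
    if verdict.config_hash != config_hash:
        return False, "config changed since verdict"
    if verdict.rule_version != rule_version:
        return False, "rule version changed since verdict"
    if now - verdict.issued_at > max_age_s:
        return False, "verdict older than allowed age"
    return True, "ok"
```

The point of the sketch is that a stored `PROMOTE_OK` is never trusted by itself: it is re-checked against the current packet, config, and rule version at the moment of use.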
COVERAGE AUDIT
| Area | Verdict | Missing / issue | Minimal patch if needed |
|---|---|---|---|
| Reuse Baseline | covered | L0 is globally blocking and can force a full-system survey before a small pilot | Add: “L0 blocks new design/creation only for the selected pilot slice and its direct dependencies; unrelated groups may remain DEFER.” |
| Birth / minimal canonical identity | covered | Birth-fold gates are not scoped to atomic promote | State that birth-fold blocks atomic promote/pilot promote, not read-only checker design or workspace-only pilot. |
| Governance as relationship/state information | covered | Drafts still contain wording that can make each governance fragment a DOT+stamp subsystem | Add one question: “Can this governance fact remain metadata/edge/state without a new DOT or stamp?” |
| Registries / existing SSOT reuse | covered | No explicit authority check for KB draft versus checkout/runtime source | Add source-authority/revision/hash question before survey. |
| Staging / candidate store | covered | Actual staging substrate is still unverified; local checkout has no `iu_staging_*` evidence | Keep as survey priority; add proof of authoritative environment/source. |
| Collections / layer stores | covered | Risk of standardizing every layer store before the first slice | Add question asking which existing store is sufficient for the selected pilot slice and what can be deferred. |
| Matrix cell / cell_id | covered | Six-level draft model conflicts with the seven composition levels in the mandatory skill/constitution baseline | Add explicit authority-resolution question before accepting cell_id. |
| Formula / procedure | covered | Formula work can drift into a new formula registry | Add explicit “existing object/metadata/plain artifact first; no formula registry by default.” |
| IO Contract minimal v0.1 | partially covered | Cross-draft conflict: minimal five-field boundary versus Quick Rules rule 17 / plan schema adding DOT/check/owner fields | Add one invariant question fixing the minimal boundary and separating optional evidence/check fields. |
| Stamp lifecycle | covered | Pre/post store split is good, but Quick Rules S3 contradicts it | Add cross-document consistency blocker. |
| Stamp governance declaration without bloat | partially covered | “one stamp/one inspector/one DOT” language can create column and DOT explosion | Add default: no new core stamp/column/DOT unless an existing core stamp cannot carry the evidence. |
| DOT capability without bloat | partially covered | DOT-CAP can become a new capability-governance system | Ask only for the minimum contract of each reused/target DOT; explicitly forbid a new DOT-capability registry/system in v0.1. |
| Atomic Promote | partially covered | Covers all-or-nothing and negative states, but not enough on crash recovery, retries, bypass privileges, and outbox consistency | Add four operational questions listed below. |
| Scanner/report | partially covered | Correctly separated from auto-fix, but L7 is not a hard pilot gate and liveness/freshness is missing | Require a minimal scoped detector plus heartbeat/freshness evidence before pilot. |
| Pilot cell | covered | Forbidden surfaces are present; missing hard scope budget and stop/abort criteria | Add pilot budget question: touched surfaces, maximum new artifacts, duration, rollback trigger. |
| Order / dependency | partially covered | L5b is too late and too monolithic; L0 is too global | Attach each risk question to the earliest affected layer and the selected pilot slice. |
| Operational Risk Gates | partially covered | Strong on locks/index/stale/cleanup/cell context; missing runtime availability, bypass, crash/outbox, liveness, retention/backpressure | Add the small question set in Operational Risk Audit. |
DEPENDENCY MAP AUDIT
- Verdict: HOLD. The conceptual order is mostly sound, but the gate topology is too coarse for the stated fastest-path objective.
- Hidden dependencies:
- Exact authoritative draft revision/hash and how runtime/config consumers receive it.
- Resolution of cross-document contradictions before survey answers can be trusted.
- Runtime authorization/bypass surface for checker and atomic promote.
- Coexistence/disable path with existing production behavior.
- Heartbeat/freshness of scanner, scheduler, cleanup, and config source.
- Crash/retry/outbox behavior after or around transaction commit.
- Ordering concerns:
- `RISK-CELL-*` belongs with L2/L3, before IO/validation/stamp conclusions.
- Stale/config binding belongs with L3/L5, before a checker verdict can be meaningful.
- Query-path/index and transaction-boundary questions belong during L4/L5 design, not only after checker.
- Cleanup/external artifact questions belong with L1 staging and before workspace delete-fast proof.
- A minimal scoped scanner/heartbeat must block L8 pilot; it cannot remain merely a "should have" ("nên có").
- L5b assessment: Correctly blocks atomic promote rehearsal and pilot, but it is positioned as one late checkpoint. Convert it into risk gates attached to their earliest dependency, with a final L5b roll-up before rehearsal/pilot.
- Jump risk: The roadmap says “docs fixed → system check” and the implementation plan already proposes pilots, while the new Group R and catalog are not approved. Add an explicit rule that roadmap phase labels do not authorize survey/design/pilot transitions.
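The proposed change from global blocking to per-slice blocking can be sketched as follows. This is a minimal illustration under stated assumptions: the group names (`"staging"`, `"cell"`, `"birth"`), the slice name, and the mapping of question IDs to groups are hypothetical, and the scope here is the slice plus its direct dependencies only (a transitive closure would be a different design choice).

```python
def slice_scope(pilot_slice, depends_on):
    """Scope = the selected pilot slice plus its direct dependencies."""
    return {pilot_slice} | set(depends_on.get(pilot_slice, ()))


def split_blockers(blocker_groups, scope):
    """Split blocker question IDs into (blocking, deferred) for one slice.

    blocker_groups maps a question ID to the group it belongs to.
    """
    blocking = sorted(q for q, g in blocker_groups.items() if g in scope)
    deferred = sorted(q for q, g in blocker_groups.items() if g not in scope)
    return blocking, deferred


# Hypothetical example data; the real grouping lives in the catalog.
depends_on = {"pilot-cell": ["staging", "cell"]}
blocker_groups = {
    "STG-015": "staging",
    "RISK-CELL-001": "cell",
    "BIRTH-013": "birth",
}
scope = slice_scope("pilot-cell", depends_on)
blocking, deferred = split_blockers(blocker_groups, scope)
```

With this example data, only the staging and cell blockers gate the slice, while the birth blocker stays deferred until a slice that depends on it is selected.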
REUSE-FIRST AUDIT
- Verdict: Strong intent, HOLD on enforcement shape.
- Missing reuse questions:
- What is the authoritative live/runtime location and revision of the reused asset?
- Is the asset actually runnable/owned/maintained, or merely present/design-only?
- What existing generic DOT/check can serve multiple pieces without creating one DOT per piece?
- What existing disable/rollback path allows reuse without coupling the pilot to production?
- What is the lifecycle cost of new versus reuse, not only the initial build speed?
- New-creation guardrail assessment: The rule is strong but has two weaknesses:
- Requiring proof that “reuse is slower than new” can justify short-term duplication that becomes long-term debt.
- Requiring all five generic insufficiency proofs for every artifact is not context-sensitive and encourages ceremonial evidence.
- Minimal wording fix: “New is allowed only when the selected pilot slice cannot satisfy a named invariant using an existing live asset, metadata/config, or a thin wrapper; the exception must name owner, retirement/merge path, lifecycle cost, and Owner decision.”
- Reality check from checkout: Reuse candidates are concrete, not theoretical. Local evidence found references across code/scripts/docs for `meta_catalog` (23 files), `collection_registry` (6), `dot_tools` (8), `birth_registry` (4), `universal_edges` (2), and `system_issues` (8). No local evidence was found for `iu_staging_record`, `iu_staging_payload`, `governance_candidate_state`, or `sandbox_tac`; these must not be assumed live from draft prose.
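A reference count of this kind can be reproduced with a short read-only script. The sketch below is an assumption about method, not the command the audit actually ran: it counts files under a checkout root whose text contains each identifier as a plain substring, and the function name `count_referencing_files` is hypothetical.

```python
from pathlib import Path


def count_referencing_files(root, names):
    """Count, per name, how many files under root mention it (substring match)."""
    counts = {name: 0 for name in names}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip, never modify anything
        for name in names:
            if name in text:
                counts[name] += 1
    return counts
```

A count of zero only shows that the name is absent from this checkout; it says nothing about whether the asset exists in another environment, which is why the audit asks for proof of the authoritative source.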
LEGO / MODULARITY AUDIT
- Verdict: PASS_WITH_MINOR_FIXES in philosophy; HOLD in gate execution.
- Remaining monster-system risks:
- The catalog itself becomes a full-system prerequisite rather than a per-slice decision tool.
- “Every piece has DOT/check/stamp” becomes one DOT, one stamp, or one column per piece.
- “100% IO Contract” becomes a contract registry or Module Contract First under another name.
- DOT-CAP becomes a DOT governance/capability subsystem.
- Candidate packet becomes a persistent packet store rather than a projection over existing staging.
- Scanner becomes full-system coverage before a scoped pilot.
- Suggested wording fixes:
- Define “100% IO Contract” as communication across independent piece boundaries only; internal implementation does not require a new contract artifact.
- State “one generic reused DOT may validate many pieces through config; no DOT-per-piece default.”
- State “candidate packet is a view/projection by default, not a new persisted store.”
- State “scanner scope is the pilot slice and direct dependencies first.”
- Add a hard cap for the first slice: no new registry/ledger/workflow; any new table/DOT/stamp requires Owner exception.
IMPLEMENTATION-RISK AUDIT
- Covered risks: building too much before pilot; new DOTs before reuse proof; Module Contract First regression; premature registries/ledgers/workflows; canonical governance in workspace; checker selftest; atomic promote rehearsal; pilot forbidden surfaces; rollback/delete-fast; broad ordering.
- Missing risks:
- `SRC-AUTH-001`: Which exact document revision/hash is authoritative for survey and later runtime? How is mismatch between Agent Data KB, checkout, and deployed/runtime copies detected?
- `SLICE-001`: What is the single pilot slice, its direct dependency closure, and what questions are explicitly out of scope/deferred?
- `DECISION-001`: Which transitions require an explicit Owner decision record? `ANSWERED + Mức 3` (Level 3) must not self-authorize checker/pilot/new creation.
- `COEXIST-001`: Can the new path be disabled without changing or breaking the existing path, and is the existing path still authoritative during the pilot?
- `BUDGET-001`: What is the maximum touched-surface/new-artifact/time budget that triggers stop and simplification?
- Blockers to keep:
- Packet binding/hash before checker verdict.
- Resolved cell context before IO/validation stamps.
- Checker fail-closed selftest before lane.
- Atomic promote negative-state/concurrency rehearsal before pilot promote.
- Pilot forbidden surfaces and delete-fast proof.
- Critical gate correction: Current global rule allows progression when blockers reach `ANSWERED + Mức 3` (Level 3). Require explicit Owner checkpoint for: selecting the pilot slice, approving any new artifact exception, approving checker design start, and opening pilot.
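The gate correction can be read as a two-part check: evidence status first, then authorization. The sketch below assumes a simple in-memory model; `OWNER_GATED`, `OwnerDecision`, `may_transition`, and the transition names are hypothetical labels for the four checkpoints listed above.

```python
from dataclasses import dataclass

# Hypothetical transition names for the four Owner checkpoints.
OWNER_GATED = {
    "select_pilot_slice",
    "approve_new_artifact",
    "start_checker_design",
    "open_pilot",
}


@dataclass(frozen=True)
class OwnerDecision:
    transition: str
    slice_id: str
    approved: bool


def may_transition(transition, slice_id, blocker_status, decisions):
    """Return (allowed, reason). ANSWERED is necessary but not sufficient."""
    unresolved = sorted(q for q, s in blocker_status.items() if s != "ANSWERED")
    if unresolved:
        return False, "unresolved blockers: " + ", ".join(unresolved)
    if transition in OWNER_GATED:
        authorized = any(
            d.transition == transition and d.slice_id == slice_id and d.approved
            for d in decisions
        )
        if not authorized:
            return False, "no Owner decision record for this transition"
    return True, "ok"
```

Keeping the two checks separate means an agent that answers every blocker still cannot open a pilot without a matching decision record for that slice.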
OPERATIONAL-RISK AUDIT
- Covered risks:
- Long transaction / locks / deadlocks and read-only pre-check versus short commit transaction.
- JSONB query paths, missing indexes, and scanner scope.
- Stale verdict/stamp/config drift; `rule_version`, `config_hash`, IO Contract hash.
- Cleanup of `blob_ref`, external artifacts/cache/vector artifacts, event/report decision.
- UNKNOWN/PENDING context, CELL before IO/VALIDATION, IO binding to resolved `cell_id`.
- Missing risks:
- Runtime config/source availability and parse/load failure.
- Privilege/bypass path that can write canonical without checker/atomic promote.
- Crash/retry boundary and consistency of canonical write, consumed status, and outbox/audit.
- Heartbeat/freshness alert when scanner/scheduler/cleanup/checker support is down or stale.
- Retention/cardinality/backpressure limits for packets, verdicts, issues, and outbox.
- Clock/TTL semantics and expiry comparison source.
- Group R assessment: Good foundation, but its 32 questions are not sufficient for operational approval and are attached too late. They are sufficient to start a scoped read-only survey after the five catalog fixes.
- Additional questions, if any:
- `RISK-RUN-001`: If required-stamps/config cannot be loaded, parsed, or version-pinned, does checker fail closed and expose a machine-detectable reason?
- `RISK-AUTH-001`: Which roles/functions can bypass checker/atomic promote and write canonical or consumed state directly?
- `RISK-CRASH-001`: After crash/retry at every commit boundary, can the system produce double-promote, canonical-without-audit, or consumed-without-canonical?
- `RISK-LIVE-001`: How is scanner/scheduler/cleanup/config-source freshness measured, and what blocks pilot if it is stale/down?
- `RISK-CAP-001`: What retention/cardinality/backpressure thresholds bound packets, verdicts, issues, and outbox for the pilot?
- `RISK-TIME-001`: Which clock/source defines TTL and expiry, and how is clock skew handled?
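Two of these questions, RISK-RUN-001 and RISK-LIVE-001, lend themselves to a small sketch of what "fail closed with a machine-detectable reason" and "freshness" could mean in practice. Everything below is an assumption for illustration: the reason codes, the SHA-256 pin, the function names, and the heartbeat map are not taken from the drafts.

```python
import hashlib
import json


def load_required_stamps(raw_bytes, pinned_sha256):
    """Return (config, reason). config is None whenever loading must fail closed."""
    if raw_bytes is None:
        return None, "CONFIG_UNAVAILABLE"
    if hashlib.sha256(raw_bytes).hexdigest() != pinned_sha256:
        return None, "CONFIG_HASH_MISMATCH"
    try:
        config = json.loads(raw_bytes)
    except ValueError:  # covers JSON and UTF-8 decoding errors
        return None, "CONFIG_PARSE_ERROR"
    if not isinstance(config, dict):
        return None, "CONFIG_SHAPE_ERROR"
    return config, "OK"


def stale_components(last_heartbeat, now, max_age_s):
    """Names whose heartbeat is missing or older than max_age_s seconds."""
    return sorted(
        name
        for name, ts in last_heartbeat.items()
        if ts is None or now - ts > max_age_s
    )
```

A pilot gate could then refuse to proceed whenever `load_required_stamps` returns no config or `stale_components` returns a non-empty list, and report the reason code or component names instead of a verdict.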
GATE IMPACT AUDIT
| Question / group | Current classification | Codex view | Reason |
|---|---|---|---|
| L0 Reuse Baseline | Blocks all L1-L8 | Overclassified globally | Must block new design/creation for the selected slice, not unrelated read-only survey work. |
| Birth fold BIRTH-013/014 | BLOCKER | Keep, narrow scope | Blocks atomic promote/pilot promote; should not block workspace-only pilot or read-only checker design. |
| STG-012 cleanup scheduler | BLOCKER | Move/narrow | Blocker before delete-fast pilot, not before checker design/selftest. |
| STG-015 packet_hash | BLOCKER | Keep | Verdict is unsafe without stable binding. |
| CELL species/cell_id blockers | BLOCKER | Keep per slice | IO/validation context is invalid until resolved; do not require global taxonomy completion. |
| STAMP-GOV blockers | BLOCKER | Narrow to high-risk/canonical | General workspace/pilot should not inherit heavy governance. |
| DOT-CAP blockers | BLOCKER | Reduce to target-DOT contract | A generic capability system must not become a prerequisite. |
| Atomic Promote blockers | BLOCKER | Keep | Required before pilot promote; include concurrency/crash/bypass evidence. |
| PILOT-018 forbidden surfaces | BLOCKER | Keep | Directly prevents accidental production/canonical mutation. |
| RISK-AP blockers | BLOCKER | Keep, move earlier | Transaction risks must shape design before rehearsal. |
| RISK-IDX-001..004 | BLOCKER | Keep only for actual pilot query paths | Global index proof would overbuild; scoped query-budget proof is necessary. |
| RISK-STL-001/005/006/007 | BLOCKER | Keep, move earlier | A stale PROMOTE_OK invalidates checker safety. |
| RISK-GC-001..004 | REQUIRED | Conditional BLOCKER before delete-fast pilot | If external artifacts exist, delete-fast is unproven without cleanup behavior. |
| RISK-CELL-001..004 | BLOCKER | Keep, move to L2/L3 | Context must be valid before IO/validation/checker. |
| Scanner/report liveness | Not a hard gate | Underclassified | Pilot operation is unsafe if the detector can silently stop. |
| Owner transition decision | Missing | Add BLOCKER | ANSWERED is evidence status, not authorization. |
| Source authority/revision pin | Missing | Add BLOCKER before survey | Surveying or implementing against different draft revisions creates immediate drift. |
ANTI-BLOAT AUDIT
| Location | Risk | Suggested wording-level fix |
|---|---|---|
| Dependency Map L0 | Full catalog becomes prerequisite monster | Scope L0 and blockers to selected pilot slice + direct dependencies. |
| §2c new-creation rule | “New is faster” rationalizes duplicate architecture | Compare lifecycle cost and require owner exception + retirement/merge path. |
| De-bai Lego Protocol | Every piece gets its own DOT/stamp | “Reuse generic DOT + config first; no DOT/stamp per piece by default.” |
| Quick Rules S3 / Addendum stamp wording | `inspect_*` column and DOT explosion; contradicts pre-promote store split | Limit `inspect_*` to proven canonical/post-promote needs; pre-promote uses existing staging metadata. |
| IO Contract wording across drafts | Minimal IO becomes Module Contract First | Freeze the five-field boundary; checks/evidence/owner are references or optional overlays, not mandatory contract expansion. |
| Candidate packet spec | Packet store/ledger appears | State packet is projection/view over existing staging by default; persistence requires reuse-insufficiency proof. |
| Formula group | Formula registry appears | State formulas remain ordinary governed objects/metadata unless reuse survey proves otherwise. |
| Governance modules | Registry/workflow per governance fact | State modules are information fields/edges/states, not deployable subsystems by default. |
| DOT-CAP | DOT capability/governance system appears | Limit to per-target-DOT minimum declaration; prohibit new registry/system in v0.1. |
| Scanner group | Full-system scanner/auto-fix appears | Scope to report-only pilot slice first; no auto-fix; reuse existing detector/query. |
| Domain/cell work | Ontology/domain tree appears | Keep domain as controlled tag for v0.1; unresolved items remain workspace-only. |
FINAL RECOMMENDATION
- Ready for Owner approval? no
- Ready for read-only survey? conditional — only a tightly scoped survey after the five wording fixes below
- Need another Codex audit after Owner approval? yes — after the L0/L1 scoped survey and before checker design/start
- Top 5 fixes before survey:
- Add source-authority/revision/hash gate covering Agent Data KB, checkout, and runtime/config location.
- Change global L0/L5b blocking into selected-pilot-slice + direct-dependency blocking.
- Add explicit Owner decision checkpoints; `ANSWERED + Mức 3` (Level 3) is not authorization.
- Move risk questions to their earliest dependency and make minimal scanner/heartbeat a pilot gate.
- Resolve/flag cross-draft contradictions: pre-promote stamp store, IO Contract boundary, and six-versus-seven composition levels.
- Top 5 risks during survey:
- Treating draft/KB claims as live production facts.
- Surveying the whole system instead of one pilot dependency closure.
- Turning each governance fact into a new DOT/stamp/column/registry.
- Using short-term “faster to create new” as justification for permanent duplicate architecture.
- Declaring operational readiness without liveness, bypass, crash/retry, and config-delivery evidence.
DO NOT IMPLEMENT
- Confirmed: no production mutation, no DB mutation, no checker implementation, no pilot, no registry/table/collection/DOT/index/worker creation, and no edits to `knowledge/dev/laws/` or `knowledge/dev/laws-new/`.
Process Evidence (Steps 0–6)
- Step 0 — Foundation: Read skill, OR v7.58, Constitution current KB v4.6.3, Assembly First/DOT/scanner rules.
- Step 1 — Receive: One mission only: audit catalog quality and two risk classes.
- Step 2 — Think/design: N/A for feature design; this is a read-only audit. Evaluated objective, method, prerequisites, and dependency roadmap.
- Step 3 — Code: No implementation. Only this required audit report was created.
- Step 4 — Two hats: N/A; no code/deployment. Self-review performed against the requested ten audit sections.
- Step 5 — Verify: No production verification permitted by task. Read-only evidence pasted below.
- Step 6 — Report/update: Report created here. OR/TD/handoff not updated because this audit does not change enacted operations, technical debt state, or implementation state.
Read-only output evidence
$ wc -l knowledge/dev/laws-new/cau-hoi-khi-tai-cau-truc.md ...
wc: knowledge/dev/laws-new/cau-hoi-khi-tai-cau-truc.md: open: No such file or directory
...
0 total
checkout reference counts:
birth_registry 4
collection_registry 6
meta_catalog 23
dot_tools 8
universal_edges 2
canonical_fields 0
system_issues 8
event_outbox 0
registry_changelog 0
iu_staging_record 0
iu_staging_payload 0
governance_candidate_state 0
sandbox_tac 0
Agent Data full-document revisions read:
cau-hoi-khi-tai-cau-truc.md revision 59
de-bai-cai-tien.md revision 33
required-stamps.v0.1.json revision 6
promote-checker-v0.1-spec.md revision 11
matrix-stamp-governance-addendum.md revision 14
matrix-refactor-quick-rules.md revision 8
matrix-refactor-implementation-plan.md revision 5
roadmap-cai-tien.md revision 1
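The revisions listed above are the natural input for the source-authority gate the audit recommends. As a minimal sketch, assuming revisions can be read as integers from each location (Agent Data KB, checkout, runtime), a pin check could look like this; `PINNED_REVISIONS` copies the values above, while `revision_mismatches` and the `observed` mapping are hypothetical.

```python
# Revisions read during this audit, copied from the evidence list.
PINNED_REVISIONS = {
    "cau-hoi-khi-tai-cau-truc.md": 59,
    "de-bai-cai-tien.md": 33,
    "required-stamps.v0.1.json": 6,
    "promote-checker-v0.1-spec.md": 11,
    "matrix-stamp-governance-addendum.md": 14,
    "matrix-refactor-quick-rules.md": 8,
    "matrix-refactor-implementation-plan.md": 5,
    "roadmap-cai-tien.md": 1,
}


def revision_mismatches(pinned, observed):
    """List human-readable differences between pinned and observed revisions."""
    problems = []
    for name, revision in sorted(pinned.items()):
        seen = observed.get(name)
        if seen is None:
            problems.append(f"{name}: missing (pinned {revision})")
        elif seen != revision:
            problems.append(f"{name}: observed {seen}, pinned {revision}")
    return problems
```

An empty result would mean the surveyed copy matches the pinned set; any entry in the list is a reason to stop before surveying or designing against that copy.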