KB-481E
Điều 39 unit_kind — Root Cause Classification (primary: manifest-to-IU contract gap)
Revision 1
iu-cut · dieu39 · root-cause · contract-gap · verify-mark · 2026-05-27
02 — Root Cause Classification
Symptom
CUT for Điều 39 (Article 39; cut_request_id=146f1520-...) failed inside fn_iu_create because all 16 manifest pieces declared unit_kind=law_section, whereas the live fn_iu_create accepts only values registered as dot_config keys under the prefix vocab.unit_kind. (currently design_doc_section and law_unit).
Effect: cut_request rolled back to mark_verified, cut_run_id=NULL, 0 IUs created.
Question matrix
| # | Question | Answer |
|---|---|---|
| 1 | Where is unit_kind generated? | At MARK call time, by the operator/Codex/agent that authored the per-piece p_pieces jsonb array. There is no code path in fn_cut_mark_staged_file, fn_iu_op_mark_file, or the downstream MARK chain that fabricates unit_kind: it is supplied verbatim by the caller and passed unchanged into the manifest payload. |
| 2 | Why did MARK produce law_section? | Because the caller emitted that string. The MARK gate (cut_manifest_piece_schema_v1) validated 6 fields but did NOT check unit_kind at all, so law_section slipped through untouched. |
| 3 | Is law_section a valid semantic label but invalid runtime unit_kind? | Yes. It is semantically meaningful ("section within a law document") but is not registered in runtime vocab. Runtime accepts only design_doc_section and law_unit. |
| 4 | Where is live unit_kind vocab governed? | Data-side: dot_config rows with key prefix vocab.unit_kind. (trailing dot included). No tac_unit_kind_vocab FK table exists (unlike tac_section_type_vocab). No information_unit FK enforces it either. fn_iu_create consults this prefix via fn_iu_resolve_default(p_unit_kind, 'iu_create.default_unit_kind', 'vocab.unit_kind.'). |
| 5 | Should law_section map to law_unit, or should live vocab include law_section? | Neither. Per operator decision (strict-refusal strategy): callers MUST emit a runtime value directly. No semantic mapping (would be silent rewrite, lossy). No vocab widen (would pollute runtime vocab with semantic synonyms). |
| 6 | Why did VERIFY_MARK not catch this before approval? | fn_iu_verify_mark only validated 3 axes (positions / piece_role+section_type / parent_local_id). It did not validate unit_kind against the runtime vocab. The manifest reached approved lifecycle with a value that CUT would later refuse: a contract gap. |
| 7 | Is this a contract gap, vocab governance gap, MARK generator bug, or fn_iu_create vocab bug? | Primary = contract gap between MARK schema and CUT vocab. Secondary = VERIFY_MARK missing runtime vocab check. Tertiary = generator (operator/Codex) emitted semantic label. fn_iu_create is correct: it refuses unknown vocab as designed. |
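The asymmetry in rows 2, 4 and 6 can be modelled in a few lines. This is a minimal Python sketch, not the body of fn_iu_create: the DOT_CONFIG dict, the row values, and both helper functions are hypothetical stand-ins. Only the key prefix, the default-kind key name, and the two runtime values come from this article.

```python
# Hypothetical in-memory stand-in for dot_config rows (key -> value).
# Row values are assumptions; only the key names follow this article.
DOT_CONFIG = {
    "vocab.unit_kind.design_doc_section": "enabled",
    "vocab.unit_kind.law_unit": "enabled",
    "iu_create.default_unit_kind": "design_doc_section",  # assumed value
}

VOCAB_PREFIX = "vocab.unit_kind."


def runtime_vocab(config, prefix=VOCAB_PREFIX):
    """Return the set of values registered under a key prefix."""
    return {key[len(prefix):] for key in config if key.startswith(prefix)}


def cut_accepts(unit_kind, config=DOT_CONFIG):
    """Model of the strict CUT-side check: only registered values pass."""
    return unit_kind in runtime_vocab(config)


print(sorted(runtime_vocab(DOT_CONFIG)))  # ['design_doc_section', 'law_unit']
print(cut_accepts("law_section"))         # False
print(cut_accepts("law_unit"))            # True
```

Because the vocab is derived from data rows rather than from a foreign key, any gate that skips this lookup will accept values that the CUT step later refuses.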
Classification
root_cause:
  primary:
    class: manifest_to_IU_contract_gap
    detail: |
      cut_manifest_piece_schema_v1 (enforced in fn_cut_mark_staged_file) validates
      6 piece fields but NOT piece.unit_kind. Manifests with piece.unit_kind not in live
      dot_config vocab.unit_kind.* can pass MARK and reach mark_verified/approved,
      but CUT (fn_iu_create) refuses them at apply time. Pipeline asymmetry: MARK
      schema is permissive, CUT vocab is strict.
  secondary:
    class: verify_mark_missing_runtime_vocab_check
    detail: |
      fn_iu_verify_mark axes A/B/C don't cover unit_kind. Approval can be granted
      even when CUT would reject. VERIFY_MARK is the last gate before operator
      approval; it should refuse any manifest whose CUT would fail at vocab.
  tertiary:
    class: generator_emitted_semantic_label
    detail: |
      The MARK caller (Codex/operator) emitted unit_kind=law_section instead of
      runtime unit_kind=law_unit. This is a caller-side data-quality issue, but
      it is ENABLED by the contract gap above; without the gap, callers would be
      forced to emit valid values upfront.
decision:
  fix_strategy: strict_refusal_at_mark_boundary_plus_verify_mark_axis_d
  reason: |
    Mirrors the existing section_type validation pattern (already in
    fn_cut_mark_staged_file). No alias body change (5/5 MARK/CUT alias md5s
    remain pinned). No silent semantic mapping (preserves caller intent and
    avoids lossy rewrites of the stored manifest). No vocab widen (operator
    directive: callers must use runtime values). Defense in depth: refuse at
    MARK (boundary) AND at VERIFY_MARK (last gate before approval).
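The defense-in-depth decision can be pictured as one shared check called from two gates. The sketch below is an illustration in Python under stated assumptions, not the actual fix: check_unit_kind, mark_gate and verify_mark are hypothetical names, axes A/B/C are collapsed into a single flag, and a missing unit_kind is assumed to count as invalid.

```python
# Runtime values named in this article; a real gate would read dot_config instead.
RUNTIME_UNIT_KINDS = {"design_doc_section", "law_unit"}


def check_unit_kind(pieces, vocab=RUNTIME_UNIT_KINDS):
    """Return indexes of pieces whose unit_kind is missing or not in the vocab."""
    return [i for i, piece in enumerate(pieces)
            if piece.get("unit_kind") not in vocab]


def mark_gate(pieces):
    """Boundary gate (hypothetical): refuse the whole manifest if any piece is invalid."""
    bad = check_unit_kind(pieces)
    if bad:
        raise ValueError(f"MARK refused: invalid unit_kind at pieces {bad}")
    return pieces


def verify_mark(pieces, axes_abc_ok=True):
    """Last gate before approval: existing axes (one flag here) plus axis D."""
    axis_d_ok = not check_unit_kind(pieces)
    return {
        "axes_abc": axes_abc_ok,
        "axis_d_unit_kind": axis_d_ok,
        "approvable": axes_abc_ok and axis_d_ok,
    }


# The failing manifest from this incident: 16 pieces, all law_section.
manifest = [{"unit_kind": "law_section"} for _ in range(16)]
print(verify_mark(manifest)["approvable"])  # False
```

Running the same check at both gates means a manifest staged before the MARK-side check existed would still be stopped at VERIFY_MARK rather than at CUT.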
Why not the alternatives
| Alternative | Why rejected |
|---|---|
| Widen vocab.unit_kind.law_section | Pollutes runtime vocab with semantic synonyms; user directive: Do NOT widen fn_iu_create vocab. |
| Add mapping.unit_kind.law_section = law_unit | Silent rewrite; lossy (manifest forensics would show law_unit even though caller said law_section). User directive: no silent mapping. |
| Modify fn_iu_cut_from_manifest to resolve unit_kind | Alias body governance-pinned (md5 c5d556bc...); cannot change without contract review. |
| Modify generator only | Doesn't close the contract gap: future callers could still emit invalid values. |
Implication for future MARK callers
Every piece must carry exactly one runtime unit_kind value, chosen by document type:
{ "unit_kind": "law_unit" } // for law documents
{ "unit_kind": "design_doc_section" } // for design docs
Any other value causes hard refusal at fn_cut_mark_staged_file with the indexed error pattern:
piece[N].unit_kind <V> not in dot_config vocab.unit_kind.* (cut_manifest_piece_schema_v1)
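Callers may want to run the same check before submitting a manifest. The following is a rough caller-side preflight in Python that reproduces the message pattern quoted above. The function name is hypothetical, and two details are assumptions because the article does not state them: that N is a 0-based index and that <V> is rendered as the bare value.

```python
def unit_kind_errors(pieces, vocab):
    """Build one message per offending piece, following the quoted pattern.

    Assumes a 0-based piece index and the bare value in place of <V>.
    """
    return [
        f"piece[{i}].unit_kind {piece.get('unit_kind')} not in dot_config "
        "vocab.unit_kind.* (cut_manifest_piece_schema_v1)"
        for i, piece in enumerate(pieces)
        if piece.get("unit_kind") not in vocab
    ]


pieces = [{"unit_kind": "law_unit"}, {"unit_kind": "law_section"}]
errors = unit_kind_errors(pieces, {"design_doc_section", "law_unit"})
for message in errors:
    print(message)
# piece[1].unit_kind law_section not in dot_config vocab.unit_kind.* (cut_manifest_piece_schema_v1)
```

A preflight like this only shortens the feedback loop; the refusal at fn_cut_mark_staged_file remains the authoritative check.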