KB-481E

Điều 39 unit_kind — Root Cause Classification (primary: manifest-to-IU contract gap)

Revision 1 · 2026-05-27
Tags: iu-cut, dieu39, root-cause, contract-gap, verify-mark

02 — Root Cause Classification

Symptom

CUT for Điều 39 (cut_request_id=146f1520-...) failed inside fn_iu_create because all 16 manifest pieces declared unit_kind=law_section, whereas live fn_iu_create accepts only values registered as dot_config keys under the vocab.unit_kind.* prefix (currently design_doc_section and law_unit).

Effect: cut_request rolled back to mark_verified, cut_run_id=NULL, 0 IUs created.
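
The vocab lookup behind this refusal can be sketched as follows. This is a minimal illustration, assuming dot_config can be modeled as a key/value mapping; `DOT_CONFIG`, `allowed_unit_kinds`, and `check_unit_kind` are hypothetical names, and only the key prefix and the two registered values come from the text above.

```python
# Hypothetical in-memory stand-in for dot_config rows. The row values are
# placeholders; only the key names reflect the vocab described in the text.
DOT_CONFIG = {
    "vocab.unit_kind.design_doc_section": None,
    "vocab.unit_kind.law_unit": None,
}

VOCAB_PREFIX = "vocab.unit_kind."


def allowed_unit_kinds(config):
    """Return the unit_kind values registered under the vocab prefix."""
    return {key[len(VOCAB_PREFIX):] for key in config if key.startswith(VOCAB_PREFIX)}


def check_unit_kind(unit_kind, config):
    """Raise ValueError when unit_kind is not registered (strict refusal)."""
    if unit_kind not in allowed_unit_kinds(config):
        raise ValueError(f"unit_kind {unit_kind} not in dot_config {VOCAB_PREFIX}*")
    return unit_kind
```

Under these assumptions `check_unit_kind("law_unit", DOT_CONFIG)` returns the value unchanged, while `check_unit_kind("law_section", DOT_CONFIG)` raises, which corresponds to the refusal seen during CUT.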

Question matrix

| # | Question | Answer |
|---|---|---|
| 1 | Where is unit_kind generated? | At MARK call time, by the operator/Codex/agent that authored the per-piece p_pieces jsonb array. No code path in fn_cut_mark_staged_file, fn_iu_op_mark_file, or the downstream MARK chain fabricates unit_kind; it is supplied by the caller and passed unchanged into the manifest payload. |
| 2 | Why did MARK produce law_section? | Because the caller emitted that string. The MARK gate (cut_manifest_piece_schema_v1) validated 6 fields but did NOT check unit_kind at all, so law_section passed through untouched. |
| 3 | Is law_section a valid semantic label but invalid runtime unit_kind? | Yes. It is semantically meaningful ("section within a law document") but is not registered in the runtime vocab. Runtime accepts only design_doc_section and law_unit. |
| 4 | Where is live unit_kind vocab governed? | Data-side: dot_config rows whose key starts with the prefix vocab.unit_kind.* . No tac_unit_kind_vocab FK table exists (unlike tac_section_type_vocab), and no information_unit FK enforces it either. fn_iu_create consults this prefix via fn_iu_resolve_default(p_unit_kind, 'iu_create.default_unit_kind', 'vocab.unit_kind.'). |
| 5 | Should law_section map to law_unit, or should live vocab include law_section? | Neither. Per operator decision (strict-refusal strategy), callers MUST emit a runtime value directly. No semantic mapping (it would be a silent, lossy rewrite). No vocab widening (it would pollute the runtime vocab with semantic synonyms). |
| 6 | Why did VERIFY_MARK not catch this before approval? | fn_iu_verify_mark validated only 3 axes (positions / piece_role+section_type / parent_local_id). It did not validate unit_kind against the runtime vocab, so the manifest reached the approved lifecycle with a value that CUT would later refuse. That is the contract gap. |
| 7 | Is this a contract gap, vocab governance gap, MARK generator bug, or fn_iu_create vocab bug? | Primary = contract gap between MARK schema and CUT vocab. Secondary = VERIFY_MARK missing runtime vocab check. Tertiary = generator (operator/Codex) emitted a semantic label. fn_iu_create is correct: it refuses unknown vocab as designed. |

Classification

root_cause:
  primary:
    class: manifest_to_IU_contract_gap
    detail: |
      cut_manifest_piece_schema_v1 (enforced in fn_cut_mark_staged_file) validates
      6 piece fields but NOT piece.unit_kind. Manifests with piece.unit_kind not in live
      dot_config vocab.unit_kind.* can pass MARK and reach mark_verified/approved,
      but CUT (fn_iu_create) refuses them at apply time. Pipeline asymmetry: MARK
      schema is permissive, CUT vocab is strict.

  secondary:
    class: verify_mark_missing_runtime_vocab_check
    detail: |
      fn_iu_verify_mark axes A/B/C don't cover unit_kind. Approval can be granted
      even when CUT would reject. VERIFY_MARK is the last gate before operator
      approval — it should refuse any manifest whose CUT would fail at vocab.

  tertiary:
    class: generator_emitted_semantic_label
    detail: |
      The MARK caller (Codex/operator) emitted unit_kind=law_section instead of
      runtime unit_kind=law_unit. This is a caller-side data-quality issue, but
      it is ENABLED by the contract gap above; without the gap, callers would be
      forced to emit valid values upfront.

  decision:
    fix_strategy: strict_refusal_at_mark_boundary_plus_verify_mark_axis_d
    reason: |
      Mirrors the existing section_type validation pattern (already in
      fn_cut_mark_staged_file). No alias body change (5/5 MARK/CUT alias md5s
      remain pinned). No silent semantic mapping (preserves caller intent and
      avoids lossy rewrites of the stored manifest). No vocab widen (operator
      directive: callers must use runtime values). Defense in depth: refuse at
      MARK (boundary) AND at VERIFY_MARK (last gate before approval).
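
The two-gate strategy in the decision above can be sketched as follows. This is an illustrative model only, not the body of fn_cut_mark_staged_file or fn_iu_verify_mark: the function names, the return shapes, the 0-based piece index, and the treatment of a missing unit_kind as a violation are all assumptions. The vocab set is hard-coded from the values stated earlier; in the live system it comes from dot_config.

```python
# Runtime vocab as stated in this document; hard-coded here for illustration.
RUNTIME_UNIT_KINDS = {"design_doc_section", "law_unit"}


def unit_kind_violations(pieces, vocab=RUNTIME_UNIT_KINDS):
    """Return (index, value) for every piece whose unit_kind is not in vocab."""
    return [
        (i, piece.get("unit_kind"))
        for i, piece in enumerate(pieces)
        if piece.get("unit_kind") not in vocab
    ]


def mark_gate(pieces):
    """Boundary check: refuse the whole manifest on the first invalid piece."""
    violations = unit_kind_violations(pieces)
    if violations:
        index, value = violations[0]
        raise ValueError(f"piece[{index}].unit_kind {value} not in vocab")
    return "staged"


def verify_mark_axis_d(pieces):
    """Last gate before approval: report every invalid piece, not only the first."""
    violations = unit_kind_violations(pieces)
    return {"axis": "D", "passed": not violations, "violations": violations}
```

With a manifest of 16 pieces that all declare law_section, `mark_gate` raises on piece 0 and `verify_mark_axis_d` reports 16 violations, so under this model the manifest cannot reach approval through either gate.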

Why not the alternatives

| Alternative | Why rejected |
|---|---|
| Widen vocab.unit_kind.law_section | Pollutes the runtime vocab with semantic synonyms. User directive: "Do NOT widen fn_iu_create vocab." |
| Add mapping.unit_kind.law_section = law_unit | Silent rewrite; lossy (manifest forensics would show law_unit even though the caller said law_section). User directive: no silent mapping. |
| Modify fn_iu_cut_from_manifest to resolve unit_kind | Alias body is governance-pinned (md5 c5d556bc...); it cannot change without contract review. |
| Modify generator only | Does not close the contract gap: future callers could still emit invalid values. |
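
Why the mapping alternative is lossy can be shown with a small sketch. The alias table and `store_with_mapping` are hypothetical and model only the rejected design; nothing like them exists in the pipeline described here.

```python
# Hypothetical alias table for the rejected "mapping" alternative.
ALIAS = {"law_section": "law_unit"}


def store_with_mapping(piece):
    """Rewrite unit_kind through the alias table before storing (rejected design)."""
    stored = dict(piece)
    stored["unit_kind"] = ALIAS.get(piece["unit_kind"], piece["unit_kind"])
    return stored


sent_semantic = {"unit_kind": "law_section"}  # caller used the semantic label
sent_runtime = {"unit_kind": "law_unit"}      # caller used the runtime value

# Both inputs collapse to the same stored record, so the stored manifest can
# no longer show which value the caller actually sent.
stored_a = store_with_mapping(sent_semantic)
stored_b = store_with_mapping(sent_runtime)
```

Because `stored_a` and `stored_b` are identical, a later forensic read of the manifest cannot distinguish the two callers. Strict refusal avoids this by never storing a rewritten value.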

Implication for future MARK callers

Callers must include in every piece:

{ "unit_kind": "law_unit" }    // for law documents
{ "unit_kind": "design_doc_section" }  // for design docs

Any other value causes hard refusal at fn_cut_mark_staged_file with the indexed error pattern:

piece[N].unit_kind <V> not in dot_config vocab.unit_kind.* (cut_manifest_piece_schema_v1)
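
A caller that wants to locate the offending piece could parse the refusal message along these lines. This is a sketch under stated assumptions: that N is rendered as a decimal integer and <V> as a bare token without quotes or angle brackets. `REFUSAL_RE` and `parse_unit_kind_refusal` are hypothetical helper names, not part of the pipeline.

```python
import re

# Regex for the refusal pattern quoted above; "index" captures N and "value"
# captures V. The rendering of N and V is assumed, not confirmed by the source.
REFUSAL_RE = re.compile(
    r"piece\[(?P<index>\d+)\]\.unit_kind (?P<value>\S+) "
    r"not in dot_config vocab\.unit_kind\.\* \(cut_manifest_piece_schema_v1\)"
)


def parse_unit_kind_refusal(message):
    """Return (index, value) if message matches the refusal pattern, else None."""
    match = REFUSAL_RE.search(message)
    if match is None:
        return None
    return int(match.group("index")), match.group("value")
```

For a message such as `piece[3].unit_kind law_section not in dot_config vocab.unit_kind.* (cut_manifest_piece_schema_v1)`, the helper returns the piece index and the rejected value; for any other error text it returns None.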