KB-5999

02 · Body-required mid-flight patch · root cause + fix

Revision 1
Tags: dieu44, iu_cut, phase_b, root_cause, fn_cut_mark_staged_file, fn_iu_create

Symptom (mig 052 Phase 4 pilot)

During the Điều 38 v3.0 pilot, the first fn_cut_apply invocation failed with the exception body required, raised at piece level inside fn_iu_cut_from_manifest. The operator recovered by using jsonb_set to patch iu_staging_payload.payload_json->'pieces', adding content_text to each piece, and then reran fn_cut_apply successfully. That jsonb_set patch was a manual workaround, not production behavior.
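For orientation, the shape of that manual patch can be sketched as below. This is an illustration under assumptions, not the operator's actual statement: the manifest literal, the demo addresses, and the placeholder body text are hypothetical, and the query works on an inline value rather than on the real iu_staging_payload row.

```sql
-- Illustration only: patches a hypothetical manifest held in a CTE,
-- not the live iu_staging_payload table.
WITH manifest(payload_json) AS (
  VALUES ('{"pieces": [{"canonical_address": "demo/1"},
                       {"canonical_address": "demo/2"}]}'::jsonb)
)
SELECT jsonb_set(
         m.payload_json,
         '{pieces}',
         -- Rebuild the array, adding content_text to every piece in order.
         (SELECT jsonb_agg(
                   p.piece || jsonb_build_object(
                     'content_text', 'placeholder body ' || p.ord::text)
                   ORDER BY p.ord)
            FROM jsonb_array_elements(m.payload_json->'pieces')
                 WITH ORDINALITY AS p(piece, ord))
       ) AS patched_payload
  FROM manifest AS m;
```

The point of the sketch is how invasive the repair is: every piece in an already committed manifest has to be rewritten by hand before fn_cut_apply can be retried.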

Root cause (located via catalog read of the 3-function chain)

| layer | function | relevant line | behavior |
|---|---|---|---|
| operator | fn_cut_mark_staged_file(uuid, jsonb p_pieces, text, text) (pre-053) | array-non-empty only | accepted any non-empty p_pieces array; no per-piece schema check |
| alias | fn_iu_op_mark_file(text, text, jsonb p_pieces, …) | line 81–83 | refused only NULL/non-array/empty; otherwise serialized as-is into manifest.pieces |
| writer | fn_iu_mark_create_manifest(…) | n/a | writes manifest into iu_staging_payload.payload_json (part_name='cut_manifest') |
| reader | fn_iu_cut_from_manifest(uuid, bool, text, text) | line 197: v_body := v_piece->>'content_text'; | reads each piece's content_text and passes it as p_body to fn_iu_create |
| creator | fn_iu_create(text p_canonical_address, text p_title, text p_body, …) | line 19: IF p_body IS NULL THEN RAISE EXCEPTION 'body required'; END IF; | hard-refuses NULL body |

Bug locus: when the Agent supplied a pieces array where any element omitted content_text (or supplied an empty string), neither the wrapper, the alias, nor the manifest writer detected it. Only the CUT consumer raised — far downstream from the input. The mark_manifest staging payload was already COMMITTED at that point, requiring jsonb_set to repair before retry.

The same gap existed for canonical_address: line 17 of fn_iu_create raises canonical_address required when the value is NULL or blank.
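The mechanism behind the late failure is that ->> returns NULL for a key that is absent, so an omitted content_text travels through the manifest silently and only surfaces as a NULL p_body at the creator. A minimal sketch, assuming a hypothetical three-element pieces array (the demo addresses and texts are made up):

```sql
-- Hypothetical pieces: element 0 is complete, element 1 omits content_text,
-- element 2 carries an empty string.
SELECT p.ord - 1                                           AS idx,
       p.piece->>'content_text'                            AS body,
       (p.piece->>'content_text') IS NULL                  AS null_body,
       btrim(coalesce(p.piece->>'content_text', '')) = ''  AS missing_or_blank
  FROM jsonb_array_elements(
         '[{"canonical_address": "demo/1", "content_text": "text"},
           {"canonical_address": "demo/2"},
           {"canonical_address": "demo/3", "content_text": ""}]'::jsonb
       ) WITH ORDINALITY AS p(piece, ord)
 ORDER BY p.ord;
```

Element 1 yields a NULL body and element 2 yields an empty one, so null_body is true only for element 1 while missing_or_blank is true for both.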

Phase B fix (mig 053)

Per user decision (validation site = operational wrapper only; alias contract preserved), the fix is added inside fn_cut_mark_staged_file immediately after the existing array-non-empty check:

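-- v_idx is expected to start at 0 so that messages read piece[0], piece[1], ...
FOR v_piece IN SELECT value FROM jsonb_array_elements(p_pieces)
LOOP
  -- ->> returns NULL when the key is absent from the piece object
  v_body_val := v_piece->>'content_text';
  v_addr_val := v_piece->>'canonical_address';
  -- content_text is checked first: a NULL or blank body is what fails later at CUT
  IF v_body_val IS NULL OR btrim(v_body_val) = '' THEN
    RAISE EXCEPTION 'fn_cut_mark_staged_file: piece[%].content_text is required (would cause "body required" at CUT)', v_idx;
  END IF;
  -- fn_iu_create also rejects a NULL or blank canonical_address
  IF v_addr_val IS NULL OR btrim(v_addr_val) = '' THEN
    RAISE EXCEPTION 'fn_cut_mark_staged_file: piece[%].canonical_address is required', v_idx;
  END IF;
  v_idx := v_idx + 1;
END LOOP;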

Indexed error messages locate the bad row inside the Agent's piece array. The check runs before the mark_in_progress transition is recorded and before fn_iu_op_mark_file is invoked — so a malformed pieces array leaves no durable side effect.
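To try the check in isolation, the loop can be wrapped in a session-local stand-in. This is a sketch, not the production wrapper: pg_temp.demo_validate_pieces is a hypothetical name, the error text is shortened, and the status-machine and alias calls of fn_cut_mark_staged_file are deliberately left out.

```sql
-- Hypothetical stand-in for the piece check; lives only in the current session.
CREATE FUNCTION pg_temp.demo_validate_pieces(p_pieces jsonb)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  v_piece    jsonb;
  v_idx      int := 0;   -- zero-based, matching piece[0] in the proofs
  v_body_val text;
  v_addr_val text;
BEGIN
  FOR v_piece IN SELECT value FROM jsonb_array_elements(p_pieces)
  LOOP
    v_body_val := v_piece->>'content_text';
    v_addr_val := v_piece->>'canonical_address';
    IF v_body_val IS NULL OR btrim(v_body_val) = '' THEN
      RAISE EXCEPTION 'piece[%].content_text is required', v_idx;
    END IF;
    IF v_addr_val IS NULL OR btrim(v_addr_val) = '' THEN
      RAISE EXCEPTION 'piece[%].canonical_address is required', v_idx;
    END IF;
    v_idx := v_idx + 1;
  END LOOP;
END;
$$;

-- The second element lacks content_text, so the refusal names index 1.
DO $$
BEGIN
  PERFORM pg_temp.demo_validate_pieces(
    '[{"canonical_address": "demo/1", "content_text": "ok"},
      {"canonical_address": "demo/2"}]'::jsonb);
EXCEPTION WHEN raise_exception THEN
  RAISE NOTICE 'refused: %', SQLERRM;
END;
$$;
```

Because the exception aborts the call before anything else runs, the caller gets the offending index and nothing has been written.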

Why not at the alias level

fn_iu_op_mark_file is also reachable from non-cut_request flows and is a governed alias (the §11 "5-alias" / now 6-alias contract). Tightening the alias would have:

  • changed alias prosrc md5 (750b06b610f50065f1117961813d9df4 baseline)
  • risked refusing legitimate callers (fn_iu_mark_article, etc.)
  • violated the explicit prohibition on a "MARK/CUT alias contract rewrite unless unavoidable and documented"

Operational wrapper-level validation is sufficient because the only path to the operational pipeline runs through fn_cut_mark_staged_file. The existing alias remains permissive for other callers; the wrapper is strict.

Proofs (in BEGIN/ROLLBACK + live regression)

| proof | input | expected | actual |
|---|---|---|---|
| PB.1 | piece missing content_text | RAISE piece[0].content_text is required | RAISE confirmed |
| PB.2 | piece missing canonical_address | RAISE piece[0].canonical_address is required | RAISE confirmed |
| PB.3 | valid piece against status=cleanup_scheduled cut_request | RAISE cannot mark from status cleanup_scheduled | RAISE confirmed (status-machine refusal AFTER schema check passes) |
| E.2 LIVE | piece missing content_text against 777b1297… | RAISE fn_cut_mark_staged_file: piece[0].content_text is required (would cause "body required" at CUT) | |
| D31.4 | p_pieces=NULL | RAISE p_pieces must be non-empty jsonb array | RAISE confirmed |
| D31.5 | p_pieces=[] | RAISE | RAISE confirmed |
| D31.6 | piece missing both | RAISE on content_text first (refused fast) | RAISE confirmed |

Surface impact

  • fn_cut_mark_staged_file md5 (full function definition) changes vs mig 052: new prosrc md5 = f9c4afdab7c581ce905fa38d72c167f9 (live, mig 053 post-apply).
  • 6-alias prosrc md5 UNCHANGED: 750b06b610f50065f1117961813d9df4.
  • Return value adds key piece_schema_validated: true so callers can positively confirm validation ran (forensic aid).
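One way to spot-check these fingerprints is a catalog query along the following lines. It is a sketch under assumptions: it reports one row per function, and the source does not say how the single 6-alias value is combined across aliases, so only per-function hashes are shown. The functiondef_md5 column is an added convenience for the "full function definition" reading and is not something the source defines.

```sql
-- Per-function fingerprints from the system catalog.
SELECT p.oid::regprocedure              AS signature,
       md5(p.prosrc)                    AS prosrc_md5,
       md5(pg_get_functiondef(p.oid))   AS functiondef_md5
  FROM pg_proc AS p
 WHERE p.proname IN ('fn_cut_mark_staged_file', 'fn_iu_op_mark_file')
 ORDER BY p.proname;
```

Comparing prosrc_md5 before and after applying mig 053 should show a change for the wrapper only, if the alias was left untouched as described.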

Why jsonb_set patch is no longer production behavior

The bad-piece case is now refused at the operational entry point, before any staging row is COMMITTED. There is no half-COMMITTED state for operators to repair. The Phase B fix eliminates the entire jsonb_set recovery flow described in [[feedback-mark-pieces-live-in-iu-staging-payload-cut-manifest-not-in-iu-staging-record-metadata]], which can now be downgraded from "production recovery pattern" to "emergency-only forensic procedure".

knowledge/dev/laws/dieu44-trien-khai/v0.6-iu-cut-operational-pipeline-runtime-hardening/02-body-required-root-cause-and-fix.md