KB-4393

O10 automation completion bundle — 03-generic-parser-mark-cutplan-result

Revision 1

O10 Report 03 — Generic parser / mark / cutplan dry-run (BRANCH 2)

  • macro: v0.6-o10-automation-completion-program-bundle
  • date_utc: 2026-05-21 · gate: BRANCH 2 · result: PASS

1. The generic parser — parse_generic_document()

parse_generic_document() is a minimal, document-agnostic, fail-closed parser. It uses only the standard library: it never opens a DB connection, reads a credential, or writes a row (the same import-isolation discipline as dryrun.py).

grammar:
  unit heading : ATX line  '#'..'######' + whitespace + non-empty title
  unit body    : every line between a heading and the next heading (or EOF),
                 leading/trailing blank lines trimmed
  preamble     : lines before the first heading — non-content, never cut
  sentinels    : optional BEGIN/END normalized-content pair; when present
                 only the region between them is parsed (v0.5-compatible)
per-unit output:
  index · title · body · unit_kind · section_type · heading_depth ·
  line_start · line_end · span_sha256 · content_hash
depth -> kind: 1=section · 2=article · >=3=principle
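The heading, body, and depth rules above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the actual parse_generic_document(): the `Unit` record, `split_units`, and `kind_for_depth` names are hypothetical, the 0-based `index` is an assumption, and hashing the heading line plus the raw body for `span_sha256` is an assumption about what the span covers. Sentinel handling and the fail-closed checks are left out.

```python
import hashlib
import re
from dataclasses import dataclass

# ATX heading: 1-6 '#' characters, whitespace, then a non-empty title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(\S.*)$")


@dataclass(frozen=True)
class Unit:  # hypothetical record; a subset of the per-unit output fields
    index: int
    title: str
    body: str
    unit_kind: str
    heading_depth: int
    line_start: int  # 1-based line number of the heading
    line_end: int    # 1-based last line of the unit's span
    span_sha256: str


def kind_for_depth(depth: int) -> str:
    # Mapping from the grammar: 1=section, 2=article, >=3=principle.
    return {1: "section", 2: "article"}.get(depth, "principle")


def split_units(text: str) -> list[Unit]:
    lines = text.splitlines()
    # Collect (0-based line index, depth, title) for every heading line.
    heads = []
    for i, line in enumerate(lines):
        m = HEADING_RE.match(line)
        if m:
            heads.append((i, len(m.group(1)), m.group(2).strip()))

    units = []
    for n, (start, depth, title) in enumerate(heads):
        # A unit runs up to the next heading, or to EOF for the last one.
        end = heads[n + 1][0] if n + 1 < len(heads) else len(lines)
        body_lines = lines[start + 1:end]
        # Trim leading and trailing blank lines from the body.
        while body_lines and not body_lines[0].strip():
            body_lines.pop(0)
        while body_lines and not body_lines[-1].strip():
            body_lines.pop()
        # Assumption: the span hash covers the heading line and the raw body.
        span = "\n".join(lines[start:end])
        units.append(Unit(
            index=n, title=title, body="\n".join(body_lines),
            unit_kind=kind_for_depth(depth), heading_depth=depth,
            line_start=start + 1, line_end=end,
            span_sha256=hashlib.sha256(span.encode("utf-8")).hexdigest(),
        ))
    return units
```

Lines before the first heading never enter any unit, which matches the preamble rule. The sketch does not special-case '#' lines inside fenced code blocks; whether the real parser does is not stated here.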

Fail-closed conditions (all unit-tested)

- invalid docprefix (must be uppercase [A-Z0-9-], 3-32 characters, no leading or trailing '-')
- reserved docprefix (ICX-CONST or DIEU*; address-space collision, O9 C1)
- malformed sentinel pair (each sentinel must appear exactly once, or neither may appear)
- zero headings
- a unit with an empty body (fn_iu_create requires a non-empty body)
- more than max_units headings (default 500; small-target intake)
- duplicate canonical_address
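As a rough illustration of what "fail-closed" means for the conditions listed above, the checks can be written as functions that raise rather than return a partial result. Everything named here is hypothetical (`IntakeRefused`, `check_docprefix`, `check_sentinels`, `check_units`), the sentinel strings are passed in because their literal form is not given, and reading "DIEU*" as "any prefix starting with DIEU" is an assumption.

```python
import re

# 3-32 characters from [A-Z0-9-], with no leading or trailing '-'.
DOCPREFIX_RE = re.compile(r"[A-Z0-9][A-Z0-9-]{1,30}[A-Z0-9]")


class IntakeRefused(ValueError):
    """Hypothetical error type: raised instead of returning a partial result."""


def check_docprefix(docprefix: str) -> None:
    if not DOCPREFIX_RE.fullmatch(docprefix):
        raise IntakeRefused(f"invalid docprefix: {docprefix!r}")
    # Assumption: 'DIEU*' means any prefix that starts with DIEU.
    if docprefix == "ICX-CONST" or docprefix.startswith("DIEU"):
        raise IntakeRefused(f"reserved docprefix: {docprefix!r}")


def check_sentinels(text: str, begin: str, end: str) -> None:
    # Either both sentinels appear exactly once, or neither appears.
    counts = (text.count(begin), text.count(end))
    if counts not in ((0, 0), (1, 1)):
        raise IntakeRefused(f"malformed sentinel pair: counts={counts}")
    # Assumption (not in the list above): BEGIN must come before END.
    if counts == (1, 1) and text.index(begin) > text.index(end):
        raise IntakeRefused("END sentinel precedes BEGIN sentinel")


def check_units(units: list[tuple[str, str]], max_units: int = 500) -> None:
    # units: (canonical_address, body) pairs produced by the parser.
    if not units:
        raise IntakeRefused("zero headings")
    if len(units) > max_units:
        raise IntakeRefused(f"{len(units)} headings exceeds max_units={max_units}")
    seen = set()
    for address, body in units:
        if not body.strip():
            raise IntakeRefused(f"empty body: {address}")
        if address in seen:
            raise IntakeRefused(f"duplicate canonical_address: {address}")
        seen.add(address)
```

Raising on the first violation keeps the caller from ever seeing a half-validated document, which is the property the unit tests for these routes would exercise.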

2. Feeds F2 / F3 generic path

F2 (live-text):  each MarkRow carries the parsed unit body + title; the
                 cutplan row contract carries them through to cut_leg_a.
F3 (recorders):  the O8F generic governance/verify recorders take any N;
                 the generic path exercises them with N=3 (not 60).

3. Mark / cutplan dry-run — replay determinism

The parser is pure: calling parse_generic_document twice on the same input yields equal GenericDocument values. The orchestrator's cutplan phase additionally runs a two-pass determinism check (different orderings must produce an identical writer_digest).
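One way a two-pass, order-independent digest check can work is to canonicalize the rows before hashing. The sketch below is only an illustration of that idea: how the orchestrator actually computes writer_digest is not described here, and the row shape, the sort key, and the `two_pass_check` helper are assumptions.

```python
import hashlib
import json


def writer_digest(rows: list[dict]) -> str:
    # Canonicalize first: sort rows by index, then serialize with sorted keys
    # and fixed separators so the byte stream does not depend on input order.
    canonical = sorted(rows, key=lambda r: r["index"])
    payload = json.dumps(canonical, sort_keys=True,
                         separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def two_pass_check(rows: list[dict]) -> bool:
    # Pass 1 uses the given order, pass 2 the reversed order; a deterministic
    # writer must produce the same digest for both.
    first = writer_digest(rows)
    second = writer_digest(list(reversed(rows)))
    return first == second
```

Because the digest depends only on the canonical byte stream, the same check also explains why two hosts fed the same fixture would agree, provided both read identical bytes.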

Cross-host digest equality (Mac dev host == VPS):

fixture:          tests/fixtures/generic-small-target-sample.md (3 units)
docprefix:        O10-FIX
candidate_count:  3
manifest_digest:  5e58911d481e350a6c163100eb5fc7d43087718f5685d4e3a5667865782d08dd
region_sha:       a1a286e4802718234e608ac8b15ada4a2845d1b331055a7f34cd35dbe9e7b528
writer_digest:    58528e44927c871a48bc6f1d22ddeb9ff0b3e8cd5d93aa1089f0560c69f014f9

All three digests are identical on the Mac dev host and on the VPS, which shows that the parser and dry-run path produce deterministic, host-independent output for this fixture.

4. Tests (small 2–5 unit fixture, no N=60)

tests/test_orchestrator_o10_generic_intake.py — 15 tests:

TestGenericParser:            7 — parse, determinism, all fail-closed routes
TestGenericIntakeDiscoverer:  1 — seeds a runnable discoverer
TestGenericDryRunWalk:        5 — kill-switch off, walk to SG_1, full
                                  closeout (lifecycle_enacted_count == 3
                                  → no N=60), Mode.LIVE refused, replay
                                  digest-equal
TestIcxConstHistoricalPath:   1 — ICX-CONST grammar still present
                              + 1 — kill-switch off

The full closeout walk reaches CLOSEOUT_REPORTED with all 11 phases passed and lifecycle_enacted_count == 3. This shows directly that the orchestrator carries no N=60 (and no Constitution-shaped) assumption.

5. Verdict

branch_2: PASS — generic parser produces title/body/source-span; feeds
          F2/F3; deterministic + replay-equal + cross-host identical;
          15 tests on a 3-unit fixture; no production mutation.