KB-3699

Risk Note — Hardcode in P10B Hash Extractor

Revision 1
Tags: s188, risk-note, hardcode, p10b, d32, extractor, constitution, dieu43, zero-trust

Date: 2026-04-29

Trigger

The user observed the Agent patching /tmp/p10b/extract.py with a hardcoded list of leaf heading keys:

leaf_heading_keys = ['s1','s2_p1','s2_p2','s2_p3','s2_p4',
                     's3_p1','s3_p2','s3_p3','s3_p4','s3_p5',
                     's4_p1','s4_p2','s4_p3','s5','s6','s7','s8','s9']
for k in leaf_heading_keys:
    heading_excl.add(markers[k])

Assessment

This is hardcoding.

It may be acceptable as a one-off evidence patch, but only if the task is explicitly bounded to a single fixed D32 candidate and the patch does not become reusable pipeline logic.

It is not acceptable as the future segmentation/extraction mechanism.

Relevant principles / laws

  • Zero Trust: if it is not certainly right, it is wrong.
  • Đ43 Red Zones: no case-dispatch per section; no built-in renderer fallback; no business logic in Nuxt.
  • General anti-hardcode principle: do not solve only one instance in a way that breaks when context changes.
  • P10 roadmap: future pipeline must handle variable document structure, auto numbering, tree view, split/merge/reorder, and 5–10x label growth.

Risk

Hardcoded heading keys solve the current D32 shape but break if:

  • a section is added;
  • a section is deleted;
  • headings are renamed;
  • nesting depth changes;
  • a document uses Roman numerals or a different heading convention;
  • future auto-numbering/reorder/split/merge changes structural keys.

This turns a data-driven system into a one-document script.
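
The two most common failure modes can be reproduced with a minimal sketch. The marker table below is hypothetical (the keys are shortened and the offsets are made up); only the lookup loop mirrors the patched code. A deleted section fails loudly with a KeyError, while an added section is silently left out of the exclusion set.

```python
# Hypothetical marker table; real D32 markers and their values are not shown in this note.
markers = {"s1": 0, "s2_p1": 120, "s2_p2": 240, "s3": 360}

# Hardcoded key list written against the original document shape.
leaf_heading_keys = ["s1", "s2_p1", "s2_p2", "s3"]


def hardcoded_exclusions(markers, keys):
    """Mimic the patched loop: look up every hardcoded key in the marker table."""
    heading_excl = set()
    for k in keys:
        heading_excl.add(markers[k])
    return heading_excl


# Works only while the document matches the list exactly.
original = hardcoded_exclusions(markers, leaf_heading_keys)

# Case 1: a section is deleted, so the lookup raises KeyError.
deleted = {k: v for k, v in markers.items() if k != "s2_p2"}
try:
    hardcoded_exclusions(deleted, leaf_heading_keys)
    deletion_error = None
except KeyError as exc:
    deletion_error = exc.args[0]

# Case 2: a section is added, and its heading is silently not excluded.
added = dict(markers, s4=480)
missed = set(added.values()) - hardcoded_exclusions(added, leaf_heading_keys)

print(deletion_error, missed)
```

The silent case is the more dangerous one for evidence reports, because the script still finishes and produces a result.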

Immediate handling

Do not interrupt the currently running Agent task. Let it finish and report.

After report:

  1. Treat the hardcoded script as temporary evidence tooling only.
  2. Do not promote /tmp/p10b/extract.py to canonical pipeline.
  3. Require the next prompt/review to flag whether any hardcoded section list exists.
  4. If the hardcoded list affected results, request a patch that uses candidate JSON/table-driven extraction.
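
Step 3 can be partly automated. The following is a sketch only, not an existing tool: the function name `find_hardcoded_section_lists` and the key pattern (inferred from keys such as `s1` and `s2_p1` in the patch) are assumptions. It parses a script with Python's standard `ast` module and reports list, tuple, or set literals that consist entirely of section-like string keys.

```python
import ast
import re

# Assumed key convention, inferred from the keys seen in the patch (s1, s2_p1, ...).
SECTION_KEY = re.compile(r"^s\d+(_p\d+)?$")


def find_hardcoded_section_lists(source, min_keys=3):
    """Return sorted (lineno, keys) pairs for literals made only of section-like keys."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.List, ast.Tuple, ast.Set)):
            continue
        values = [
            e.value
            for e in node.elts
            if isinstance(e, ast.Constant) and isinstance(e.value, str)
        ]
        # Skip literals that are too short or that contain non-string elements.
        if len(values) != len(node.elts) or len(values) < min_keys:
            continue
        if all(SECTION_KEY.match(v) for v in values):
            findings.append((node.lineno, values))
    return sorted(findings)


sample = (
    "leaf_heading_keys = ['s1', 's2_p1', 's2_p2']\n"
    "names = ['alpha', 'beta', 'gamma']\n"
)
findings = find_hardcoded_section_lists(sample)
print(findings)
```

A check like this only catches literal lists. Keys assembled at runtime or loaded from a side file would still need manual review.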

Correct future approach

The extractor should be data-driven:

  • source markers derived from the segmentation candidate table/JSON;
  • no per-document hardcoded key list in code;
  • heading exclusions computed from candidate units with source_marker and structural metadata;
  • coverage derived from source spans, not manually enumerated keys;
  • candidate JSON is the input contract;
  • code works for D32, D35, D28, and later documents without rewriting source-specific arrays.
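
As a rough illustration of the points above, the sketch below computes heading exclusions and coverage from candidate JSON alone. The candidate schema is not defined in this note, so the field names `units`, `is_heading`, `start`, and `end` are assumptions made for the example; only `source_marker` comes from the text above.

```python
import json


def heading_exclusions(candidate_json):
    """Collect the source_marker of every unit flagged as a heading."""
    units = json.loads(candidate_json)["units"]
    return {u["source_marker"] for u in units if u.get("is_heading")}


def coverage(candidate_json, source_length):
    """Fraction of the source covered by unit spans, treated as [start, end)."""
    units = json.loads(candidate_json)["units"]
    spans = sorted((u["start"], u["end"]) for u in units)
    covered, cursor = 0, 0
    for start, end in spans:
        # Clip against what is already counted so overlaps are not counted twice.
        start = max(start, cursor)
        if end > start:
            covered += end - start
            cursor = end
    return covered / source_length


# Hypothetical candidate for illustration; not real D32 data.
candidate = json.dumps({"units": [
    {"source_marker": "m-001", "is_heading": True, "start": 0, "end": 10},
    {"source_marker": "m-002", "is_heading": False, "start": 10, "end": 60},
    {"source_marker": "m-003", "is_heading": True, "start": 60, "end": 70},
    {"source_marker": "m-004", "is_heading": False, "start": 70, "end": 100},
]})

excluded = heading_exclusions(candidate)
ratio = coverage(candidate, source_length=100)
print(sorted(excluded), ratio)
```

Because nothing in the code names a specific section, the same functions would apply unchanged to a candidate file for D35 or D28, provided it follows the same (assumed) schema.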

Direction to Opus after Agent completes

When reviewing the Agent report, Opus must check:

  • Did the hardcoded key list only affect a one-off evidence report?
  • Are all results independently verifiable from the report?
  • Does the report clearly mark the script as temporary?
  • Is there any persistent file/commit containing hardcoded D32-specific logic?
  • Does the next P10B-1B prompt prohibit hardcoded section lists and require candidate-driven extraction/package generation?

Status

Risk recorded. Awaiting the Agent report before deciding on a patch or remediation.