KB-3699

Risk Note — Hardcode in P10B Hash Extractor

Revision 1

Date: 2026-04-29

Trigger

User observed Agent patching /tmp/p10b/extract.py with hardcoded leaf heading keys:

leaf_heading_keys = ['s1','s2_p1','s2_p2','s2_p3','s2_p4',
                     's3_p1','s3_p2','s3_p3','s3_p4','s3_p5',
                     's4_p1','s4_p2','s4_p3','s5','s6','s7','s8','s9']
for k in leaf_heading_keys:
    heading_excl.add(markers[k])

Assessment

This is hardcoding.

It may be acceptable as a one-off evidence patch only if the task is explicitly bounded to a single fixed D32 candidate and does not become reusable pipeline logic.

It is not acceptable as the future segmentation/extraction mechanism.

Relevant principles / laws

Zero Trust: if it is not certainly right, it is wrong.
Đ43 Red Zones: no case-dispatch per section; no built-in renderer fallback; no business logic in Nuxt.
General anti-hardcode principle: do not solve only one instance in a way that breaks when context changes.
P10 roadmap: future pipeline must handle variable document structure, auto numbering, tree view, split/merge/reorder, and 5–10x label growth.

Risk

Hardcoded heading keys solve the current D32 shape but break if:

a section is added;
a section is deleted;
headings are renamed;
nesting depth changes;
a document uses Roman numerals or a different heading convention;
future auto-numbering/reorder/split/merge changes structural keys.

This turns a data-driven system into a one-document script.
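The failure modes listed above can be made concrete with a minimal sketch. It assumes `markers` is a dict mapping structural keys to source marker strings, which the observed patch suggests but does not confirm. The marker values, the shortened key list, and the helper name `apply_hardcoded_exclusions` are invented for illustration.

```python
# Hypothetical illustration: `markers` is assumed to be a dict of
# structural key -> source marker string; all values here are made up.
LEAF_HEADING_KEYS = ["s1", "s2_p1", "s2_p2", "s3"]  # shortened hardcoded list


def apply_hardcoded_exclusions(markers):
    """Mimic the patched loop: exclude the marker of every hardcoded key."""
    heading_excl = set()
    for k in LEAF_HEADING_KEYS:
        heading_excl.add(markers[k])  # KeyError if the key no longer exists
    return heading_excl


# The shape the list was written for: works.
original = {"s1": "1.", "s2_p1": "2.1", "s2_p2": "2.2", "s3": "3."}
ok = apply_hardcoded_exclusions(original)

# A section is deleted: the loop crashes.
deleted = {k: v for k, v in original.items() if k != "s2_p2"}
try:
    apply_hardcoded_exclusions(deleted)
    crashed = False
except KeyError:
    crashed = True

# A section is added: no crash, but the new heading is silently missed.
added = dict(original, s4="4.")
missed = set(added.values()) - apply_hardcoded_exclusions(added)
```

The silent case is the more dangerous one: the script still runs and reports results, but the new heading never reaches the exclusion set.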

Immediate handling

Do not interrupt the currently running Agent task. Let it finish and report.

After report:

Treat the hardcoded script as temporary evidence tooling only.
Do not promote /tmp/p10b/extract.py to canonical pipeline.
Require the next prompt/review to flag whether any hardcoded section list exists.
If the hardcoded list affected results, request a patch that replaces it with extraction driven by the candidate JSON/table.
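One way to support the "flag any hardcoded section list" check is a small static scan of the script. The sketch below is a heuristic, not an established review tool: the function name `find_hardcoded_string_lists` and the `min_items` threshold are assumptions, and it only catches plain assignments of string-only list, tuple, or set literals.

```python
import ast


def find_hardcoded_string_lists(source, min_items=3):
    """Return (line, variable names, item count) for each assignment of a
    list/tuple/set literal that consists only of string constants."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if not isinstance(value, (ast.List, ast.Tuple, ast.Set)):
            continue
        items = value.elts
        all_strings = all(
            isinstance(e, ast.Constant) and isinstance(e.value, str)
            for e in items
        )
        if len(items) >= min_items and all_strings:
            names = [t.id for t in node.targets if isinstance(t, ast.Name)]
            findings.append(
                (node.lineno, ", ".join(names) or "<target>", len(items))
            )
    return sorted(findings)


# The source is only parsed, never executed, so undefined names are fine.
sample = '''
leaf_heading_keys = ['s1', 's2_p1', 's2_p2', 's5']
for k in leaf_heading_keys:
    heading_excl.add(markers[k])
'''
findings = find_hardcoded_string_lists(sample)
```

A hit is a prompt for human review rather than proof of a violation, since legitimate constants (for example, allowed file extensions) match the same pattern.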

Correct future approach

Extractor should be data-driven:

source markers derived from the segmentation candidate table/JSON;
no per-document hardcoded key list in code;
heading exclusions computed from candidate units with source_marker and structural metadata;
coverage derived from source spans, not manually enumerated keys;
candidate JSON is the input contract;
code works for D32, D35, D28, and later documents without rewriting source-specific arrays.
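The points above could look roughly like the following sketch. The candidate JSON schema is not defined in this note, so every field except `source_marker` is an assumption: a top-level `units` list, a `kind` field, an `is_leaf` flag, and a `span` given as `[start, end)` character offsets. The function names are likewise hypothetical.

```python
import json


def load_candidates(candidate_json):
    """Parse the candidate JSON (assumed shape: {"units": [{...}, ...]})."""
    return json.loads(candidate_json)["units"]


def heading_exclusions(units):
    """Collect source markers of leaf headings from the candidate units,
    instead of enumerating keys in code."""
    return {
        u["source_marker"]
        for u in units
        if u["kind"] == "heading" and u["is_leaf"]
    }


def uncovered_spans(units, doc_length):
    """Return the [start, end) gaps not covered by any unit span."""
    gaps, cursor = [], 0
    for start, end in sorted(tuple(u["span"]) for u in units):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < doc_length:
        gaps.append((cursor, doc_length))
    return gaps


# Made-up candidate data; a different document only changes this input.
candidate_json = json.dumps({
    "units": [
        {"source_marker": "1.", "kind": "heading", "is_leaf": True, "span": [0, 10]},
        {"source_marker": "2.", "kind": "heading", "is_leaf": False, "span": [10, 20]},
        {"source_marker": "2.1", "kind": "heading", "is_leaf": True, "span": [20, 35]},
        {"source_marker": "2.1-body", "kind": "body", "is_leaf": True, "span": [40, 60]},
    ]
})
units = load_candidates(candidate_json)
excl = heading_exclusions(units)
gaps = uncovered_spans(units, doc_length=70)
```

Because both the exclusion set and the coverage gaps are computed from the candidate units, adding, deleting, or renaming a section changes the input data only; no source-specific array in the code has to be rewritten.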

Direction to Opus after Agent completes

When reviewing the Agent report, Opus must check:

Did the hardcoded key list only affect a one-off evidence report?
Are all results independently verifiable from the report?
Does the report clearly mark the script as temporary?
Is there any persistent file/commit containing hardcoded D32-specific logic?
Does the next P10B-1B prompt prohibit hardcoded section lists and require candidate-driven extraction/package generation?

Status

Risk recorded. Awaiting the Agent report before deciding on a patch or other remediation.