02 - Reserved-Token Rejection Policy (recheck-6 blocker A)
Load-bearing copy: doc 00 §Canonical hash encoding (FIX7-CANON-V1) → Field rejection policy. This doc gives the rationale and the worked proof.
Decision: REJECT, never escape. No legitimate field value (document_ids, hex hashes, enum tokens, integers, booleans, sentinels) ever needs a TAB, LF, CR, NUL, backslash, or a structural sentinel, so rejecting them outright is safe. Rejection also makes the TAB/LF-delimited records provably injective: no value can contain a separator, so two different field lists can never encode to the same bytes. No escape syntax = nothing to interpret.
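The injectivity argument can be illustrated with a minimal Python sketch. It assumes TAB separates fields and LF terminates a record (the authoritative layout is in doc 00); `encode_record`, `encode_record_strict`, and `decode_record` are hypothetical names, and the sample document_id is made up.

```python
FORBIDDEN_CHARS = ("\t", "\n", "\r", "\x00", "\\")

def encode_record(fields):
    """Naive encoder: TAB between fields, LF at the end, no escaping."""
    return "\t".join(fields) + "\n"

def encode_record_strict(fields):
    """Same encoding, but reject any value carrying a forbidden character."""
    for value in fields:
        if any(ch in value for ch in FORBIDDEN_CHARS):
            raise ValueError("CANONICAL_FIELD_RESERVED_TOKEN_REJECTED")
    return encode_record(fields)

def decode_record(line):
    """Inverse of encode_record_strict: drop the trailing LF, split on TAB."""
    return line[:-1].split("\t")

# Without rejection, two different field lists collide on the same bytes.
assert encode_record(["x\ty", "z"]) == encode_record(["x", "y\tz"])

# With rejection, every accepted record decodes back to exactly its input.
fields = ["knowledge/dev/reports/architecture/example.md", "true"]  # made-up values
assert decode_record(encode_record_strict(fields)) == fields
```

Because the strict encoder has an exact inverse on everything it accepts, it cannot map two accepted inputs to the same output.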
Three layers (all fail-closed)
1. Per-field whitelist grammar (anchored, full-string). A value that does not match → CANONICAL_FIELD_VALUE_GRAMMAR_REJECTED:

   field                        grammar
   document_id                  ^knowledge/dev/reports/architecture/[A-Za-z0-9._/-]+\.md$
   sha256_hex                   ^[0-9a-f]{64}$
   kb_revision                  ^[1-9][0-9]*$ or SELF_HOST_PIN_BY_EXCLUDE_REGION_HASH
   doc_status                   ^(ACTIVE_AUTHORITY|SUPERSEDED_NON_AUTHORITY)$
   boolean                      ^(true|false)$
   active_section_id_or_range   ^(WHOLE_DOCUMENT|WHOLE_DOCUMENT_MINUS_SUPERSEDED_FENCES|WHOLE_DOCUMENT_MINUS_EXCLUDE_AND_SUPERSEDED)$
   fence_range                  ^L[1-9][0-9]*-L[1-9][0-9]*$ (begin < end)
   superseded_id                ^<document_id>#S[1-9][0-9]*$
   marker_kind                  ^(DOC_STATUS|SUPERSEDED_BEGIN|SUPERSEDED_END|ENVELOPE_EXCLUDE_BEGIN|ENVELOPE_EXCLUDE_END|AUTHORITY_BOUNDARY)$
   marker_literal               ^<!--.*-->$ (the ONLY field permitted to carry marker structure; forbidden-byte layer still applies)
   fixed-constant fields        must equal the pinned constant byte-for-byte (digest_algorithm, full_document_hash_policy, canonical_encoding_version)

2. Forbidden-byte rejection. Any value containing TAB 0x09, LF 0x0A, CR 0x0D, NUL 0x00, or backslash 0x5C → CANONICAL_FIELD_RESERVED_TOKEN_REJECTED.

3. Forbidden reserved-token rejection. Any value other than marker_literal containing a structural sentinel → CANONICAL_FIELD_RESERVED_TOKEN_REJECTED. Forbidden list: <!-- ENVELOPE:EXCLUDE-BEGIN -->, <!-- ENVELOPE:EXCLUDE-END -->, <!-- SUPERSEDED_NON_AUTHORITY BEGIN, <!-- SUPERSEDED_NON_AUTHORITY END -->, and every domain tag (FIX7_*_V1). The bare tokens ACTIVE_AUTHORITY / SUPERSEDED_NON_AUTHORITY are permitted only as a doc_status value (layer 1).
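The three layers can be sketched as one Python function. This is an illustration under stated assumptions, not the validator itself: `validate` and the constant names are hypothetical, the domain-tag glob FIX7_*_V1 is read as the regex `FIX7_[A-Za-z0-9_]*_V1`, the bare status tokens are treated as forbidden outside doc_status and marker_literal, and all layers are run so that no evaluation order has to be assumed. Fixed-constant fields are left out.

```python
import re

GRAMMAR_REJECTED = "CANONICAL_FIELD_VALUE_GRAMMAR_REJECTED"
TOKEN_REJECTED = "CANONICAL_FIELD_RESERVED_TOKEN_REJECTED"
DOC_ID = r"knowledge/dev/reports/architecture/[A-Za-z0-9._/-]+\.md"

# Applied with fullmatch(), so ^ and $ are implied. In Python a bare $ would
# also accept a trailing "\n"; fullmatch() does not.
GRAMMARS = {
    "document_id": re.compile(DOC_ID),
    "sha256_hex": re.compile(r"[0-9a-f]{64}"),
    "kb_revision": re.compile(r"[1-9][0-9]*|SELF_HOST_PIN_BY_EXCLUDE_REGION_HASH"),
    "doc_status": re.compile(r"ACTIVE_AUTHORITY|SUPERSEDED_NON_AUTHORITY"),
    "boolean": re.compile(r"true|false"),
    "active_section_id_or_range": re.compile(
        r"WHOLE_DOCUMENT|WHOLE_DOCUMENT_MINUS_SUPERSEDED_FENCES"
        r"|WHOLE_DOCUMENT_MINUS_EXCLUDE_AND_SUPERSEDED"),
    "fence_range": re.compile(r"L([1-9][0-9]*)-L([1-9][0-9]*)"),
    "superseded_id": re.compile(DOC_ID + r"#S[1-9][0-9]*"),
    "marker_kind": re.compile(
        r"DOC_STATUS|SUPERSEDED_BEGIN|SUPERSEDED_END"
        r"|ENVELOPE_EXCLUDE_BEGIN|ENVELOPE_EXCLUDE_END|AUTHORITY_BOUNDARY"),
    "marker_literal": re.compile(r"<!--.*-->"),
}

FORBIDDEN_CHARS = "\t\n\r\x00\\"  # all ASCII, so char checks equal byte checks
STRUCTURAL_SENTINELS = (
    "<!-- ENVELOPE:EXCLUDE-BEGIN -->",
    "<!-- ENVELOPE:EXCLUDE-END -->",
    "<!-- SUPERSEDED_NON_AUTHORITY BEGIN",
    "<!-- SUPERSEDED_NON_AUTHORITY END -->",
)
BARE_STATUS = ("ACTIVE_AUTHORITY", "SUPERSEDED_NON_AUTHORITY")
DOMAIN_TAG = re.compile(r"FIX7_[A-Za-z0-9_]*_V1")  # assumed reading of FIX7_*_V1

def validate(field, value):
    """Return the rejection statuses for one value; an empty list means accepted."""
    statuses = []
    # Layer 1: whitelist grammar. Unknown fields fail closed.
    grammar = GRAMMARS.get(field)
    match = grammar.fullmatch(value) if grammar else None
    ok = match is not None
    if ok and field == "fence_range":
        ok = int(match.group(1)) < int(match.group(2))  # begin < end
    if not ok:
        statuses.append(GRAMMAR_REJECTED)
    # Layer 2: forbidden bytes (applies to marker_literal too).
    bad = any(ch in FORBIDDEN_CHARS for ch in value)
    # Layer 3: structural sentinels, skipped only for marker_literal.
    if field != "marker_literal":
        bad = bad or any(tok in value for tok in STRUCTURAL_SENTINELS)
        bad = bad or DOMAIN_TAG.search(value) is not None
        if field != "doc_status":
            bad = bad or any(tok in value for tok in BARE_STATUS)
    if bad:
        statuses.append(TOKEN_REJECTED)
    return statuses
```

For example, `validate("fence_range", "L5-L3")` returns only the grammar status, while a marker_literal containing a TAB matches its grammar but is still caught by layer 2.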
No null / empty: null/absent → CANONICAL_FIELD_NULL_REJECTED (use an explicit value instead: NOT_APPLICABLE /
SEAL_AT_CODEX_RECHECK_7 / NON_AUTHORITY_DIAGNOSTIC); empty string → CANONICAL_FIELD_EMPTY_REJECTED.
Any of these rejection statuses → STOP authoring → T1 fix + fresh Codex recheck (G-CANONICAL-FIELD-REJECT).
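A minimal sketch of the null/empty rule and the fail-closed stop, assuming a record is a plain dict of field name to string. `StopAuthoring`, `presence_status`, and `require_present` are hypothetical names; the exception merely stands in for the G-CANONICAL-FIELD-REJECT gate.

```python
class StopAuthoring(Exception):
    """Hypothetical stand-in for the G-CANONICAL-FIELD-REJECT gate."""

def presence_status(record, field):
    """Return a rejection status for a missing/empty field, or None if present."""
    value = record.get(field)  # an absent key and an explicit None are treated alike
    if value is None:
        return "CANONICAL_FIELD_NULL_REJECTED"
    if value == "":
        return "CANONICAL_FIELD_EMPTY_REJECTED"
    return None

def require_present(record, fields):
    """Fail closed: the first missing or empty field stops authoring."""
    for field in fields:
        status = presence_status(record, field)
        if status is not None:
            raise StopAuthoring(f"{status}: {field}")

# An explicit NOT_APPLICABLE passes the presence check; an omitted field does not.
require_present({"kb_revision": "NOT_APPLICABLE"}, ["kb_revision"])
```

Whether a given field may legitimately carry one of the explicit values is a grammar question and is not decided by this check.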
The "one level up" hole I closed myself
Three manifest-bound fields were previously free prose (digest_algorithm, full_document_hash_policy,
active_section_id_or_range). Free text is byte-exact but semantically loose — exactly the disguised-hardcode
class. Fix: the first two are fixed constants (grammar = exact match), and active_section_id_or_range
is a controlled vocabulary validated against the extractor's computed descriptor (doc 03). No free-text
authority field remains.
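The shape of that fix can be sketched as follows. The pinned values shown are placeholders, not the real constants (those live in doc 00), and `check_manifest_fields` plus the `computed_descriptor` argument are hypothetical names for the comparison against the extractor's output described in doc 03.

```python
ACTIVE_SECTION_VOCAB = frozenset({
    "WHOLE_DOCUMENT",
    "WHOLE_DOCUMENT_MINUS_SUPERSEDED_FENCES",
    "WHOLE_DOCUMENT_MINUS_EXCLUDE_AND_SUPERSEDED",
})

# Placeholder values only: the actual pinned constants are not given here.
PINNED = {
    "digest_algorithm": "PLACEHOLDER_DIGEST_ALGORITHM",
    "full_document_hash_policy": "PLACEHOLDER_HASH_POLICY",
}

def check_manifest_fields(manifest, computed_descriptor):
    """Return the names of manifest fields that fail; an empty list means all pass."""
    failed = []
    for field, pinned in PINNED.items():
        value = manifest.get(field)
        # Exact match on the encoded bytes: no trimming, no case folding.
        if not isinstance(value, str) or value.encode("utf-8") != pinned.encode("utf-8"):
            failed.append(field)
    declared = manifest.get("active_section_id_or_range")
    # Must be in the vocabulary AND agree with what the extractor computed.
    if declared not in ACTIVE_SECTION_VOCAB or declared != computed_descriptor:
        failed.append("active_section_id_or_range")
    return failed
```

The second condition is what removes the free-text loophole: a well-formed vocabulary token is still rejected if it disagrees with the computed descriptor.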
Computed evidence (doc 08, python == shasum)
TAB / LF / CR / NUL / backslash in a value → rejected; a reserved fence token inside a value → rejected;
null → rejected; empty → rejected; and after enforcing the policy the membership digest still reproduces
f2bda8effc7be19b54722828126b82d7d2d48bee5e5e5dc0c8f347ce210fe251. Each was executed, not asserted.