KB-A569

33 — Governance Input-Quality Gate Design (Branch C, 10 input states, bad-input ≠ governance-orphan, design-only, read-only zero mutation, 2026-06-01)

Revision 1
Tags: one-roof-governance, implementation-index, input-quality-gate, branch-c, input-provenance, source-trust, schema-contract-validation, registry-visibility, duplicate-conflict, stale-input, late-arriving, corrected-data, merge-supersession, source-drift, ten-input-states, bad-input-issue-routing, not-governance-orphan, birth-precedence, no-island, design-only, 2026-06-01


Path: knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/
Doc: 33.
Track: Branch C of the Backfill / Handoff / Input-Control addendum.
Builds on: docs 31 (backfill), 32 (handoff), 24/25 (T7/T6), concept canon 01–02, GPT direction (governance input control).
Status: DESIGN ONLY. No PG/Directus/Qdrant/Nuxt mutation; no table/view/function/trigger; no event/DOT registration; no emit; no approval/enactment. This doc specifies the gate's contract and states, not a running gate.
Owner (proposed): GOV-SIV runs the input gate (read/validate/classify); bad inputs become input-quality / system issues, never silent classifications and never mislabeled governance-orphans. It proposes; it never self-applies.
Evidence base: live read-only re-verified 2026-06-01.


0. §0-GOV declaration

§0-GOV Governance Coverage Declaration — Branch C (Input-Quality Gate)
  governed_objects:   [ input_quality_verdict ] (Class-2 process record on the candidate-state row)
  owner_per_scope:    { policy: GOV-COUNCIL, health: GOV-SIV, execution: GOV-DOT,
                        render: GOV-MOUT(TTL), approval: Điều32, audit: GOV-SIV }
  coverage_profile:   [ validation profile — required-fields, schema-contract, provenance, dedup ]
  axes_introduced:    [ none ]
  detection_path:     handoff signal (doc 32) + candidate key (doc 31/34) + birth_registry visibility
  issue_event_types:  [ input_incomplete, input_invalid_schema, input_duplicate, input_conflict,
                        input_stale, input_untrusted_source, input_quarantined ]
                      (register-before-emit, Điều 45 — NOT registered)
  exceptions:         [ none minted ]

1. Why an input gate (the systemic-wrongness problem)

GPT direction: "If input is not controlled, classification can be technically correct but systemically wrong. The scanner must know what it is allowed to classify, from which source snapshot, under which ruleset, and whether the upstream input is complete and trustworthy enough."

The T6 scanner (doc 25) currently assumes its inputs are clean: L1 enumerates, L2 drops unborn, L3+ classify. It has no gate for incomplete, invalid, duplicate, conflicting, stale, untrusted, or unregistered input. Without one, two failure modes appear:

  • False governance-orphans — an object that is merely incompletely registered or not-yet-visible gets classified governance_orphan, generating a wrong, noisy finding and (worse) a wrong remediation proposal.
  • Silent mis-classification — a duplicate/conflicting/stale input is classified as truth, corrupting the coverage invariant.

Branch C inserts a pre-classification input-quality gate between the handoff/candidate intake and the T6 detector's owner/profile layers (L3+). It is the governance analog of a birth gate, but for input fitness, not existence.


2. Placement in the layered architecture

Birth (Đ0-G/Đ19)  →  Registry (Đ2)  →  [ Handoff intake (doc 32) ]
        →  [ INPUT-QUALITY GATE (this doc) ]  →  [ Candidate scan (doc 34) ]
        →  Coverage detector (T6 L3–L6)  →  Production gate (fail-closed)

The gate runs inside the candidate-scan entry (logically T6 "L0", ahead of L1/L2 and the L3+ owner/profile work). Every object/handoff that reaches classification has first been assigned an input-quality state (§4). Only accepted_for_candidate_scan flows into owner/profile classification; all other states route to an input-quality issue and hold the object out of governance-orphan classification.
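
As an illustration only (this doc specifies a contract, not a running gate), the "L0" placement can be read as a partition step ahead of classification. The helper name partition_by_input_state and the dict-shaped rows are assumptions made for this sketch; only the field name input_quality_state and the state name accepted_for_candidate_scan come from §4.

```python
from typing import Iterable, List, Tuple

ACCEPTED = "accepted_for_candidate_scan"  # state name taken from §4


def partition_by_input_state(rows: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Split candidate rows into (classifiable, held).

    Hypothetical helper. A row with no assigned state is treated as held
    (fail-closed), since every object must carry a state before it is
    classified.
    """
    classifiable: List[dict] = []
    held: List[dict] = []
    for row in rows:
        if row.get("input_quality_state") == ACCEPTED:
            classifiable.append(row)
        else:
            held.append(row)
    return classifiable, held
```

Only the first list would be handed to owner/profile classification; the second would be routed to input-quality issues.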


3. What the gate validates (mission §6 coverage)

| Control | Source of truth (live) | Pass condition |
|---|---|---|
| Input provenance | handoff source/source_system/actor_ref (doc 32 §7); birth_registry.dot_origin | provenance present + resolvable |
| Source trust level | a trust policy (config; C-7) keyed by source_system/producer | source at or above the min trust for its target scope |
| Required fields | the object class's coverage profile (M-DEF-2) + birth_registry NOT-NULL contract (entity_code, collection_name, canonical_address) | all profile-required identity fields present |
| Schema / contract validation | meta_catalog/table_registry.fields/class registry shape | the input row conforms to the class's declared shape; Đ45 payload_classification='safe_metadata' for handoff signals |
| Registry visibility | birth_registry (born) + the class's member registry | the object is born and registered before governance coverage is attempted (GPT: "registry visibility requirement before governance coverage scan") |
| Duplicate detection | canonical_address / idempotency key (doc 31 §6) | exactly one live inventory row per canonical_address at the governance grain |
| Conflict detection | competing owner/scope rows (SB-2, when live) | no two authoritative claims for (object × scope) |
| Stale snapshot detection | source_snapshot_ref vs current snapshot (doc 31 §5; reuse evolution_snapshots) | the input was captured under the current (or a still-valid) snapshot |
| Late-arriving data | occurred_at vs the candidate's last scan_time | recognized and ordered, not dropped (§5) |
| Corrected data | a newer change for the same key (doc 32) | supersedes the prior input; re-opens the candidate (§5) |
| Merge / supersession | supersedes_id chains / retirement (doc 31 §7) | the surviving record is the candidate; merged-away records → retired |
| Source drift | business_logic_hash / source content hash drift (reuse system_issues.business_logic_hash) | source shape matches the contract it was registered under |
| Bad-input issue routing | system_issues (T7) | every reject becomes a routed issue, never a silent drop or a governance-orphan |
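
Two of the controls above (duplicate detection and stale snapshot detection) reduce to small predicates, sketched here under stated assumptions. The row shape ({"id", "canonical_address", "live"}), the function names, and the still_valid_refs parameter are hypothetical; a real gate would read these from the registries named in the table, not from in-memory dicts.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List


def find_duplicate_addresses(rows: Iterable[dict]) -> Dict[str, List[int]]:
    """Return {canonical_address: [row ids]} where more than one live row exists.

    Pass condition from the table: exactly one live inventory row per
    canonical_address. Non-live rows are ignored.
    """
    by_address: Dict[str, List[int]] = defaultdict(list)
    for row in rows:
        if row["live"]:
            by_address[row["canonical_address"]].append(row["id"])
    return {addr: ids for addr, ids in by_address.items() if len(ids) > 1}


def is_stale(
    source_snapshot_ref: str,
    current_ref: str,
    still_valid_refs: FrozenSet[str] = frozenset(),
) -> bool:
    """True when the input was captured under a snapshot that is neither
    the current one nor in the (assumed) set of still-valid snapshots."""
    return source_snapshot_ref != current_ref and source_snapshot_ref not in still_valid_refs
```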

4. The ten input states (mission §6) — definitions, predicate, routing, NOT-orphan rule

Each candidate-state row (doc 34) carries an input_quality_state. Only accepted_for_candidate_scan proceeds to owner/profile classification. All others suppress governance-orphan classification and route to an input-quality issue — this is the core "bad input ≠ governance-orphan" rule.

| State | Definition | Detect predicate | system_issues type / bucket | Route | Blocks orphan-classification? |
|---|---|---|---|---|---|
| accepted_for_candidate_scan | input is complete, in-contract, visible, trusted, fresh, unique | all §3 controls pass | none (proceeds) | → candidate scan (doc 34) | n/a |
| incomplete_input | required identity/profile fields missing | a profile-required field is null/absent | input_incomplete / thiếu_quan_hệ | the source/registry owner | YES: held, not orphaned |
| invalid_schema | input violates the class contract / carries data, not signal | shape mismatch or Đ45 payload violation | input_invalid_schema / sai_lệch_dữ_liệu | the source owner + GOV-SIV | YES |
| duplicate_candidate | >1 inventory row → same canonical_address | dedup key collision at governance grain | input_duplicate / sai_lệch_dữ_liệu | the source owner (dedup) | YES: collapses to one |
| conflict_candidate | competing authoritative claims for (object × scope) | two active owner/scope rows | input_conflict → reuses owner_conflict (T7 #4) | GOV-COUNCIL | YES: adjudicate first |
| stale_input | captured under a superseded snapshot/ruleset | source_snapshot_ref ≠ current & no longer valid | input_stale / sai_lệch_dữ_liệu | GOV-SIV (re-snapshot) | YES: re-evaluate, fail-closed |
| untrusted_source | producer below min trust for the target scope | trust policy (C-7) check fails | input_untrusted_source / thiếu_quan_hệ | GOV-COUNCIL (trust decision) | YES: cannot inject work |
| birth_or_registry_missing | object unborn/unregistered | not in birth_registry / no member row | reuse birth-orphan thiếu_mã_định_danh/thiếu_quan_hệ (Đ19) | birth scanner (Đ19) | YES: M-DEF-4 precedence, birth's job, 0 governance issue |
| needs_backfill | born/visible but never seeded into candidate-state | no candidate-state row under current ruleset | backfill_inventory_gap (doc 31 §10) | GOV-SIV → Branch A sweep | YES: route to backfill, not orphan |
| quarantined | failed-after-retry / poison / repeated source drift | DLQ (doc 31 §8 / 32 §6) or N failed validations | input_quarantined / silent_fail | GOV-SIV + source owner | YES: isolated, visible, not classified |

The NOT-orphan invariant (Branch C's reason to exist): a governance_orphan (T7 #1) may be raised only when input_quality_state = accepted_for_candidate_scan and the object is born/registered and no owner resolves. Any other state produces an input-quality finding instead. This makes "object is genuinely ungoverned" distinguishable from "we couldn't trust/complete/locate the input," which the GPT council identified as the systemic-wrongness risk.
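
The invariant can be written as a single decision function. This is a minimal sketch, not the gate itself: the function name permitted_finding and the returned labels "input_quality_issue" and "no_orphan_finding" are illustrative assumptions, while governance_orphan and accepted_for_candidate_scan are the names used in this doc.

```python
ACCEPTED = "accepted_for_candidate_scan"


def permitted_finding(
    input_quality_state: str,
    born_and_registered: bool,
    owner_resolves: bool,
) -> str:
    """Return which kind of finding the NOT-orphan invariant allows.

    governance_orphan is reachable only when all three conditions hold:
    the input was accepted, the object is born/registered, and no owner
    resolves. Every other combination yields a non-orphan result.
    """
    if input_quality_state != ACCEPTED:
        # Bad input is an input-quality issue, never a governance-orphan.
        return "input_quality_issue"
    if not born_and_registered:
        # Defensive, fail-closed branch: an accepted input should already
        # be visible, so treat the mismatch as an input problem.
        return "input_quality_issue"
    if owner_resolves:
        return "no_orphan_finding"
    return "governance_orphan"
```

For example, permitted_finding("stale_input", True, False) returns "input_quality_issue" even though no owner resolves, which is exactly the case the invariant exists to separate from a genuine orphan.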


5. Late-arriving, corrected, merged, drifting data (mission §6)

  • Late-arriving — a handoff whose occurred_at predates the candidate's last scan_time: it is not dropped; it dirties the group (doc 34) and is re-evaluated; ordering by occurred_at ensures the latest truth wins. A late birth never silently becomes a permanent orphan.
  • Corrected — a newer change for the same key supersedes the prior input: the candidate-state row is dirtied with dirty_reason='corrected', the prior input-quality verdict is cleared, and re-validation runs under the current snapshot/ruleset. Any open input_* issue auto-closes when the correction passes the gate.
  • Merge / supersession: supersedes_id chains resolve to the surviving record as the candidate; merged-away records get verdict retired (doc 31 §7) and are removed from the live coverage denominator (counted in retired/ignored).
  • Source drift — when a source's content hash drifts from the contract it was registered under (business_logic_hash change without a registered schema change), the gate raises governance_schema_drift (T7 #16) and sets stale_input until the contract is reconciled — protecting downstream classification from silently consuming a changed shape.
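
The merge/supersession rule above can be sketched as a chain walk. The record shape ({"id", "supersedes_id"}, where supersedes_id names the older record being replaced) is an assumption about how supersedes_id is stored, and both function names are hypothetical. The sketch also assumes at most one successor per record; competing successors would be a conflict case, not handled here.

```python
from typing import Dict, Iterable, Optional


def build_superseded_by(records: Iterable[dict]) -> Dict[str, str]:
    """Map each superseded (older) id to the id of the record replacing it."""
    mapping: Dict[str, str] = {}
    for record in records:
        older: Optional[str] = record.get("supersedes_id")
        if older is not None:
            mapping[older] = record["id"]
    return mapping


def resolve_survivor(record_id: str, superseded_by: Dict[str, str]) -> str:
    """Follow the chain forward until a record nothing supersedes.

    Raises ValueError on a cycle instead of looping forever, so a
    malformed chain surfaces as an error rather than a silent result.
    """
    seen = set()
    current = record_id
    while current in superseded_by:
        if current in seen:
            raise ValueError(f"supersession cycle at {current!r}")
        seen.add(current)
        current = superseded_by[current]
    return current
```

Under this reading, every id that appears as a key of the mapping would take verdict retired, and the resolved survivor is the candidate.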

6. Interaction with backfill (doc 31), handoff (doc 32), candidate scan (doc 34)

  • Backfill items pass through the gate during seeding. An item that fails the gate is seeded under its input-quality state, not as relevant/orphan. needs_backfill cannot occur for a backfill item, but incomplete_input and birth_or_registry_missing are common; birth_or_registry_missing correctly yields to birth (Đ19).
  • Handoff signals carry provenance/trust the gate consumes (doc 32 §7); a handoff that violates Đ45 (payload_classification ≠ safe_metadata) → invalid_schema, captured and flagged, never enqueued as truth.
  • Candidate scan only ever sees accepted_for_candidate_scan inputs for owner/profile classification; everything else is parked in a state with an issue and a route, and re-enters the gate when dirtied (corrected/late/snapshot-refreshed).

7. Suppression & precedence (consistent with T7 §9)

The gate's NOT-orphan suppression composes with T7's suppression precedence (highest first):

  1. Birth precedence (M-DEF-4): birth_or_registry_missing yields to Đ19; one root cause, shared coalesce_key.
  2. Input-quality hold (this doc) — any non-accepted input state holds governance-orphan/owner classification and substitutes an input-quality issue.
  3. Granted governed exception (M-DEF-6) — an active exception suppresses the matching finding for its TTL (stale fingerprint lifts it, fail-closed).
  4. Class-0 (M-DEF-1) — private/user-scoped objects suppressed unless they reach shared truth.
  5. Anti-hiding floor (M-DEF-7) — authority-critical gaps never suppressed by inheritance.

No suppression by self-approval; every hold is a recorded input-quality fact + a routed issue.
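
The ordering above can be sketched as a lookup that returns the highest-ranked rule that applies. The rule identifiers are invented labels for the five list items, and the function does not model the semantics of each rule (in particular the anti-hiding floor); it only demonstrates "highest first".

```python
from typing import Iterable, Optional

# Labels are hypothetical; order mirrors the numbered list (highest first).
SUPPRESSION_PRECEDENCE = (
    "birth_precedence",      # 1. M-DEF-4
    "input_quality_hold",    # 2. this doc
    "governed_exception",    # 3. M-DEF-6
    "class_0",               # 4. M-DEF-1
    "anti_hiding_floor",     # 5. M-DEF-7
)


def highest_precedence(applicable: Iterable[str]) -> Optional[str]:
    """Return the highest-ranked applicable rule, or None if none applies."""
    applicable_set = set(applicable)
    for rule in SUPPRESSION_PRECEDENCE:
        if rule in applicable_set:
            return rule
    return None
```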


8. Input-quality issue types (proposed, register-before-emit — NOT registered)

All ride system_issues (free-text issue_type, reuse buckets) + the governance event domain; anti-spam (coalesce/cooldown/ceiling/digest) reused from T7. Coalesce key pattern: gov:input:{state}:{coalesce_anchor}.

| issue_type | bucket | severity (base→max) | detection event | auto-close |
|---|---|---|---|---|
| input_incomplete | thiếu_quan_hệ | low→medium | governance.input.incomplete | required fields present |
| input_invalid_schema | sai_lệch_dữ_liệu | medium→high | governance.input.invalid_schema | conforms to contract |
| input_duplicate | sai_lệch_dữ_liệu | low→medium | governance.input.duplicate | dedup resolved to one |
| input_conflict (→owner_conflict) | sai_lệch_dữ_liệu | high | governance.input.conflict | one authoritative claim remains |
| input_stale | sai_lệch_dữ_liệu | medium | governance.input.stale | re-validated under current snapshot |
| input_untrusted_source | thiếu_quan_hệ | medium→high | governance.input.untrusted_source | source trust established (C-7) |
| input_quarantined | silent_fail | high | governance.input.quarantined | gate passes after correction |

(birth_or_registry_missing reuses the Đ19 birth-orphan types; needs_backfill reuses backfill_inventory_gap, doc 31.)
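
A small sketch of the coalesce key pattern and the base→max severity bounds from the table above. The key format and severity pairs are taken from this section; the step-based escalation rule in bounded_severity is an assumption, since the doc fixes only the base and maximum, not how severity moves between them.

```python
SEVERITY_ORDER = ("low", "medium", "high")

# (base, max) per proposed issue_type, copied from the table above.
ISSUE_SEVERITY = {
    "input_incomplete": ("low", "medium"),
    "input_invalid_schema": ("medium", "high"),
    "input_duplicate": ("low", "medium"),
    "input_conflict": ("high", "high"),
    "input_stale": ("medium", "medium"),
    "input_untrusted_source": ("medium", "high"),
    "input_quarantined": ("high", "high"),
}


def coalesce_key(state: str, coalesce_anchor: str) -> str:
    """Build a key following the pattern gov:input:{state}:{coalesce_anchor}."""
    return f"gov:input:{state}:{coalesce_anchor}"


def bounded_severity(issue_type: str, escalation_steps: int = 0) -> str:
    """Raise severity by whole steps from base, never past the type's max.

    The step model is hypothetical; negative steps are clamped to zero.
    """
    base, ceiling = ISSUE_SEVERITY[issue_type]
    start = SEVERITY_ORDER.index(base)
    limit = SEVERITY_ORDER.index(ceiling)
    return SEVERITY_ORDER[min(start + max(escalation_steps, 0), limit)]
```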


9. No-hardcode / no-island

  • No-hardcode — required fields ← coverage profile (M-DEF-2 registry) + birth_registry NOT-NULL contract; schema ← meta_catalog/table_registry.fields; trust levels ← a config trust policy (C-7), not a code list; snapshot/ruleset ← evolution_snapshots/governance_ruleset (docs 31/34). No source/trust/field literal is embedded.
  • No-island — one issue store (system_issues), one event path (event_outbox, Đ45), one audit (registry_changelog); the gate reuses T7's taxonomy and anti-spam. It mints no parallel validation service or input table — the verdict lives on the candidate-state row (doc 34).

10. Dependencies, gates, NO-GO

| Capability | Needs | Status |
|---|---|---|
| Validate provenance/required-fields/schema/visibility (read-only) | nothing live | designable now |
| Conflict detection | SB-2 views | gated; degrades pre-SB-2 |
| Persist input_quality_state | candidate-state store (SB-10, doc 34) | NO-GO |
| Emit input-quality findings | governance domain registered+active (SB-11/SB-4) | NO-GO |
| Source-trust policy + min-trust per scope | C-7 (council) | decision pending |
| Stale detection vs snapshot | snapshot/ruleset registry (SB-12) | NO-GO |

No gate may be satisfied by self-approval.


11. Verdict

Branch C input-quality gate design: COMPLETE. All mission-mandated controls (provenance, source trust, required fields, schema/contract, registry visibility, duplicate, conflict, stale, late-arriving, corrected, merge/supersession, source drift, bad-input routing) are specified, and all ten input states are defined with detect predicate, issue type/bucket, route, and the NOT-orphan rule. The gate sits as a pre-classification "L0" inside the candidate scan; bad inputs become routed input-quality issues, never silent classifications and never mislabeled governance-orphans (the systemic-wrongness fix). It composes with T7 suppression precedence and reuses system_issues/event_outbox/registry_changelog + the coverage-profile/meta_catalog registries — no island, no hardcode. Nothing registered, emitted, or mutated. Next: doc 34 (the incremental dirty-group candidate scan the gate feeds).

knowledge/dev/reports/architecture/one-roof-governance-technical-addendum-and-implementation-index-2026-06-01/33-governance-input-quality-gate-design.md