KB-4C86

02 — Leaf-Set Definition (Branch B)

Revision 1
registries-pivot · leaf-set · branch-b · accounting-invariant · rollup · data-driven · no-hardcode · 2026-05-31

date: 2026-05-31
verdict: LEAF_SET_RULE = EXPLICIT + DATA-DRIVEN (no hardcoded code list); 160 leaf / 9 rollup


The accounting invariant total = counted + orphan + phantom is only meaningful over a set with no nested rollups. meta_catalog (169 rows) mixes leaf object-sets with rollup/meta summary rows, so a blind SUM counts the same records twice: once in the leaf row and again in the rollup row that aggregates it. This is exactly the disguised-math trap that Đ28 forbids.

composition_level distribution (live, all 169 rows)

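| composition_level | rows | Σrecord | Σactual | Σorphan |
|---|---:|---:|---:|---:|
| atom | 77 | 3,635,579 | 3,836,259 | 0 |
| molecule | 51 | 1,670 | 1,677 | 0 |
| compound | 34 | 846 | 749 | 0 |
| material | 2 | 58 | 113 | 0 |
| meta | 3 | 517 | null | 161 |
| product | 1 | 0 | 0 | 0 |
| building | 1 | 0 | 0 | 0 |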

The 9 rollup/meta rows (must be EXCLUDED from the invariant)

| code | entity_type | composition_level | record | actual | note |
|---|---|---|---:|---:|---|
| CAT-ALL | all | atom | 1,682,270 | 1,919,748 | grand rollup of all atoms; sits inside the atom level → the double-count trap |
| CAT-MOL | molecule_total | molecule | 774 | 766 | per-layer rollup |
| CAT-CMP | compound_total | compound | 423 | 326 | per-layer rollup |
| CAT-MAT | material_total | material | 0 | 55 | per-layer rollup |
| CAT-BLD | building_total | building | 0 | 0 | per-layer rollup |
| CAT-PRD | product_total | product | 0 | 0 | per-layer rollup |
| CAT-DOT | dot_total | meta | 307 | null | DOT registry rollup (carries 140 orphans) |
| CAT-COL | collection_total | meta | 168 | null | collection registry rollup (20 orphans) |
| CAT-SPE | species_total | meta | 42 | null | species rollup (1 orphan) |
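The effect of excluding these nine rows can be checked with plain arithmetic. The sketch below uses only the row counts and record figures from the two tables above; the variable names are illustrative, and the resulting leaf sum is derived from this page's snapshot, not read from the live database.

```python
# (rows, sum of record) per composition_level, copied from the distribution table.
LEVELS = {
    "atom": (77, 3_635_579),
    "molecule": (51, 1_670),
    "compound": (34, 846),
    "material": (2, 58),
    "meta": (3, 517),
    "product": (1, 0),
    "building": (1, 0),
}

# code -> (composition_level, record), copied from the rollup table.
ROLLUPS = {
    "CAT-ALL": ("atom", 1_682_270),
    "CAT-MOL": ("molecule", 774),
    "CAT-CMP": ("compound", 423),
    "CAT-MAT": ("material", 0),
    "CAT-BLD": ("building", 0),
    "CAT-PRD": ("product", 0),
    "CAT-DOT": ("meta", 307),
    "CAT-COL": ("meta", 168),
    "CAT-SPE": ("meta", 42),
}

total_rows = sum(rows for rows, _ in LEVELS.values())
blind_sum = sum(record for _, record in LEVELS.values())    # SUM over all 169 rows
rollup_sum = sum(record for _, record in ROLLUPS.values())  # what the 9 rollups contribute
leaf_rows = total_rows - len(ROLLUPS)
leaf_sum = blind_sum - rollup_sum                           # record sum over leaves only
meta_excluded = sum(r for level, r in ROLLUPS.values() if level == "meta")

print(total_rows, leaf_rows)            # 169 160
print(blind_sum, rollup_sum, leaf_sum)  # 3638670 1683984 1954686
print(meta_excluded)                    # 517
```

On these figures a blind SUM of record overstates the leaf total by 1,683,984, almost all of it from CAT-ALL.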

LEAF-SET RULE (data-driven; no hardcoded code list)

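-- leaf_set: meta_catalog rows that are neither meta rows nor rollups
SELECT *
FROM meta_catalog
WHERE composition_level <> 'meta'                 -- excludes CAT-DOT/COL/SPE
  AND entity_type NOT LIKE '%\_total' ESCAPE '\'  -- excludes CAT-MOL/CMP/MAT/BLD/PRD; '_' is escaped because a bare '_' in LIKE matches any single character
  AND entity_type <> 'all';                       -- excludes CAT-ALL (lives inside 'atom')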
  • Live result: 160 leaf rows (169 − 9). Cross-check inside the rehearsal: meta_rollup_record_excluded = 517 = 307+168+42 ✓ (the three meta rows), and CAT-ALL/layer-totals excluded by entity_type.
  • The rule keys only on existing data columns (composition_level, entity_type); there is no hardcoded list of CAT codes. It therefore survives new categories, provided they follow the current naming conventions: any new *_total, all, or meta row is auto-excluded, and any new real object-set is auto-included.
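The rule can be exercised against a throwaway table to see how it treats rows that do not exist yet. The sketch below is a test harness only: it runs the rule as SQL inside an in-memory SQLite database, so the rule itself is not re-implemented outside SQL. The nine rollup rows come from the table above; every `X-` row is hypothetical. It also shows one subtlety of the pattern: in SQL LIKE a bare `_` matches any single character, so the pattern `'%_total'` written without an escape would also drop a leaf whose entity_type merely ends in "total".

```python
import sqlite3

# The nine rollup/meta rows (code, entity_type, composition_level) from the table above.
ROLLUPS = [
    ("CAT-ALL", "all", "atom"),
    ("CAT-MOL", "molecule_total", "molecule"),
    ("CAT-CMP", "compound_total", "compound"),
    ("CAT-MAT", "material_total", "material"),
    ("CAT-BLD", "building_total", "building"),
    ("CAT-PRD", "product_total", "product"),
    ("CAT-DOT", "dot_total", "meta"),
    ("CAT-COL", "collection_total", "meta"),
    ("CAT-SPE", "species_total", "meta"),
]

# Hypothetical rows, invented for this sketch only.
HYPOTHETICAL = [
    ("X-LEAF-1", "birth", "atom"),         # stands in for a real object-set
    ("X-LEAF-2", "subtotal", "molecule"),  # leaf whose name merely ends in "total"
    ("X-ROLL-1", "widget_total", "atom"),  # a future per-layer rollup
    ("X-ROLL-2", "widget_index", "meta"),  # a future meta registry row
]

RULE_ESCAPED = r"""
    SELECT code FROM meta_catalog
    WHERE composition_level <> 'meta'
      AND entity_type NOT LIKE '%\_total' ESCAPE '\'
      AND entity_type <> 'all'
    ORDER BY code
"""
# Same rule with the underscore left as a single-character wildcard.
RULE_UNESCAPED = RULE_ESCAPED.replace(r"'%\_total' ESCAPE '\'", "'%_total'")


def leaf_codes(rule_sql):
    """Load the sample rows into an in-memory table and return the codes the rule keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE meta_catalog (code TEXT, entity_type TEXT, composition_level TEXT)"
    )
    conn.executemany("INSERT INTO meta_catalog VALUES (?, ?, ?)", ROLLUPS + HYPOTHETICAL)
    codes = [row[0] for row in conn.execute(rule_sql)]
    conn.close()
    return codes


print(leaf_codes(RULE_ESCAPED))    # ['X-LEAF-1', 'X-LEAF-2']
print(leaf_codes(RULE_UNESCAPED))  # ['X-LEAF-1']
```

Both variants exclude all nine known rollups and the two hypothetical future rollups; they differ only on the `subtotal` row, which is why escaping the underscore is the safer reading of "ends in _total".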

Which rows are what

  • Leaf / real object-set (160): each maps to a concrete substrate via source_location (e.g. Directus:birth_registry, Directus:entity_labels, File:dot/bin/). Top leaves: CAT-023 birth 980,378 · CAT-068 entity_labels 690,341 · CAT-017 system_issue 171,939 · CAT-016 changelog 66,095 · CAT-141 measurement_log 31,984.
  • Rollup / derived-summary (9): the table re-aggregates leaves (per-layer *_total, the global all, and the three meta registry totals). These are display conveniences, not object-sets.
  • Excluded from invariant computation: the 9 rollups.

Residual gaps (recorded, not hidden)

  • LEAF_SET is necessary but not provably MECE (mutually exclusive, collectively exhaustive). Even within the 160 leaves there may be overlap (e.g. a "junction_table" count vs an "all" count of the same physical rows). The rule removes the known rollups; it cannot guarantee mutual exclusivity. → The true cross-check anchor is PIV-500 (a grand-total pivot over the universal object substrate, e.g. birth_registry), not a catalog re-sum. PIV-500 is PIVOT_MISSING (doc 03).
  • 5 leaf rows have actual_count IS NULL (leaf_actual_null = 5) → those leaves are count_integrity_status='unverified' (can't be reconciled until scanned).
  • The leaf rule lives in PG (a view v_registry_leaf_set, rehearsed in doc 04). It must never be re-implemented as a Nuxt filter (Đ28 NT-D1).
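A minimal sketch of why leaf_actual_null should be counted over the view and not over the raw table: the three meta rollups also carry a null actual, so a NULL count on meta_catalog itself would mix them in with the unscanned leaves. The view name v_registry_leaf_set and the column actual_count come from the text above; the record_count column name, the toy rows, and the use of SQLite as a stand-in for PG are assumptions made for illustration. The actual view definition is the one rehearsed in doc 04.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(r"""
    CREATE TABLE meta_catalog (
        code TEXT,
        entity_type TEXT,
        composition_level TEXT,
        record_count INTEGER,   -- assumed column name
        actual_count INTEGER
    );

    -- Toy rows, not live data: two leaves (one never scanned) and one meta rollup.
    INSERT INTO meta_catalog VALUES
        ('X-1', 'birth',     'atom', 10, 10),
        ('X-2', 'changelog', 'atom',  5, NULL),
        ('X-3', 'dot_total', 'meta',  7, NULL);

    -- The leaf rule expressed once, as a view.
    CREATE VIEW v_registry_leaf_set AS
    SELECT * FROM meta_catalog
    WHERE composition_level <> 'meta'
      AND entity_type NOT LIKE '%\_total' ESCAPE '\'
      AND entity_type <> 'all';
""")

# NULL actuals among leaves only: these are the 'unverified' rows.
leaf_actual_null = conn.execute(
    "SELECT COUNT(*) FROM v_registry_leaf_set WHERE actual_count IS NULL"
).fetchone()[0]

# NULL actuals over the raw table: also picks up the meta rollup.
raw_actual_null = conn.execute(
    "SELECT COUNT(*) FROM meta_catalog WHERE actual_count IS NULL"
).fetchone()[0]

unverified = [
    row[0]
    for row in conn.execute(
        "SELECT code FROM v_registry_leaf_set WHERE actual_count IS NULL ORDER BY code"
    )
]
conn.close()

print(leaf_actual_null, raw_actual_null, unverified)  # 1 2 ['X-2']
```

Keeping the filter in one view means every downstream count inherits the same leaf definition, which is the point of not re-implementing it in the front end.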
knowledge/dev/reports/architecture/registries-pivot-p0-p1-count-integrity-view-rehearsal-2026-05-31/02-leaf-set-definition.md