06 — Auto-Label / Grouping Rehearsal (Branch F)
title: 06 — Auto-Label / Grouping Rehearsal (Branch F)
date: 2026-05-31
verdict: classification machinery = REUSE (exists, governed, pivot-countable); ONE gap = no per-species ungrouped-threshold column (THRESHOLD_REGISTRY_GAP)
Binding clarification applied: 50 is a MAX ungrouped display ceiling, not a target. Reuse existing classification immediately; start classifying before a list becomes unmanageable; pagination ≠ grouping; labels must be PG-backed/governed/pivot-countable/registered (no frontend arrays).
A. Live label machinery (REUSE — verified structure)
| object | n | columns that matter |
|---|---|---|
| label_rules | 38 | facet_id, rule_type, condition jsonb, result_label, priority, skip_wide_warning (a wide-list concept already in the engine), status |
| taxonomy | 58 | code, facet_id, parent_id, parent_facet, depth, scope[], status, replaced_by, _dot_origin (governed tree, Đ24) |
| taxonomy_facets | 10 | code, cardinality (single/multiple), max_labels_per_entity (0–3), status |
| entity_labels | 718,744 | entity_code, label_code, rule_id, assigned_by, assigned_at → grouping a list = GROUP BY label_code (pivot-countable) |
| species_collection_map | 164 | discriminator_field/value/operator/config → intrinsic grouping dimension |
→ No new label store is needed. The classification rule store (label_rules), the governed label tree (taxonomy+taxonomy_facets), the applied assignments (entity_labels), and the discriminator dimensions (species_collection_map) all exist. Work = (a) carry the threshold/decision on the contract, (b) resolve every label_ref to a registered row, (c) make group counts pivot-backed (PIV-31x, PIVOT_MISSING today).
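
As a rough illustration of why entity_labels is described as pivot-countable, the sketch below counts assignments per label_code in plain Python, mirroring a GROUP BY label_code. Only the column names come from the table above; the sample rows, the label codes, and the helper name `group_counts` are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping


def group_counts(entity_labels: Iterable[Mapping[str, str]]) -> dict[str, int]:
    """Count assignments per label_code (in-memory analogue of GROUP BY label_code)."""
    return dict(Counter(row["label_code"] for row in entity_labels))


# Hypothetical sample rows; the real assignments live in entity_labels in PG.
SAMPLE_ROWS = [
    {"entity_code": "E-1", "label_code": "LBL-A"},
    {"entity_code": "E-2", "label_code": "LBL-A"},
    {"entity_code": "E-3", "label_code": "LBL-B"},
]

if __name__ == "__main__":
    print(group_counts(SAMPLE_ROWS))
```

In the design described here the counts would come from a pivot (PIV-31x) rather than from application code; the snippet only shows the shape of the aggregation.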
B. classification_status / grouping_required rule (server-side; Nuxt holds no threshold)

    classify(node):
        if has_existing_grouping(node):   # taxonomy/label_rules/species_collection_map present
            classification_status='classified'; reuse immediately
        elif node.children_count <= threshold(node.species):   # threshold from PG (see GAP), default 50 = MAX
            classification_status='inspectable'; grouping_required=false
        else:
            classification_status='classification_required'; grouping_required=true
            dim = suggest_dimension(node)   # taxonomy_matrix / facets / species attrs / discriminator
            if dim is null: classification_status='LABEL_MISSING'; emit system_issues('label_missing')   # propose type (doc 08)

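
The rule above can be sketched as runnable Python. This is a minimal illustration under stated assumptions, not the server-side implementation: the `Node` and `Classification` types, the `registry` dict standing in for the PG threshold lookup, and the choice to report grouping_required=False for already-classified nodes are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

DEFAULT_MAX_UNGROUPED = 50  # stand-in only; the rule expects this to come from PG


@dataclass
class Node:
    species: str
    children_count: int
    has_existing_grouping: bool = False
    suggested_dimension: Optional[str] = None  # hypothetical result of suggest_dimension


@dataclass
class Classification:
    classification_status: str
    grouping_required: bool
    dimension: Optional[str] = None
    issue: Optional[str] = None


def threshold(species: str, registry: dict[str, int]) -> int:
    """Per-species ceiling, falling back to the default when no row exists."""
    return registry.get(species, DEFAULT_MAX_UNGROUPED)


def classify(node: Node, registry: dict[str, int]) -> Classification:
    if node.has_existing_grouping:
        # Assumption: an existing grouping is reused, so nothing new is required.
        return Classification("classified", grouping_required=False)
    if node.children_count <= threshold(node.species, registry):
        return Classification("inspectable", grouping_required=False)
    if node.suggested_dimension is None:
        # No usable dimension: grouping is still required, and an issue is raised.
        return Classification("LABEL_MISSING", grouping_required=True,
                              issue="label_missing")
    return Classification("classification_required", grouping_required=True,
                          dimension=node.suggested_dimension)


if __name__ == "__main__":
    # birth and its row count are taken from section C; the dimension name is illustrative.
    print(classify(Node("birth", 980_378, suggested_dimension="discriminator_field"), {}))
    print(classify(Node("small_list", 12), {}))
```

Note that the comparison is `<=`, so a list with exactly 50 children stays inspectable, which matches reading 50 as a maximum rather than a target.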
C. Rehearsed view v_registry_label_grouping_required (BEGIN..ROLLBACK, doc 04 #7)
- Compiled over the 160-leaf set; 27 leaves exceed the default-50 ceiling (e.g. birth 980,378, entity_labels 690,341, system_issue 171,939, changelog 66,095, measurement_log 31,984, …). These are the lists that must be semantically grouped, not just paginated.
- has_grouping_dim_available = true (discriminator fields exist in species_collection_map).
D. THE GAP — THRESHOLD_REGISTRY_GAP
taxonomy_facets.max_labels_per_entity governs how many labels of a facet an entity may carry — it is not an ungrouped-display ceiling. No column anywhere stores a per-species "max rows shown ungrouped" threshold. Therefore:
- The literal 50 in the rehearsed view (and any in Nuxt) is a hardcode that must be removed.
- Propose (additive, gated): a per-species threshold home, either a new meta_catalog.max_ungrouped_threshold / entity_species.max_ungrouped_threshold column, or a small display_policy(species_code, max_ungrouped, default 50) table, born + registered + (optionally) pivoted. Until then, max_ungrouped_threshold on the contract = the literal default flagged THRESHOLD_REGISTRY_GAP.
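
To make the proposed lookup concrete, here is a hedged sketch of the display_policy option using Python's built-in sqlite3 with an in-memory database as a stand-in for PG. The table and column names follow the proposal above; the helper names, the sample row, and its value of 25 are hypothetical, and real PG DDL would differ in detail.

```python
from __future__ import annotations

import sqlite3
from typing import Optional

DEFAULT_MAX_UNGROUPED = 50
GAP_FLAG = "THRESHOLD_REGISTRY_GAP"


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the proposed display_policy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE display_policy ("
        " species_code TEXT PRIMARY KEY,"
        " max_ungrouped INTEGER NOT NULL DEFAULT 50)"
    )
    # Hypothetical registered policy row, for illustration only.
    conn.execute(
        "INSERT INTO display_policy (species_code, max_ungrouped) VALUES (?, ?)",
        ("changelog", 25),
    )
    return conn


def resolve_threshold(conn: sqlite3.Connection,
                      species_code: str) -> tuple[int, Optional[str]]:
    """Return (threshold, flag); the flag marks values that fell back to the default."""
    row = conn.execute(
        "SELECT max_ungrouped FROM display_policy WHERE species_code = ?",
        (species_code,),
    ).fetchone()
    if row is None:
        return DEFAULT_MAX_UNGROUPED, GAP_FLAG
    return row[0], None


if __name__ == "__main__":
    conn = make_demo_db()
    print(resolve_threshold(conn, "changelog"))
    print(resolve_threshold(conn, "birth"))
```

Returning the flag alongside the value keeps the fallback visible on the contract, so a consumer can tell a registered threshold from the flagged default.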
E. Label reuse map / missing-grouper gaps
| need | reuse | gap |
|---|---|---|
| classification rule store | label_rules (38) | — |
| governed label tree | taxonomy(58)/taxonomy_facets(10) | — |
| applied labels (for group counts) | entity_labels(718,744) | no PIV-31x label-by-facet pivot → counts PIVOT_MISSING; also CAT-068 hidden drift (doc 03) |
| intrinsic dimension | species_collection_map.discriminator_* | — |
| ungrouped display ceiling | — | THRESHOLD_REGISTRY_GAP |
F. Verdict
REUSE for all classification machinery; NEW (small) for the threshold home; NEW pivot PIV-31x for label counts. PASS conditions of doc-07 design met except the threshold must move from a constant to a PG column. No frontend label array, no hardcoded threshold permitted at implementation.