04 — Auto-Label / Grouping Policy (50 = MAX Ungrouped Ceiling) + List Contract
date: 2026-05-31
Incorporates BOTH addenda, especially the threshold clarification: 50 is a MAXIMUM ungrouped display threshold, not a target. It is NOT "wait until 50, then classify."
Correct interpretation (binding)
- If a list already has classification/labels/groups → reuse the existing grouping immediately (do not wait).
- If a list has no classification and is becoming too long to inspect safely → classification must start immediately, before it becomes unmanageable.
- Default maximum ungrouped threshold = 50 rows, but some species/list types require a smaller threshold (`max_ungrouped_threshold` is per-species, configurable in PG).
- Goal: every displayed layer stays short, inspectable, scalable.
- Pagination is not enough. Pagination helps the UI; semantic grouping/labeling is required when a list is too long.
- Labels MUST be PG-backed, governed, pivot-countable, and themselves registered. No frontend hardcoded label arrays.
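The per-species ceiling can be illustrated with a minimal sketch. The function name, the in-memory mapping, and the species names are hypothetical stand-ins for values that would be read from PG. The clamp (an override may lower the ceiling but never raise it) is an assumption drawn from the "can be smaller" wording, not a confirmed rule.

```python
from typing import Mapping


def resolve_max_ungrouped_threshold(
    species: str,
    per_species: Mapping[str, int],
    default_ceiling: int,
) -> int:
    """Return the ungrouped ceiling for one species.

    per_species and default_ceiling stand in for rows read from PG.
    Assumption: an override may only lower the ceiling, never raise it.
    """
    override = per_species.get(species)
    if override is None:
        return default_ceiling
    return min(override, default_ceiling)


# Hypothetical values, as they might be loaded from PG at request time.
overrides_from_pg = {"audit_event": 20, "document": 80}
default_from_pg = 50  # data loaded from PG, not a code constant

for name in ("audit_event", "document", "unknown_species"):
    print(name, resolve_max_ungrouped_threshold(name, overrides_from_pg, default_from_pg))
```

Passing the default in as an argument keeps the ceiling as data, so nothing in the caller needs a literal threshold.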
Live anchors (REUSE/EXTEND — Đ24 Label Law v1.3 already governs this)
- `label_rules` (38): `condition jsonb` → `result_label`, `priority`, `status`, and crucially `skip_wide_warning bool`: a wide-list concept already exists in the rule engine. REUSE as the classification rule store.
- `taxonomy` (58) / `taxonomy_facets` (10) / `taxonomy_matrix` (36): governed label tree (Đ24, cycle-check depth<5). `taxonomy_facets.max_labels_per_entity` + `cardinality` already model grouping cardinality. `taxonomy_matrix(facet_id, composition_level, requirement)` says which facet is required at which composition level → the natural source for `child_grouping_dimension`. REUSE.
- `entity_labels` (718,744): applied entity↔label assignments (`entity_code`, `label_code`, `rule_id`). Labels are heavily used in practice; this is live, not theoretical. REUSE.
- `species_collection_map` (164): `discriminator_field/value/operator` → grouping a collection by an intrinsic dimension. REUSE.
- `collection_groups` (9) + `entity_species` (42, has `parent_id` + `depth`): existing grouping nodes. REUSE.
→ No new label store is needed. The work is: (a) carry the threshold/decision on the list contract, (b) ensure each grouping label resolves to a registered taxonomy/label_rules row, (c) make group counts pivot-backed.
Decision algorithm (server-side; Nuxt holds no thresholds)
    classify_list(node):
        if has_existing_grouping(node):    # taxonomy/label_rules/species_collection_map present
            classification_status := 'classified'
            return reuse_grouping(node)    # immediately
        threshold := max_ungrouped_threshold(node.species)    # PG; default 50, may be smaller
        if node.children_count <= threshold:
            classification_status := 'inspectable'
            grouping_required := false
            return show_children_directly(node)
        # too long, no classification → must act now (not at 50, but as soon as it exceeds the species ceiling)
        classification_status := 'classification_required'
        grouping_required := true
        grouping_reason := 'exceeds max_ungrouped_threshold(' || threshold || ')'
        dim := suggest_dimension(node)    # from taxonomy_matrix / facets / species attrs
        if dim is null:
            label_status := 'LABEL_MISSING'    # propose label generation, governed
            emit system_issues(issue_type='label_missing', ...)    # propose type
        suggested_next_grouping := dim
        classification_workflow_trigger := 'label.classify'    # design; see doc 06
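The pseudocode above can be exercised as a small runnable sketch. The `ListNode` and `Decision` types are hypothetical. The threshold is passed in rather than looked up, `existing_grouping` and `suggested_dimension` stand in for `has_existing_grouping` and `suggest_dimension`, and the `system_issues` emission is omitted. Setting `grouping_required` to false in the reuse branch is an assumption, since the pseudocode leaves it unset there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListNode:
    species: str
    children_count: int
    existing_grouping: Optional[str] = None    # stand-in for has_existing_grouping()
    suggested_dimension: Optional[str] = None  # stand-in for suggest_dimension()


@dataclass
class Decision:
    classification_status: str
    grouping_required: bool
    classification_dimension: Optional[str] = None
    grouping_reason: Optional[str] = None
    label_status: Optional[str] = None
    suggested_next_grouping: Optional[str] = None
    classification_workflow_trigger: Optional[str] = None


def classify_list(node: ListNode, threshold: int) -> Decision:
    # 1. An existing grouping is reused immediately, whatever the list length.
    if node.existing_grouping is not None:
        return Decision(
            "classified",
            grouping_required=False,  # assumption: nothing further is required
            classification_dimension=node.existing_grouping,
        )

    # 2. At or below the species ceiling: show the children directly.
    if node.children_count <= threshold:
        return Decision("inspectable", grouping_required=False)

    # 3. Over the ceiling with no classification: grouping is required now.
    decision = Decision(
        "classification_required",
        grouping_required=True,
        grouping_reason=f"exceeds max_ungrouped_threshold({threshold})",
        suggested_next_grouping=node.suggested_dimension,
        classification_workflow_trigger="label.classify",
    )
    if node.suggested_dimension is None:
        # No usable dimension: flag the gap (the system_issues row is not modeled here).
        decision.label_status = "LABEL_MISSING"
    return decision


print(classify_list(ListNode("document", 12), threshold=50))
print(classify_list(ListNode("document", 51), threshold=50))
print(classify_list(ListNode("document", 900, existing_grouping="doc_type"), threshold=50))
```

The third call shows the ordering that matters most: a list of 900 rows with an existing grouping is never held back by the threshold check.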
List/node contract additions (from threshold addendum)
classification_status (classified | inspectable | classification_required | LABEL_MISSING) · classification_dimension · label_ref (→ taxonomy.code / label_rules.id) · grouping_required · grouping_reason · max_ungrouped_threshold (per-species, default 50) · suggested_next_grouping · classification_workflow_trigger.
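As an illustration of the contract fields listed above, the sketch below checks a node payload for the required keys and an allowed status. The payload values (`facet:doc_type`, `doc_type`) and the helper name are hypothetical. The rule that `grouping_required` implies a non-empty `grouping_reason` is an assumption, not something the contract states.

```python
from typing import Any, Dict, List

CONTRACT_FIELDS = {
    "classification_status",
    "classification_dimension",
    "label_ref",
    "grouping_required",
    "grouping_reason",
    "max_ungrouped_threshold",
    "suggested_next_grouping",
    "classification_workflow_trigger",
}
ALLOWED_STATUS = {"classified", "inspectable", "classification_required", "LABEL_MISSING"}


def contract_problems(node: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the payload passes."""
    problems = []
    missing = CONTRACT_FIELDS - set(node)
    if missing:
        problems.append("missing fields: " + ", ".join(sorted(missing)))
    if node.get("classification_status") not in ALLOWED_STATUS:
        problems.append("unknown classification_status")
    # Assumption: a required grouping should always carry its reason.
    if node.get("grouping_required") and not node.get("grouping_reason"):
        problems.append("grouping_required without grouping_reason")
    return problems


# Hypothetical payload for a node that already has a grouping.
example_node = {
    "classification_status": "classified",
    "classification_dimension": "doc_type",
    "label_ref": "facet:doc_type",
    "grouping_required": False,
    "grouping_reason": None,
    "max_ungrouped_threshold": 50,  # value delivered by the server from PG
    "suggested_next_grouping": None,
    "classification_workflow_trigger": None,
}
print(contract_problems(example_node))
```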
Governance & no-hardcode
- Labels live in `taxonomy`/`label_rules` (PG), are counted by pivots (`PIV-016` species, label pivots `PIVOT_MISSING` → propose `PIV-31x` label-by-facet), and are governed by Đ24 (provenance `FAC-PROV`, status lifecycle, `replaced_by`).
- No-hardcode test T-LABEL-1: every `label_ref` on a node resolves to a live `taxonomy`/`label_rules` row; any literal label array in `web/` ⇒ `system_issues('hardcode_violation')`.
- T-LABEL-2: `max_ungrouped_threshold` is read from PG per species, never a Nuxt constant; a hardcoded `50` in frontend code is itself a violation (the ceiling is data).
- T-LABEL-3: group counts are `pivot_count`/`pivot_query` values or `PIVOT_MISSING`.
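A rough sketch of how T-LABEL-1 and T-LABEL-3 might be checked against in-memory data follows. The helper names, the set of registered refs, and the `{"value", "source"}` shape for a group count are all assumptions for illustration. A real test would query `taxonomy`/`label_rules` and the pivot layer in PG.

```python
from typing import Any, Dict, Iterable, List, Set

PIVOT_MISSING = "PIVOT_MISSING"
PIVOT_SOURCES = {"pivot_count", "pivot_query"}


def unresolved_label_refs(label_refs: Iterable[str], registered: Set[str]) -> List[str]:
    """T-LABEL-1 sketch: label_refs that do not resolve to a registered row."""
    return sorted({ref for ref in label_refs if ref not in registered})


def group_count_ok(count: Dict[str, Any]) -> bool:
    """T-LABEL-3 sketch: a count is pivot-backed, or explicitly PIVOT_MISSING."""
    value = count.get("value")
    if value == PIVOT_MISSING:
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(value, int) and not isinstance(value, bool)
    return is_int and value >= 0 and count.get("source") in PIVOT_SOURCES


# Hypothetical registered refs and node data.
registered_refs = {"facet:doc_type", "facet:owner"}
node_refs = ["facet:doc_type", "facet:colour"]

print(unresolved_label_refs(node_refs, registered_refs))
print(group_count_ok({"value": 12, "source": "pivot_count"}))
print(group_count_ok({"value": 12, "source": "frontend_length"}))
```

Any ref returned by `unresolved_label_refs` would be a candidate for a `system_issues` row; a count that fails `group_count_ok` was computed outside the pivot layer.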
Acceptance impact (doc 10)
PASS only if the design treats 50 as a maximum safety ceiling (per-species, can be smaller), reuses existing grouping immediately, starts classification before lists become unmanageable, keeps labels PG-backed/governed/pivot-countable, and forbids frontend label arrays and a hardcoded threshold.