KB-FC65

04 — Auto-Label / Grouping Policy (50 = MAX Ungrouped Ceiling) + List Contract

Revision 1
Tags: architecture, auto-label, grouping, label-law, dieu-24, taxonomy, max-ungrouped-threshold, no-hardcode, 2026-05-31

date: 2026-05-31

04 — Auto-Label / Grouping Policy

Incorporates BOTH addenda, especially the threshold clarification: 50 is a MAXIMUM ungrouped display threshold, not a target. It is NOT "wait until 50, then classify."

Correct interpretation (binding)

  1. If a list already has classification/labels/groups → reuse the existing grouping immediately (do not wait).
  2. If a list has no classification and is becoming too long to inspect safely → classification must start immediately, before it becomes unmanageable.
  3. Default maximum ungrouped threshold = 50 rows, but some species/list types require a smaller threshold (max_ungrouped_threshold is per-species, configurable in PG).
  4. Goal: every displayed layer stays short, inspectable, scalable.
  5. Pagination is not enough. Pagination helps the UI; semantic grouping/labeling is required when a list is too long.
  6. Labels MUST be PG-backed, governed, pivot-countable, and themselves registered. No frontend hardcoded label arrays.
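
Point 3 can be illustrated with a small lookup. This is a minimal sketch, assuming the per-species values have already been read from PG into a plain mapping. The names `resolve_threshold`, `overrides`, `default_ceiling` and the species codes are hypothetical. Clamping an override that is larger than the default back down to the default is an assumption drawn from the "maximum" wording, not a rule stated in this document.

```python
from typing import Mapping, Optional


def resolve_threshold(
    species: str,
    overrides: Mapping[str, Optional[int]],
    default_ceiling: int,
) -> int:
    """Return the ungrouped-row ceiling for one species.

    `overrides` stands in for rows read from PG (hypothetical shape:
    species code -> threshold, or None when no override is set).
    `default_ceiling` is also expected to come from PG, not from code.
    """
    value = overrides.get(species)
    if value is None:
        # No per-species setting: fall back to the default ceiling.
        return default_ceiling
    if value < 1:
        raise ValueError(f"threshold for {species!r} must be positive, got {value}")
    # Assumption: the default is a maximum, so larger overrides are clamped.
    return min(value, default_ceiling)


# Hypothetical data standing in for a PG read.
overrides = {"workflow_step": 20, "legacy_wide": 200, "unset_species": None}
for code in ("workflow_step", "legacy_wide", "unset_species", "never_seen"):
    print(code, resolve_threshold(code, overrides, default_ceiling=50))
```

The default is passed in rather than declared as a constant so that the sketch stays in line with the rule that the ceiling is data.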

Live anchors (REUSE/EXTEND — Đ24 Label Law v1.3 already governs this)

  • label_rules(38): condition jsonb → result_label, priority, status, and crucially skip_wide_warning bool. A wide-list concept already exists in the rule engine. REUSE as the classification rule store.
  • taxonomy(58) / taxonomy_facets(10) / taxonomy_matrix(36): governed label tree (Đ24, cycle-check depth<5). taxonomy_facets.max_labels_per_entity + cardinality already model grouping cardinality. taxonomy_matrix(facet_id, composition_level, requirement) says which facet is required at which composition level → the natural source for child_grouping_dimension. REUSE.
  • entity_labels(718,744): applied entity↔label assignments (entity_code,label_code,rule_id). Labels are heavily used in practice — this is live, not theoretical. REUSE.
  • species_collection_map(164): discriminator_field/value/operator → grouping a collection by an intrinsic dimension. REUSE.
  • collection_groups(9) + entity_species(42, has parent_id+depth): existing grouping nodes. REUSE.

No new label store is needed. The work is: (a) carry the threshold/decision on the list contract, (b) ensure each grouping label resolves to a registered taxonomy/label_rules row, (c) make group counts pivot-backed.

Decision algorithm (server-side; Nuxt holds no thresholds)

classify_list(node):
    if has_existing_grouping(node):                       # taxonomy/label_rules/species_collection_map present
        classification_status := 'classified'
        return reuse_grouping(node)                        # immediately
    threshold := max_ungrouped_threshold(node.species)    # PG; default 50, may be smaller
    if node.children_count <= threshold:
        classification_status := 'inspectable'
        grouping_required := false
        return show_children_directly(node)
    # too long and unclassified: act as soon as the list exceeds its species ceiling, which may be below 50
    classification_status := 'classification_required'
    grouping_required := true
    grouping_reason := 'exceeds max_ungrouped_threshold(' || threshold || ')'
    dim := suggest_dimension(node)                         # from taxonomy_matrix / facets / species attrs
    if dim is null:
        classification_status := 'LABEL_MISSING'           # no governed dimension yet; propose label generation
        emit system_issues(issue_type='label_missing', ...)   # proposed issue type
    suggested_next_grouping := dim                         # stays null when no dimension could be suggested
    classification_workflow_trigger := 'label.classify'   # design; see doc 06
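
The decision flow above can be exercised as a small runnable sketch. It is illustrative only: `Node`, `Decision` and their fields are hypothetical in-memory stand-ins, the threshold is passed in as if it had already been read from PG, and the `system_issues` emission is left out. Following the status values listed in the list contract, a missing dimension is reported here as `classification_status = 'LABEL_MISSING'`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    species: str
    children_count: int
    existing_grouping: Optional[str] = None    # e.g. a facet code, if already grouped
    suggested_dimension: Optional[str] = None  # stand-in for suggest_dimension(node)


@dataclass
class Decision:
    classification_status: str
    grouping_required: bool
    classification_dimension: Optional[str] = None
    grouping_reason: Optional[str] = None
    suggested_next_grouping: Optional[str] = None
    classification_workflow_trigger: Optional[str] = None


def classify_list(node: Node, threshold: int) -> Decision:
    # 1. An existing grouping is reused immediately, whatever the list length.
    if node.existing_grouping is not None:
        return Decision(
            classification_status="classified",
            grouping_required=False,
            classification_dimension=node.existing_grouping,
        )
    # 2. At or under the species ceiling: show the children directly.
    if node.children_count <= threshold:
        return Decision(classification_status="inspectable", grouping_required=False)
    # 3. Over the ceiling with no grouping: classification is required now.
    status = "classification_required" if node.suggested_dimension else "LABEL_MISSING"
    return Decision(
        classification_status=status,
        grouping_required=True,
        grouping_reason=f"exceeds max_ungrouped_threshold({threshold})",
        suggested_next_grouping=node.suggested_dimension,
        classification_workflow_trigger="label.classify",
    )


print(classify_list(Node("species_a", 500, existing_grouping="facet_x"), 50))
print(classify_list(Node("species_b", 50), 50))
print(classify_list(Node("species_c", 21, suggested_dimension="facet_y"), 20))
print(classify_list(Node("species_d", 51), 50))
```

The third call shows the point of the per-species ceiling: a list of 21 rows already requires classification when its species threshold is 20.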

List/node contract additions (from threshold addendum)

  • classification_status (classified | inspectable | classification_required | LABEL_MISSING)
  • classification_dimension
  • label_ref (→ taxonomy.code / label_rules.id)
  • grouping_required
  • grouping_reason
  • max_ungrouped_threshold (per-species, default 50)
  • suggested_next_grouping
  • classification_workflow_trigger
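
As a rough illustration of how a consumer might sanity-check these fields, the sketch below validates a contract payload held as a plain dictionary. The function name `validate_list_contract` and the two cross-field rules (a reason is required when grouping is required; a `label_ref` is required when the status is `classified`) are assumptions made for the example, not rules stated in this document.

```python
from typing import Any, Dict, List

ALLOWED_STATUS = {"classified", "inspectable", "classification_required", "LABEL_MISSING"}


def validate_list_contract(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one list/node contract payload."""
    errors: List[str] = []

    status = payload.get("classification_status")
    if status not in ALLOWED_STATUS:
        errors.append(f"unknown classification_status: {status!r}")

    threshold = payload.get("max_ungrouped_threshold")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(threshold, int) or isinstance(threshold, bool) or threshold < 1:
        errors.append("max_ungrouped_threshold must be a positive integer")

    grouping_required = payload.get("grouping_required")
    if not isinstance(grouping_required, bool):
        errors.append("grouping_required must be a boolean")

    # Assumption: a required grouping always carries a reason.
    if grouping_required is True and not payload.get("grouping_reason"):
        errors.append("grouping_reason is required when grouping_required is true")

    # Assumption: a classified list names the label it was classified by.
    if status == "classified" and not payload.get("label_ref"):
        errors.append("label_ref is required when classification_status is 'classified'")

    return errors


good = {
    "classification_status": "classification_required",
    "grouping_required": True,
    "grouping_reason": "exceeds max_ungrouped_threshold(20)",
    "max_ungrouped_threshold": 20,
    "suggested_next_grouping": "facet_y",  # hypothetical facet code
    "classification_workflow_trigger": "label.classify",
}
bad = {"classification_status": "grouped", "grouping_required": "yes"}

print(validate_list_contract(good))
print(validate_list_contract(bad))
```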

Governance & no-hardcode

  • Labels live in taxonomy / label_rules (PG), are counted by pivots (PIV-016 covers species; label pivots are currently PIVOT_MISSING, so propose PIV-31x label-by-facet), and are governed by Đ24 (provenance FAC-PROV, status lifecycle, replaced_by).
  • No-hardcode test T-LABEL-1: every label_ref on a node resolves to a live taxonomy/label_rules row; any literal label array in web/ raises system_issues('hardcode_violation').
  • T-LABEL-2: max_ungrouped_threshold is read from PG per species, never a Nuxt constant; a hardcoded 50 in frontend code is itself a violation (the ceiling is data).
  • T-LABEL-3: group counts are pivot_count/pivot_query values or PIVOT_MISSING.
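
T-LABEL-1 and T-LABEL-3 could be approximated by checks like the ones below. This is a sketch under stated assumptions: the registered codes and ids are passed in as in-memory sets standing in for PG reads, all identifiers are treated as strings, and the function names and sample values are hypothetical. The second check only verifies the shape of each value; it cannot prove that a number actually came from a pivot.

```python
from typing import Dict, Iterable, List, Set, Union

PIVOT_MISSING = "PIVOT_MISSING"


def unresolved_label_refs(
    label_refs: Iterable[str],
    taxonomy_codes: Set[str],
    label_rule_ids: Set[str],
) -> List[str]:
    """T-LABEL-1 style check: refs that match no registered row."""
    registered = taxonomy_codes | label_rule_ids
    return sorted({ref for ref in label_refs if ref not in registered})


def invalid_group_counts(counts: Dict[str, Union[int, str]]) -> List[str]:
    """T-LABEL-3 style check: groups whose count is neither a
    non-negative integer nor the PIVOT_MISSING marker."""
    bad: List[str] = []
    for group, value in counts.items():
        # bool is a subclass of int, so it is rejected explicitly.
        is_count = isinstance(value, int) and not isinstance(value, bool) and value >= 0
        if not is_count and value != PIVOT_MISSING:
            bad.append(group)
    return sorted(bad)


# Hypothetical sample data.
print(unresolved_label_refs(["TAX-A", "RULE-7", "adhoc"], {"TAX-A"}, {"RULE-7"}))
print(invalid_group_counts({"a": 12, "b": PIVOT_MISSING, "c": "about 40", "d": -1}))
```

An empty list from both functions would mean the sample passes these two checks; T-LABEL-2 is not covered because it concerns frontend source code rather than data.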

Acceptance impact (doc 10)

PASS only if the design treats 50 as a maximum safety ceiling (per-species, can be smaller), reuses existing grouping immediately, starts classification before lists become unmanageable, keeps labels PG-backed/governed/pivot-countable, and forbids frontend label arrays and a hardcoded threshold.

knowledge/dev/reports/architecture/registries-pivot-os-agency-count-integrity-orphan-phantom-label-pin-rehearsal-2026-05-31/04-auto-label-grouping-policy.md