KB-7B84

S178-Fix30 Backfill Enrichment Stats

6 min read Revision 1
S178-Fix30H11bbackfillenrichmentproduction-query

S178-Fix30 Backfill Enrichment Stats

Date: 2026-04-23 Host: contabo DB: directus / Postgres container postgres Purpose: thống kê chính xác scope backfill enrichment sau khi loại operational species system_issue theo dot_config.h11b_exclude_species.

Q1. Tổng entity cần enrichment sau khi loại system_issues

SELECT count(*) AS enrichment_total
FROM v_entity_full_classification v
WHERE v.governance_role = 'governed'
  AND v.description IS NOT NULL
  AND btrim(v.description) <> ''
  AND (v.description_provenance = 'PROV-DOT' OR v.description_provenance IS NULL)
  AND v.species_code NOT IN (
    SELECT jsonb_array_elements_text(COALESCE(
      (SELECT value::jsonb FROM dot_config WHERE key = 'h11b_exclude_species'),
      '[]'::jsonb
    ))
  );

Output nguyên văn:

 enrichment_total 
------------------
              364
(1 row)

Q2. Phân bổ theo table + species + composition_level

SELECT v.source_table, v.species_code, v.composition_level, count(*) AS cnt
FROM v_entity_full_classification v
WHERE v.governance_role = 'governed'
  AND v.description IS NOT NULL
  AND btrim(v.description) <> ''
  AND (v.description_provenance = 'PROV-DOT' OR v.description_provenance IS NULL)
  AND v.species_code NOT IN (
    SELECT jsonb_array_elements_text(COALESCE(
      (SELECT value::jsonb FROM dot_config WHERE key = 'h11b_exclude_species'),
      '[]'::jsonb
    ))
  )
GROUP BY v.source_table, v.species_code, v.composition_level
ORDER BY cnt DESC;

Output nguyên văn:

    source_table     | species_code  | composition_level | cnt 
---------------------+---------------+-------------------+-----
 collection_registry | collection    | molecule          | 147
 meta_catalog        | catalog       | atom              | 130
 ui_pages            | page          | molecule          |  37
 entity_species      | species       | meta              |  23
 dot_tools           | dot_tool      | atom              |  10
 workflow_steps      | workflow_step | molecule          |   8
 agents              | agent         | atom              |   6
 tasks               | task          | compound          |   3
(8 rows)

Q3. 7 rows species_code NULL

SELECT v.entity_code, v.source_table, v.name, 
       v.description_provenance, left(v.description, 80) AS desc_preview
FROM v_entity_full_classification v
WHERE v.governance_role = 'governed'
  AND v.description IS NOT NULL
  AND btrim(v.description) <> ''
  AND (v.description_provenance = 'PROV-DOT' OR v.description_provenance IS NULL)
  AND v.species_code IS NULL;

Output nguyên văn:

          entity_code          |  source_table  |           name            | description_provenance |                    desc_preview                     
-------------------------------+----------------+---------------------------+------------------------+-----------------------------------------------------
 tbl_registry_dot_tools        | table_registry | DOT Tools                 | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_ui_pages         | table_registry | Pages/Routes              | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_collections      | table_registry | Collections               | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_agents           | table_registry | Agents                    | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_modules          | table_registry | Modules                   | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_checkpoint_types | table_registry | Checkpoint Types Registry | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
 tbl_registry_checkpoint_sets  | table_registry | Checkpoint Sets Registry  | PROV-DOT               | [published] Phân tử (chưa có) trong table_registry.
(7 rows)

Q4. Executor đã chạy sau fix chưa? H11b findings mới nhất?

SELECT key, value FROM dot_config WHERE key = 'hc_executor_last_run';
SELECT source, status, count(*) FROM system_issues 
WHERE source = 'H11b' GROUP BY source, status ORDER BY source, status;

Output nguyên văn:

         key          |           value           
----------------------+---------------------------
 hc_executor_last_run | 2026-04-23T04:01:05+00:00
(1 row)

 source |  status  | count 
--------+----------+-------
 H11b   | archived |     1
 H11b   | open     |     5
(2 rows)

Nhận xét ngắn

  • Scope enrichment hiện tại: 364 entity.
  • Batch lớn nhất: collection_registry 147, meta_catalog 130, ui_pages 37.
  • 7 row species_code NULL là các table_registry legacy; do query H11b mới dùng v.species_code NOT IN (...), các row NULL không nằm trong enrichment_total.
  • Executor heartbeat gần nhất có sau H11b scope fix (2026-04-23T04:01:05+00:00), nhưng system_issues vẫn còn 5 issue H11b open tại thời điểm query.