KB-661A rev 2

P3D Pack 1 Phase 5C1A — Read-Only Species Schema + Parent Probe (rev2)

11 min read Revision 2
p3dpack1phase5c1aproberev2readonlyintrospection-firstspecies-schema

P3D Pack 1 Phase 5C1A — Read-Only Species Schema + Parent Probe (rev2)

Mode: READ-ONLY — zero writes, zero mutation Rev2: introspection-first; no assumed column names in queries Purpose: Discover live schema, constraints, taxonomy, and collision evidence for 5C1 species identity decision DB target: VPS PostgreSQL (postgres container, db=directus, schema=public) Report to: knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c1a-species-schema-parent-probe-report.md


HARD BOUNDARIES

No INSERT. No UPDATE. No DELETE. No DDL. No seed. No backfill. No migration. No function/trigger patch. No nested agent dispatch. No species decision locked by Agent. All candidate rows labelled candidate_not_approved.


EXECUTION ORDER

This probe runs in 3 phases. Phase 2 depends on Phase 1 results. Phase 3 depends on Phase 2 results. Do NOT skip phases or assume column names before resolution.

PHASE 1 — Schema introspection (information_schema only, no assumed column names)
PHASE 2 — Concept→column resolution (build field map from Phase 1 results)
PHASE 3 — Evidence queries (using ONLY resolved column names from Phase 2)

PHASE 1 — Schema introspection

Use ONLY information_schema queries. No table-specific column names assumed.

Q1 — entity_species full schema

SELECT column_name, data_type, is_nullable, column_default, character_maximum_length
FROM information_schema.columns
WHERE table_schema='public' AND table_name='entity_species'
ORDER BY ordinal_position;

Report: full result as table.

Q2 — entity_species constraints

SELECT tc.constraint_name, tc.constraint_type, kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name AND tc.table_schema = kcu.table_schema
WHERE tc.table_schema='public' AND tc.table_name='entity_species'
ORDER BY tc.constraint_type, kcu.column_name;

Report: full result. Note which columns have UNIQUE or PRIMARY KEY.

Q3 — species_collection_map full schema

SELECT column_name, data_type, is_nullable, column_default, character_maximum_length
FROM information_schema.columns
WHERE table_schema='public' AND table_name='species_collection_map'
ORDER BY ordinal_position;

Q4 — species_collection_map constraints

Same pattern as Q2 for species_collection_map.

Q5 — birth_registry full schema

SELECT column_name, data_type, is_nullable, column_default, character_maximum_length
FROM information_schema.columns
WHERE table_schema='public' AND table_name='birth_registry'
ORDER BY ordinal_position;

PHASE 2 — Concept→column resolution

From Phase 1 results, resolve each concept below to an actual column name. Use ONLY the column_name values returned by Q1/Q3/Q5.

Resolution rules

For each concept, a candidate list is provided.
Match candidates against actual column_names from Phase 1.
  0 matches → FIELD_ABSENT
  1 match   → RESOLVED (use that column)
  >1 match  → AMBIGUOUS_FIELD (report all matches; do not pick one)

entity_species concepts

Concept Candidate column names Resolution
species_pk id, uuid, species_id
species_identifier species_code, species_name, identifier
species_entity_code code, entity_code, prefix_code
species_display_label display_name, name, label, title
species_composition composition_level, composition, level
species_management management_mode, management, mode
species_status status, is_active, state
species_depth depth, level, tree_depth
species_parent_ref parent_id, parent, parent_ref, parent_species_id
species_prefix prefix, short_code, abbreviation
species_kg_metadata kg_metadata, metadata, kg_meta
species_dot_origin _dot_origin, dot_origin, origin

species_collection_map concepts

Concept Candidate column names Resolution
mapping_pk id, uuid, map_id
mapping_collection_key collection_name, collection, table_name
mapping_species_identifier species_code, species_name, species_id
mapping_primary_flag is_primary, primary, is_default
mapping_disc_field discriminator_field, disc_field
mapping_disc_value discriminator_value, disc_value
mapping_disc_operator discriminator_operator, disc_operator
mapping_disc_config discriminator_config, disc_config

birth_registry concepts

Concept Candidate column names Resolution
birth_pk id, uuid, birth_id
birth_collection_key collection_name, collection, table_name
birth_species_identifier species_code, species_name, species
birth_composition composition_level, composition
birth_entity_key entity_code, entity_id, entity_key

Report Phase 2 results

For each concept:
  concept | resolved_column | status (RESOLVED / FIELD_ABSENT / AMBIGUOUS_FIELD)

If any concept needed by Phase 3 queries is FIELD_ABSENT or AMBIGUOUS_FIELD → report that dependent query result as UNKNOWN. Do NOT guess a column.


PHASE 3 — Evidence queries (using ONLY resolved column names)

All queries below use <concept> placeholders. Replace each with the Phase 2 resolved column name. If the concept was not RESOLVED → skip the query and report UNKNOWN.

Q6 — Collision check

SELECT <species_pk>, <species_identifier>, <species_entity_code>, <species_display_label>,
       <species_composition>, <species_management>, <species_status>
FROM entity_species
WHERE <species_identifier> = 'information_unit_atom'
   OR <species_entity_code> = 'SPE-IUA';

Report: row count + data. Expected 0. If >0 → collision exists.

Q7 — _dot_origin column policy

From Q1 Phase 1 results, check if species_dot_origin concept resolved. If RESOLVED:

  • Report: column name, data_type, is_nullable, column_default.

If FIELD_ABSENT:

  • Report: _dot_origin ABSENT — not a required field for INSERT.

Q8 — kg_metadata column policy

Same pattern as Q7 for species_kg_metadata concept.

Q9 — Taxonomy depth + parent distribution

If species_depth and species_parent_ref both RESOLVED:

-- Depth distribution
SELECT <species_depth>, count(*) AS species_count
FROM entity_species
GROUP BY <species_depth>
ORDER BY <species_depth>;

-- Full taxonomy dump with parent resolution
SELECT es.<species_pk>, es.<species_identifier>, es.<species_entity_code>,
       es.<species_display_label>, es.<species_depth>,
       es.<species_parent_ref>,
       p.<species_identifier> AS parent_species_identifier,
       p.<species_display_label> AS parent_display_label
FROM entity_species es
LEFT JOIN entity_species p ON es.<species_parent_ref> = p.<species_pk>
ORDER BY es.<species_depth>, es.<species_identifier>;

Report: full dump as table.

If either concept not RESOLVED → report UNKNOWN for taxonomy.

Q10 — Parent candidates

From Q9 results, list all depth-0 species as parent candidates. For each candidate, include: pk, identifier, entity_code, display_label, composition, management.

Label ALL as candidate_not_approved. Agent does NOT recommend a parent. Agent does NOT filter by "semantically covers" or "similar" — list all depth-0 rows.

If Q9 is UNKNOWN → report UNKNOWN.

Q11 — Existing mapping for information_unit

If mapping_collection_key RESOLVED:

SELECT * FROM species_collection_map
WHERE <mapping_collection_key> = 'information_unit';

Report: row count + data. Expected 0.

If concept not RESOLVED → UNKNOWN.

Q12 — birth_registry NULL species count

If birth_collection_key and birth_species_identifier both RESOLVED:

SELECT count(*) AS null_species_count
FROM birth_registry
WHERE <birth_collection_key> = 'information_unit'
  AND <birth_species_identifier> IS NULL;

Report: count. Phase 5A evidence was 12 — verify live.

If concept not RESOLVED → UNKNOWN.


BONUS — Live fill policy matrix

From Q1 Phase 1 results, build table:

column_name | data_type | is_nullable | has_default | fill_policy_category

Where fill_policy_category:
  DB_AUTO       → PK with serial/uuid default
  DB_DEFAULT    → non-null default exists
  NULLABLE_SKIP → is_nullable='YES', no default required
  NEEDS_VALUE   → is_nullable='NO', no default → must be in 5C1 fill matrix

Same for species_collection_map (from Q3).


REPORT FORMAT

Save to: knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c1a-species-schema-parent-probe-report.md

Required sections:

  • A. Phase 1 raw schema results (Q1–Q5)
  • B. Phase 2 concept→column resolution table (all concepts, RESOLVED/ABSENT/AMBIGUOUS)
  • C. Phase 3 evidence (Q6–Q12, using resolved names or UNKNOWN)
  • D. Bonus fill policy matrix
  • E. Status block

Final response status block:

phase5c1a_probe_status=PASS|PARTIAL|BLOCKED
report_path=knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase5c1a-species-schema-parent-probe-report.md
no_mutation_performed=true
entity_species_schema_discovered=true|false
mapping_schema_discovered=true|false
birth_registry_schema_discovered=true|false
phase2_resolution_complete=true|false
phase2_field_absent_count=<N>
phase2_ambiguous_count=<N>
proposed_species_collision=false|true|UNKNOWN
kg_metadata_policy=nullable|defaulted|required|absent|UNKNOWN
dot_origin_policy=nullable|defaulted|required|absent|UNKNOWN
parent_candidates_count=<N|UNKNOWN>
info_unit_mapping_exists=true|false|UNKNOWN
info_unit_null_species_count=<N|UNKNOWN>
fill_policy_matrix_complete=true|false
next_recommended_action=GPT/User locks species identity + parent from probe evidence then Opus writes 5C1 rev1

Phase 5C1A Probe rev2 | Introspection-first | No assumed columns | 3-phase: schema→resolve→query | 2026-05-11

Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/prompts/p3d-pack1-phase5c1a-readonly-species-schema-parent-probe-prompt.md