P3D Pack 1 Phase 4C — Read-Only/Dry-Run Species Mapping + QT-001 Mission Prompt (rev3)
P3D Pack 1 Phase 4C — Read-Only/Dry-Run Species Mapping + QT-001 Mission Prompt (rev3)
Date: 2026-05-11 | Mode: READ-ONLY / DRY-RUN — no writes, no seeds, no backfill Style: MISSION PROMPT — scope + preflight + goals + deterministic criteria + scale guards Legal alignment: Điều 0-B, Điều 0-G, Điều 29, QT-001, QT-005 Reference snapshots: Phase 4B discovery report (labels only, re-verify live before use)
0. SCOPE BLOCK (declared once — Pack 1 only)
These are the ONLY literal identifiers in this prompt. They are scope definitions from approved Pack 1 design artifacts:
target_collection_primary = 'information_unit'
target_collection_subordinate = 'unit_version'
target_unit_kind_committed = 'law_unit'
discriminator_candidate_column = 'unit_kind'
target_vocab_key = 'vocab.unit_kind.law_unit'
No other literal decision values appear as selectors or decision criteria in this prompt. Semantic field labels such as species_code, composition_level, management_mode, and governance_role may appear only as candidate field names to verify via introspection; if absent, report FIELD_ABSENT and do not substitute from memory. All actual values MUST be derived from live PG at execution time.
1. GATE-0 — Scope preflight (STOP if any fails)
Before any goal, verify that scope constants are valid against live PG. If any check fails → STOP, report which check failed, do not proceed.
| Check | What to verify | Source |
|---|---|---|
| P1 | Table target_collection_primary exists in PG |
pg_class WHERE relname = <target> AND relkind = 'r' |
| P2 | Table target_collection_subordinate exists in PG |
same |
| P3 | collection_registry has a row for target_collection_primary |
SELECT count(*) FROM collection_registry WHERE collection_name = <target> |
| P4 | collection_registry has a row for target_collection_subordinate |
same |
| P5 | Column discriminator_candidate_column exists on target_collection_primary |
information_schema.columns WHERE table_name = <target> AND column_name = <disc_col> |
| P6 | dot_config has key matching target_vocab_key |
SELECT count(*) FROM dot_config WHERE key = <vocab_key> |
All 6 must pass. Report each result. If P5 fails → discriminator strategy collapses (non-fatal for other goals, but mark discriminator goals as BLOCKED).
2. AUTHORITATIVE SOURCES
| Class of value | Live source |
|---|---|
| Species properties (any column) | public.entity_species — introspect columns first, use SELECT * for data |
| Collection→species mappings | public.species_collection_map — introspect columns first |
| Birth function logic | pg_get_functiondef for fn_birth_registry_auto |
| Birth registry state | public.birth_registry — introspect columns first |
| Collection governance properties | public.collection_registry — introspect columns first |
| Composition allowed values | SELECT DISTINCT composition_level FROM entity_species (live, not listed) |
| Management mode values | SELECT DISTINCT management_mode FROM entity_species (live) |
| Governance role values | SELECT DISTINCT governance_role FROM collection_registry (live) |
| Birth code strategy values | SELECT DISTINCT birth_code_strategy FROM collection_registry (live) |
Column name rule: Before querying any table with specific column names, introspect that table's columns via information_schema.columns. If a column is absent → skip any query depending on it + report FIELD_ABSENT:<table>.<column>.
3. GOALS (G1–G10)
G1 — Discriminator usage in production
- Source:
species_collection_mapintrospected columns. - Selector: Introspect column names first. Then for each of the discriminator-candidate columns (
discriminator_field,discriminator_value,discriminator_config— verify existence before querying), count rows where that column IS NOT NULL. - Skip/STOP: If discriminator columns don't exist in the table → report
FIELD_ABSENTfor each, concludediscriminator_schema_support=false. - Output: Per discriminator column: exists (yes/no), non-NULL count, first 5 sample rows with non-NULL values. If all 0 →
discriminator_production_active=false.
G2 — fn_birth_registry_auto discriminator handling
- Source:
pg_get_functiondeffor functionfn_birth_registry_autoinpublicschema. - Selector: Search function source text for any of the discriminator column names found in G1 (or the candidate names if G1 found FIELD_ABSENT). Also search for references to
NEW.<discriminator_candidate_column>or similar record-field access. - Skip/STOP: If function not found → STOP, report blocker.
- Output: Function source excerpt (relevant lines) + conclusion:
uses_discriminators=true|false+reads_record_fields_beyond_code=true|false.
G3 — Complete species landscape
- Source:
entity_speciesintrospected columns. - Selector:
SELECT * FROM entity_species ORDER BY 1(all rows, all columns — NO assumed column names). - Output:
- Full entity_species dump (all columns as they exist).
- Semantic field presence map: For each column present, classify as
IDENTITY(id/code/name),CLASSIFICATION(species_code/composition/management),HIERARCHY(parent/depth),OTHER. Agent derives this from column names + data types — does NOT assume a fixed field list. - Total count (snapshot label).
- Aggregation: for each column that appears to be a categorical classification field (non-unique, non-numeric), show distinct values + counts. Agent determines which columns are categorical from the data, not from prompt instruction.
G4 — Collection-registry context for target collections
- Source:
collection_registryintrospected columns. - Selector:
WHERE collection_name IN (target_collection_primary, target_collection_subordinate). - Output: Full rows verbatim (
SELECT *). Agent highlights any column that is NULL or absent.
G5 — Birth-registry state for target collections
- Source:
birth_registryintrospected columns. - Selector:
WHERE collection_name IN (target_collection_primary, target_collection_subordinate). - Output: Count per target. Count of rows where species_code column (if exists) IS NULL. First 20 rows verbatim.
G6 — Discriminator column candidate verification on target
- Source:
information_schema.columns+ live data fromtarget_collection_primary. - Selector: Confirm column
discriminator_candidate_columnexists ontarget_collection_primary(already checked in GATE-0 P5). Then query distinct values of that column with counts. - Skip/STOP: If P5 failed → skip G6, mark discriminator strategy as BLOCKED.
- Output: Distinct values + counts from the discriminator candidate column. These are the population of possible discriminator_value entries for future species_collection_map rows.
G7 — Collections sharing exact birth strategy with subordinate target
- Source:
collection_registry. - Selector:
- Query the exact
birth_code_strategyvalue oftarget_collection_subordinatefromcollection_registry. Call thisv_sub_strategy. - Find ALL other collections with the same exact
birth_code_strategy = v_sub_strategy. - For each of those collections, check if they have a
species_collection_maprow.
- Query the exact
- No fuzzy matching. No "subordinate-like." Only exact value match.
- Skip/STOP: If
birth_code_strategycolumn doesn't exist oncollection_registry→ FIELD_ABSENT, skip. - Output:
v_sub_strategyvalue + count of collections with that exact strategy + count of those with species mapping + first 10 sample rows.
G8 — Dry-run: candidate species options with deterministic labels
For EACH species from G3, compute compatibility with target_collection_primary:
- Source: G3 species data + G4 collection_registry data.
- Compatibility criteria (deterministic):
| Label | Condition |
|---|---|
PLAUSIBLE |
Species has a management_mode column value that matches the governance_role value of target_collection_primary (both from live queries). Both values exist and are equal or compatible per Điều 29 (observed↔observed, governed↔governed). |
CONFLICT |
Both values exist but DON'T match (e.g., species management_mode=governed, target governance_role=observed). |
UNKNOWN |
Either the management_mode column or the governance_role value is NULL, absent, or not found. |
N/A |
Species or collection_registry row is missing required field for comparison — report FIELD_ABSENT. |
- Output: Per species: all columns from G3 + proposed_composition (from the species's composition_level field, if field exists) + compatibility_label (PLAUSIBLE/CONFLICT/UNKNOWN/N/A) + compatibility_reason.
Label EVERY row: candidate_not_approved.
G9 — QT-001 backfill dry-run per PLAUSIBLE candidate
For each species labelled PLAUSIBLE in G8:
- Source:
birth_registryrows fortarget_collection_primarywhere species_code IS NULL (from G5). - Computation: For each such row, project:
entity_code, current_species_code (NULL), proposed_species_code, current_composition_level, proposed_composition_level. - Proposed values come from: the candidate species's species_code + entity_species.composition_level for that species.
- If no PLAUSIBLE candidates exist: Report "0 PLAUSIBLE candidates. QT-001 dry-run skipped." Also dry-run for the first 3 CONFLICT candidates with explicit note:
conflict_dryrun_for_information_only. - Output: Per PLAUSIBLE candidate: count of rows that would be updated + first 5 rows showing current→proposed.
Label EVERY result: candidate_not_approved.
G10 — TAC unit_kind distribution snapshot
- Source:
tac_logical_unitintrospected columns. - Selector: Introspect whether
tac_logical_unithas a column matchingdiscriminator_candidate_column(i.e.,unit_kind). If not, try common alternatives (introspect all columns, look for categorical text columns with few distinct values). - Skip/STOP: If no suitable column found → report
FIELD_ABSENT, skip. This means TAC data cannot inform discriminator strategy. - Output: Distinct values + counts of the discovered column. Flag which value matches
target_unit_kind_committed.
4. ACCEPTANCE CRITERIA (PASS/FAIL)
| # | Criterion | PASS condition |
|---|---|---|
| 1 | GATE-0 all 6 preflight checks reported | Each check shows PASS or FAIL with evidence |
| 2 | Goals G1-G10 all attempted | Each reports evidence, FIELD_ABSENT skip, or explicit BLOCKED reason |
| 3 | Zero writes | Agent confirms no INSERT/UPDATE/DELETE/DDL/function-patch |
| 4 | No species/field names from memory | All names in output come from live PG queries |
| 5 | Column introspection before schema-sensitive queries | Every table introspected; FIELD_ABSENT reported if column missing |
| 6 | Phase 4B snapshots re-verified | Agent re-queries any Phase 4B finding before using as fact |
| 7 | All candidate outputs labelled | candidate_not_approved on every dry-run row |
| 8 | G8 uses deterministic labels | Every species gets exactly one of: PLAUSIBLE / CONFLICT / UNKNOWN / N/A |
| 9 | G7 uses exact strategy match | No "subordinate-like" — only exact birth_code_strategy value |
| 10 | Scale brittleness self-audit passed | §6 below |
5. CONSTRAINTS
- No writes. Read-only. Dry-run only.
- No literal decision values beyond §0 scope block. Semantic column labels may be used only as candidate fields after introspection. All actual values derive live.
- No assumed column existence. Introspect first.
SELECT *for data dumps.FIELD_ABSENTfor missing columns. - No fuzzy selectors. No "-like", "similar to", "equivalent" language for matching. Exact values only.
- No subjective plausibility. Use deterministic labels (§G8 criteria table).
- Re-verify snapshots. Phase 4B labels are starting points only.
- Mission-style. Agent writes its own queries to satisfy goals and constraints.
6. SCALE BRITTLENESS SELF-AUDIT
Before submitting report, agent confirms:
| Check | Required answer |
|---|---|
| Did I list any species/composition/role value from memory? | No — all from live queries. |
| Did I list any column name as "required" without verifying it exists? | No — introspected or SELECT *. |
| Did I use any "-like" or semantic fuzzy match for selectors? | No — exact values only. |
| Did I assume fixed counts? | No — all counts are live snapshots. |
| Did I assume fixed field lists for output shapes? | No — used SELECT * or introspected. |
| Would my queries adapt if a column was renamed? | Yes — I'd report FIELD_ABSENT and skip. |
| Would my queries adapt if 50 new species were added? | Yes — G3 fetches all with SELECT *. |
| Did I report Phase 4B labels as facts without re-verification? | No. |
| Did I apply subjective judgment in G8/G9 labels? | No — used deterministic criteria only. |
7. REPORT REQUIREMENTS
Save to: knowledge/dev/laws/dieu44-trien-khai/reports/p3d-pack1-phase4c-species-mapping-dryrun-report.md
Must include:
- RUN_MARKER.
- GATE-0 preflight results (6 checks).
- SCOPE BLOCK echoed from §0.
- G1-G10 results with evidence, labels, FIELD_ABSENT notes.
- Column introspection results per table (what columns exist, what was FIELD_ABSENT).
- Scale brittleness self-audit (§6 answers).
- Phase 4B re-verification: which findings still hold, which differ.
- Status flags:
phase4c_dryrun_status=PASS|PARTIAL|BLOCKED
no_mutation_performed=true
gate0_all_passed=true|false
discriminator_runtime_active=true|false
discriminator_column_exists_on_target=true|false
species_decision_locked=false
composition_decision_locked=false
plausible_candidate_count=<N>
conflict_candidate_count=<N>
unknown_candidate_count=<N>
scale_brittleness_self_audit_passed=true|false
8. OUT OF SCOPE
- Do not propose function patches.
- Do not seed species_collection_map.
- Do not backfill birth_registry.
- Do not dispatch nested agents.
- Do not use fuzzy ILIKE for selector logic.
- Do not list field names from memory — introspect live.
- Do not apply subjective labels — use deterministic criteria.
Phase 4C Mission Prompt rev3 | GATE-0 preflight | Field-adaptive SELECT * | Deterministic labels | No fuzzy | No field hardcode | 2026-05-11