P3D Pack 1 Phase 5C1 — Species Identity Decision + rev1 Prep Report
Date: 2026-05-11 | Author: Opus 4.7 | Mode: DECISION MEMO + 5C1 PROMPT REV1 PREP ONLY (no execution)
1. Status flags
phase5c1_prep_status = PASS
mode = DECISION_MEMO_AND_PROMPT_REV1_PREP_ONLY
species_exact_identity_locked = false
prompt_dispatch_allowed = false
agent_dispatch_allowed = false
seed_allowed = false
backfill_allowed = false
migration_allowed = false
required_column_fill_matrix_added = true
pre_update_capture_added = true
taxonomy_parent_hard_stop_added = true
field_unresolved_stop_added = true
agent_cannot_choose_species_values = true
requires_GPT_User_decision = true
2. Artifact paths
| Artifact | Path |
|---|---|
| Species identity decision memo | design/p3d-pack1-phase5c1-species-identity-decision-memo.md |
| 5C1 prompt (patched to rev1-prep) | prompts/p3d-pack1-phase5c1-species-mapping-qt001-backfill-prompt-DRAFT.md (revision 2) |
| This report | reports/p3d-pack1-phase5c1-species-identity-and-rev1-prep-report.md |
3. Top 7 changes/decisions
Change 1: Species identity decision memo created
The memo presents three naming options (A/B/C) and recommends Option A (information_unit_atom / SPE-IUA / Đơn vị Thông tin). It lists all 11+ fields requiring a GPT/User decision and includes a taxonomy parent strategy (F.1 live lookup / F.2 new root / F.3 GPT prescribes), recommending F.3.
Change 2: Fill policy matrix added to 5C1 prompt
Every NOT NULL column in entity_species mapped to a fill source: PROVIDED_BY_GPT, DB_AUTO, DB_DEFAULT, LIVE_LOOKUP, or PROVIDED_BY_GPT_OR_NULL. Rule: if Agent encounters a NOT NULL column without default that is NOT in the matrix → FIELD_UNRESOLVED_STOP (hard abort). Agent cannot invent values.
The same kind of matrix is provided for species_collection_map.
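The rule above can be sketched as a pre-write check. This is a minimal illustration, not the prompt's actual procedure: the `Column` record, the `check_fill_matrix` helper, and the example column names are hypothetical, and only the five fill-source labels and the FIELD_UNRESOLVED_STOP name come from this report.

```python
from dataclasses import dataclass

# Fill sources named in this report.
FILL_SOURCES = {
    "PROVIDED_BY_GPT",
    "DB_AUTO",
    "DB_DEFAULT",
    "LIVE_LOOKUP",
    "PROVIDED_BY_GPT_OR_NULL",
}


class FieldUnresolvedStop(Exception):
    """Hard abort: a required column has no entry in the fill matrix."""


@dataclass(frozen=True)
class Column:
    # Hypothetical shape of one introspected column.
    name: str
    not_null: bool
    has_default: bool


def check_fill_matrix(columns, fill_matrix):
    """Return the fill plan, or raise before any write if a column is unmapped."""
    for source in fill_matrix.values():
        if source not in FILL_SOURCES:
            raise ValueError(f"unknown fill source: {source}")
    # A NOT NULL column without a default must appear in the matrix.
    unresolved = [
        c.name
        for c in columns
        if c.not_null and not c.has_default and c.name not in fill_matrix
    ]
    if unresolved:
        raise FieldUnresolvedStop(f"FIELD_UNRESOLVED_STOP: {unresolved}")
    return {c.name: fill_matrix[c.name] for c in columns if c.name in fill_matrix}
```

The check runs over introspected metadata only, so it can abort before any statement touches the table.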
Change 3: FIELD_UNRESOLVED_STOP as hard STOP mechanism
Added to §4.2 validation procedure: any unexpected required column triggers ABORT before any write. This is the core safety mechanism preventing Agent improvisation.
Change 4: SELECT-before-UPDATE for QT-001 backfill
Step 7 restructured per GPT directive:
- SELECT target keys → captured_backfill_targets
- Persist captured targets to KB + VPS log BEFORE mutation
- UPDATE WHERE PK IN captured list
- Cross-check via re-SELECT (RETURNING is verification, NOT primary capture)
- Verify 0 NULL remaining
This ensures backfill targets are known and persisted before any row is modified.
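The five steps above can be sketched with Python's standard `sqlite3` module. This is an illustration under stated assumptions, not the Step 7 implementation: the production database engine is not named in this report, the `backfill_with_capture` helper and its parameters are hypothetical, and a local JSON file stands in for the KB + VPS log.

```python
import json
import sqlite3


def backfill_with_capture(conn, table, pk_col, target_col, value, log_path):
    """Capture target keys, persist them, then update only the captured rows.

    Identifiers are interpolated into SQL, so they must come from trusted
    introspection, never from user input.
    """
    cur = conn.cursor()

    # 1. SELECT target keys before any mutation.
    cur.execute(
        f"SELECT {pk_col} FROM {table} WHERE {target_col} IS NULL ORDER BY {pk_col}"
    )
    captured = [row[0] for row in cur.fetchall()]

    # 2. Persist the captured targets (stand-in for the KB + VPS log).
    with open(log_path, "w", encoding="utf-8") as fh:
        json.dump({"captured_backfill_targets": captured}, fh)

    if not captured:
        return captured

    # 3. UPDATE only rows whose PK is in the captured list.
    marks = ",".join("?" for _ in captured)
    cur.execute(
        f"UPDATE {table} SET {target_col} = ? WHERE {pk_col} IN ({marks})",
        [value, *captured],
    )

    # 4. Cross-check via re-SELECT of the captured keys.
    cur.execute(
        f"SELECT {pk_col} FROM {table} "
        f"WHERE {pk_col} IN ({marks}) AND {target_col} = ? ORDER BY {pk_col}",
        [*captured, value],
    )
    updated = [row[0] for row in cur.fetchall()]

    # 5. Verify that no NULL remains.
    cur.execute(f"SELECT COUNT(*) FROM {table} WHERE {target_col} IS NULL")
    remaining = cur.fetchone()[0]

    if updated != captured or remaining != 0:
        conn.rollback()
        raise RuntimeError("backfill verification failed; rolled back")
    conn.commit()
    return captured
```

Because the key list is written out before the UPDATE runs, the set of touched rows can be reconstructed even if the process dies mid-transaction.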
Change 5: Taxonomy parent hard STOP
§3.4: if GPT-prescribed parent species_code doesn't exist → ABORT. If ambiguous (>1 match) → ABORT. Agent does NOT pick an alternative parent.
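A minimal sketch of that exactly-one-match rule, assuming the live rows have already been fetched into dictionaries keyed by `species_code`. The `resolve_parent` helper and the `TaxonomyParentStop` exception name are hypothetical.

```python
class TaxonomyParentStop(Exception):
    """Hard abort: the prescribed parent is missing or ambiguous."""


def resolve_parent(prescribed_code, existing_rows):
    """Return the single row matching the GPT-prescribed code, or abort.

    No fallback is attempted: zero matches and multiple matches both raise.
    """
    matches = [row for row in existing_rows if row["species_code"] == prescribed_code]
    if not matches:
        raise TaxonomyParentStop(f"ABORT: parent {prescribed_code!r} not found")
    if len(matches) > 1:
        raise TaxonomyParentStop(
            f"ABORT: parent {prescribed_code!r} is ambiguous ({len(matches)} matches)"
        )
    return matches[0]
```

The function deliberately has no "closest match" branch, which mirrors the rule that the Agent does not pick an alternative parent.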
Change 6: Agent cannot choose species values (explicit in §1)
Added CRITICAL statement: "Agent MUST NOT choose species name, code, prefix, parent, or any GPT-decided value." Plus G0-10: verify all §9 placeholders are resolved before any write.
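A G0-10 style gate could look roughly like the following. It is a sketch only: it assumes the §9 values are available as a dictionary and that unresolved entries still carry the literal `<DECIDED_BY_GPT>` token; the function names are hypothetical.

```python
import re

# Matches the placeholder token used in this report, with an optional suffix.
PLACEHOLDER = re.compile(r"<DECIDED_BY_GPT[^>]*>")


def unresolved_placeholders(values):
    """Return the sorted keys whose value still contains a placeholder token."""
    return sorted(
        key
        for key, value in values.items()
        if isinstance(value, str) and PLACEHOLDER.search(value)
    )


def gate_g0_10(values):
    """Raise before any write if any placeholder is still unresolved."""
    missing = unresolved_placeholders(values)
    if missing:
        raise RuntimeError(f"G0-10 failed, unresolved placeholders: {missing}")
```

A `None` value is not treated as unresolved here, since the fill matrix allows PROVIDED_BY_GPT_OR_NULL.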
Change 7: species_code vs entity_code clarified
Decision memo §C clarifies from Phase 5A evidence: code column = entity prefix (SPE-XXX pattern), species_code column = semantic name (lowercase_underscore pattern). Fill matrix maps both explicitly.
4. Hardcode / hidden-hardcode audit
Strict review of the rev1-prep prompt:
| Potential issue | Found? | Verdict |
|---|---|---|
| Literal column name in executable SQL | NO: all use <concept col> or <PK col> | CLEAN |
| Hardcoded count as gate | NO: all counts live; Phase 5A "12" appears only as reference | CLEAN |
| Species values hardcoded in executable form | NO: all <DECIDED_BY_GPT> | CLEAN |
| Fill policy lets Agent fill unknown columns | NO: FIELD_UNRESOLVED_STOP for any unmapped NOT NULL | CLEAN |
| Agent can pick taxonomy parent | NO: §3.4 requires GPT-prescribed code, ABORT if not found | CLEAN |
| RETURNING as primary capture | NO: SELECT-before-UPDATE is primary; RETURNING is cross-check | CLEAN |
| proposed_composition_level = 'atom' hardcoded | YES: GPT LOCKED decision, not Agent-chosen | ACCEPTABLE as GPT design constant |
| proposed_management_mode = 'observed' hardcoded | YES: GPT LOCKED | ACCEPTABLE |
| proposed_status = 'active' hardcoded | YES: standard convention | ACCEPTABLE |
| target_collection_primary = 'information_unit' | YES: scope constant | ACCEPTABLE |
| JSON key patterns in fill matrix column names | NO: column concepts from Phase 5A registry | CLEAN |
The only latent risk: the fill matrix uses "Expected column" names (e.g., species_code, display_name, code). These are predictions based on Phase 5A evidence. If the actual column names differ, the Agent must match by concept (semantic registry) rather than by literal name. The §4.2 validation procedure handles this case: the Agent maps introspected columns → registry concepts → fill matrix. If the mapping fails → FIELD_UNRESOLVED_STOP.
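The concept-based matching described here might be sketched as follows. The registry shape (concept name mapped to the set of column names known to carry it) and the concept names in the usage example are assumptions made for illustration; the real semantic registry format is not shown in this report.

```python
class FieldUnresolvedStop(Exception):
    """Hard abort: a column could not be mapped to exactly one concept."""


def map_required_columns(required_columns, registry):
    """Map each introspected required column to exactly one registry concept.

    `registry` is a hypothetical dict: concept -> set of column names.
    """
    mapping = {}
    for column in required_columns:
        concepts = [c for c, names in registry.items() if column in names]
        if len(concepts) != 1:
            raise FieldUnresolvedStop(
                f"FIELD_UNRESOLVED_STOP: {column!r} matched {len(concepts)} concepts"
            )
        mapping[column] = concepts[0]
    return mapping
```

For example, with a registry of `{"semantic_name": {"species_code"}, "entity_prefix": {"code"}}`, the columns `code` and `species_code` resolve cleanly, while an unknown column aborts instead of being guessed.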
Conclusion: no hidden hardcode affects scale. Every value depends on runtime introspection plus GPT-locked constants.
5. Unblock path
Step 1: GPT/User reviews decision memo → approves Option A (or amends) → locks 8 species identity values
Step 2: Opus fills §9 placeholders in 5C1 prompt → produces rev1 (dispatch-ready)
Step 3: GPT final review of 5C1 rev1
Step 4: User GO
Step 5: Agent dispatch 5C1
Step 6: GPT/User reviews 5C1 report → ACCEPT
Step 7: → 5C2 unblocked (plus publication_authority_ref resolution)
6. Confirmation: GPT/User review required
✓ No DB write, no Agent dispatch
✓ Decision memo created with options + recommendations
✓ 5C1 prompt patched with fill policy matrix + SELECT-before-UPDATE + hard STOPs
✓ All species values remain <DECIDED_BY_GPT>
✓ prompt_dispatch_allowed = false
✓ agent_dispatch_allowed = false
Phase 5C1 Species Identity + rev1 Prep Report | Fill policy added | SELECT-before-UPDATE added | FIELD_UNRESOLVED_STOP added | GPT/User decision required | 2026-05-11