dot-iu-cutter v0.2 — canonical_address Reconciliation Discovery (2026-05-15)
document_path: knowledge/dev/laws/dieu44-trien-khai/v0.2-planning/dot-iu-cutter-v0.2-canonical-address-reconciliation-discovery-2026-05-15.md
revision: r1
date: 2026-05-15
author: Agent (Claude Code CLI, Opus 4.7 1M)
sovereign: User / anh Huyên
verifier: GPT (Đ32 HIGH-risk path — anticipated)
secondary: Opus
phase: v0.2 planning — discovery (read-only)
mutation_performed: false
ddl_written: false
migration_started: false
§1 — Purpose
Read-only discovery of the existing production state of canonical_address in the directus database, to determine whether the v0.1 P0-1 design can be applied as-drafted, must be revised, or must be substantially rethought before any v0.2 DDL is authored.
Inputs that motivated this discovery:
- v0.1 production execution finding F-2 (pre-existing column).
- v0.2 scope backlog item v0.2-D-4 (canonical_address reconciliation).
- v0.1 P0-1 migration design (the intended-design baseline).
§2 — Read-Only Inspection Method
target_environment: VPS 38.242.240.89, production postgres container, db=directus
access: ssh + docker exec -i postgres psql (read-only queries only)
queries_executed:
- information_schema.columns (column metadata)
- pg_indexes (index inventory)
- pg_constraint (constraint inventory)
- SELECT count(*) / COUNT(col) (row stats)
- SELECT col … ORDER BY col (sample values)
queries_NOT_executed:
- any INSERT/UPDATE/DELETE/TRUNCATE
- any CREATE/ALTER/DROP
- any backfill/migration
mutation_performed: NO
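One way to make the read-only discipline above mechanically checkable is a small pre-flight guard on the SQL text before it is piped to psql. The sketch below is illustrative only and is not the tooling used in this discovery; the function name `is_read_only` and the SELECT-only allow-list are assumptions. It is deliberately conservative: anything that does not start with SELECT is rejected, including harmless statements such as CTEs.

```python
import re

# Assumption: only plain SELECT statements are allowed through.
READ_ONLY_LEADING_KEYWORDS = {"SELECT"}

# Statement types listed under queries_NOT_executed.
MUTATING_KEYWORDS = {
    "INSERT", "UPDATE", "DELETE", "TRUNCATE", "CREATE", "ALTER", "DROP",
}


def is_read_only(sql: str) -> bool:
    """Return True only if every statement starts with SELECT and
    contains no mutating keyword as a whole word."""
    # Naive split on ';' (does not understand string literals); this can
    # only cause false rejections, never false approvals of the listed verbs.
    statements = [s.strip() for s in sql.split(";") if s.strip()]
    if not statements:
        return False
    for stmt in statements:
        first_word = stmt.split(None, 1)[0].upper()
        if first_word not in READ_ONLY_LEADING_KEYWORDS:
            return False
        # Whole-word scan, so identifiers like created_at do not match CREATE.
        words = set(re.findall(r"[A-Za-z_]+", stmt.upper()))
        if words & MUTATING_KEYWORDS:
            return False
    return True
```

A guard like this complements, and does not replace, database-side protection such as running the session with `default_transaction_read_only` enabled.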
§3 — Current Production State (verbatim from inspection)
3.1 public.tac_logical_unit.canonical_address — column metadata
column_name: canonical_address
data_type: text
character_maximum_length: NULL (unlimited)
is_nullable: NO
column_default: (none)
ordinal_position: 2 (immediately after id; first-class field)
3.2 public.tac_logical_unit — full column list (current production)
| # | column | type | nullable |
|---|---|---|---|
| 1 | id | uuid | NO |
| 2 | canonical_address | text | NO |
| 3 | doc_code | text | NO |
| 4 | parent_id | uuid | YES |
| 5 | sort_order | integer | NO |
| 6 | section_type | text | NO |
| 7 | section_code | text | YES |
| 8 | owner | text | NO |
| 9 | identity_profile | jsonb | NO |
| 10 | tier | text | YES |
| 11 | lifecycle_status | text | NO |
| 12 | created_at | timestamptz | NO |
| 13 | updated_at | timestamptz | NO |
3.3 Indexes on public.tac_logical_unit
tac_logical_unit_pkey UNIQUE btree (id)
tac_logical_unit_canonical_address_key UNIQUE btree (canonical_address)
idx_tac_lu_doc_code btree (doc_code)
idx_tac_lu_identity_profile_gin GIN (identity_profile)
idx_tac_lu_lifecycle btree (lifecycle_status)
idx_tac_lu_parent (partial) btree (parent_id) WHERE parent_id IS NOT NULL
idx_tac_lu_section_type btree (section_type)
3.4 Constraints on public.tac_logical_unit
tac_logical_unit_pkey PRIMARY KEY (id)
tac_logical_unit_canonical_address_key UNIQUE (canonical_address) ← global uniqueness ALREADY enforced
tac_logical_unit_parent_id_fkey FK (parent_id) → tac_logical_unit(id)
tac_logical_unit_section_type_fkey FK (section_type) → tac_section_type_vocab(code)
tac_logical_unit_lifecycle_status_fkey FK (lifecycle_status) → tac_lu_lifecycle_vocab(code)
tac_logical_unit_sort_order_check CHECK (sort_order >= 0)
3.5 Row statistics
row_total: 86
canonical_address_non_null: 86
canonical_address_null: 0
distinct_canonical_address_values: 86 (100% unique; matches DB unique constraint)
lifecycle_distribution_sampled: all 86 sampled rows show lifecycle_status='draft_only'
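The relationship between these four figures can be expressed as a small helper: when the distinct count equals the non-null count, no duplicate exists, which is what the UNIQUE constraint guarantees. This is a sketch over an in-memory list of values, not the query that was run; the function name `address_stats` is an assumption.

```python
from typing import Dict, Iterable, Optional


def address_stats(values: Iterable[Optional[str]]) -> Dict[str, int]:
    """Compute the row statistics reported above from a column of values."""
    rows = list(values)
    non_null = [v for v in rows if v is not None]
    return {
        "row_total": len(rows),
        "canonical_address_non_null": len(non_null),
        "canonical_address_null": len(rows) - len(non_null),
        "distinct_canonical_address_values": len(set(non_null)),
    }


# Tiny illustrative input (not production data beyond the two sample addresses).
stats = address_stats(["D38-DIEU28-ROOT", "D38-DIEU28-S0", None])
```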
3.6 Sample values (86 rows in production — full inventory)
Pattern observed: D{doc_num}-DIEU{N}-{S{seg}|ROOT}[-P{N}][-{N}]
D38-DIEU28-ROOT, D38-DIEU28-S0, D38-DIEU28-S1, D38-DIEU28-S1-P1, D38-DIEU28-S1-P2,
D38-DIEU28-S2, D38-DIEU28-S2-P1..P5, D38-DIEU28-S3, D38-DIEU28-S3-P1..P5,
D38-DIEU28-S4..S11, D38-DIEU28-S8-P1, D38-DIEU28-S8-P2,
D38-DIEU32-ROOT, D38-DIEU32-S0..S9, D38-DIEU32-S2-P1..P4, D38-DIEU32-S3-P1..P5, D38-DIEU32-S4-P1..P3,
D38-DIEU35-ROOT, D38-DIEU35-S0..S15, D38-DIEU35-S4-P1, D38-DIEU35-S4-P1-1..P1-3,
D38-DIEU35-S4-P2..P4, D38-DIEU35-S6-P1..P7, D38-DIEU35-S8-P1..P5
Document scopes present: D38-DIEU28, D38-DIEU32, D38-DIEU35 only (Đ44 itself is NOT in tac_logical_unit yet).
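The observed pattern can be transcribed literally into a regular expression, which is useful for checking whether a given string follows the production shape. This is a minimal sketch, not a parser that exists in the system: the names `parse_address` and `ParsedAddress`, and the field labels `segment`, `paragraph`, and `item` for the S, P, and trailing numeric tokens, are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Literal transcription of: D{doc_num}-DIEU{N}-{S{seg}|ROOT}[-P{N}][-{N}]
# As written, the pattern also admits shapes not seen in the sample
# (for example ROOT followed by -P1); the regex keeps that looseness.
ADDRESS_RE = re.compile(
    r"D(?P<doc>\d+)-DIEU(?P<dieu>\d+)-(?:ROOT|S(?P<seg>\d+))"
    r"(?:-P(?P<para>\d+))?(?:-(?P<item>\d+))?"
)


class ParsedAddress(NamedTuple):
    doc: int
    dieu: int
    segment: Optional[int]    # None when the token is ROOT
    paragraph: Optional[int]  # hypothetical label for the P{N} token
    item: Optional[int]       # hypothetical label for the trailing {N} token


def parse_address(address: str) -> Optional[ParsedAddress]:
    """Return the parsed components, or None if the shape does not match."""
    m = ADDRESS_RE.fullmatch(address)
    if m is None:
        return None

    def to_int(name: str) -> Optional[int]:
        value = m.group(name)
        return int(value) if value is not None else None

    return ParsedAddress(
        int(m.group("doc")), int(m.group("dieu")),
        to_int("seg"), to_int("para"), to_int("item"),
    )
```

For example, `parse_address("D38-DIEU35-S4-P1-3")` yields document 38, article 35, segment 4, paragraph 1, item 3, while a P0-1 style string such as `"Đ44 §5.3.1"` yields `None`.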
3.7 Sister columns named canonical_address (system-wide)
public.tac_logical_unit text NOT NULL (86 rows; the table this reconciliation is about)
public.information_unit text NOT NULL (98 rows non-null)
public.unit_edit_draft text NOT NULL (13 rows non-null)
public.iu_notification_event text NOT NULL (row count not measured separately; NOT NULL)
public.event_outbox text NOT NULL (44,726 rows non-null)
public.event_pending text NOT NULL (0 rows non-null at inspection time)
public.birth_registry text NULLABLE (0 rows non-null at inspection time)
sandbox_tac.logical_unit text (sandbox schema)
canonical_address is a system-wide vocabulary, not a tac_logical_unit-only field. Any reconciliation impacts 7+ tables, of which event_outbox is the largest (44,726 rows).
3.8 FKs referencing tac_logical_unit
tac_logical_unit (parent_id → id) self-FK (hierarchy)
tac_change_set_member (logical_unit_id → id) data
tac_publication_member (logical_unit_id → id) data
tac_unit_version (logical_unit_id → id) data
All cross-table references use tac_logical_unit.id (uuid PK), not canonical_address as a foreign-key target. The sister-table canonical_address columns are denormalized text mirrors, not FK references.
3.9 Companion fields proposed by v0.1 P0-1 design — production state
| Proposed by P0-1 §4.1 | Present in production today? |
|---|---|
| canonical_address | YES (text, NOT NULL, UNIQUE) |
| canonical_address_format_version | NO |
| authority (enacted/draft/runtime) | NO column; lifecycle_status exists with FK to tac_lu_lifecycle_vocab(code); observed value draft_only (different vocabulary) |
| birth_gate_class | NO |
| address_collision_at | NO |
| superseded_by_unit_id | NO |
| supersedes_unit_id | NO |
| canonical_address_alias table | NO |
Other columns NOT in the P0-1 design but present in production: doc_code, sort_order, section_code, owner, identity_profile (jsonb), tier. The identity_profile jsonb column (GIN-indexed) may already hold some of the address-related metadata the P0-1 design intended to break out into discrete columns; this is NOT verified in this discovery — leave as v0.2 design question §6.4.
§4 — Does the Existing Column Match v0.1 / P0-1 Intended Design?
match_verdict: NO — substantial mismatch on multiple axes
4.1 Format syntax — DIVERGENT
p0_1_design_syntax_for_law_artifacts: "Đ{N}[ §{path}]" example: "Đ44 §5.3.1"
production_actual_syntax: "D{doc}-DIEU{N}-{S{n}|ROOT}[-P{n}][-{n}]" example: "D38-DIEU28-S2-P1"
differences:
- prefix: production uses "D{doc}-" outer prefix that scopes by source document (e.g., D38 = document 38); P0-1 design has no document-scope prefix
- article marker: production uses "DIEU{N}" (ASCII spelling of "Điều"); P0-1 design uses Vietnamese diacritic "Đ{N}"
- separator: production uses hyphens with token labels (S, P); P0-1 design uses "§" + dot-separated path
- depth notation: production uses positional codes (S2-P1-3); P0-1 design uses dot-path under §
- non-law artifact formats (design D6/D11, code symbols) per P0-1 §6: NO evidence in production data — only law-document addresses currently exist
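The divergence between the two syntaxes can be made concrete with a classifier that reports which shape a string follows. This is a sketch under stated assumptions: the P0-1 `{path}` is taken to be dot-separated integers (as in the example "Đ44 §5.3.1"), and the function name `classify_syntax` and its return labels are hypothetical.

```python
import re

# Production shape: D{doc}-DIEU{N}-{S{n}|ROOT}[-P{n}][-{n}]
PRODUCTION_RE = re.compile(r"D\d+-DIEU\d+-(?:ROOT|S\d+)(?:-P\d+)?(?:-\d+)?")

# P0-1 design shape: Đ{N}[ §{path}]
# Assumption: {path} is one or more integers separated by dots.
P0_1_RE = re.compile(r"Đ\d+(?: §\d+(?:\.\d+)*)?")


def classify_syntax(address: str) -> str:
    """Return 'production', 'p0_1_design', or 'unknown' for an address string."""
    if PRODUCTION_RE.fullmatch(address):
        return "production"
    if P0_1_RE.fullmatch(address):
        return "p0_1_design"
    return "unknown"
```

No string can match both patterns, since one requires an ASCII "D" followed by a hyphenated "DIEU" token and the other requires the diacritic "Đ", so any component that accepts both shapes would need an explicit translation rule, which this discovery does not define.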
4.2 Uniqueness scope — DIVERGENT
p0_1_design_recommendation: per (authority, source_revision); application-layer v0.1; PG constraint FUTURE
production_actual: GLOBAL uniqueness; PG constraint ALREADY in place (tac_logical_unit_canonical_address_key)
Production is stricter than the P0-1 design recommended. This is operationally safer but means:
- the design's collision policy §8 (multiple rows per address with different authority) is INCOMPATIBLE with the current DB constraint
- supersession via supersedes/superseded_by FK columns (P0-1 §4.1) requires the CURRENT and PRIOR units to carry different canonical_address values, because two rows cannot hold the same canonical_address at the same time
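The incompatibility can be reproduced in miniature. The sketch below uses an in-memory SQLite table as a stand-in for PostgreSQL (an assumption made so the example is self-contained) and a hypothetical `authority` column that does not exist in production. It tries to store the same address under two authorities, once with the global constraint and once with a composite one.

```python
import sqlite3


def try_two_authorities(unique_clause: str) -> bool:
    """Return True if two rows sharing one canonical_address but differing
    in authority can coexist under the given table-level UNIQUE clause."""
    conn = sqlite3.connect(":memory:")
    # Toy table; 'authority' is hypothetical and absent from production.
    conn.execute(
        "CREATE TABLE lu ("
        "id INTEGER PRIMARY KEY, "
        "canonical_address TEXT NOT NULL, "
        "authority TEXT NOT NULL, "
        f"{unique_clause})"
    )
    try:
        conn.execute(
            "INSERT INTO lu (canonical_address, authority) "
            "VALUES ('D38-DIEU28-S2-P1', 'draft')"
        )
        conn.execute(
            "INSERT INTO lu (canonical_address, authority) "
            "VALUES ('D38-DIEU28-S2-P1', 'enacted')"
        )
        return True
    except sqlite3.IntegrityError:
        # The second insert violated the uniqueness rule.
        return False
    finally:
        conn.close()


# Shape of the constraint that production enforces today.
global_unique = try_two_authorities("UNIQUE (canonical_address)")
# Shape that a per-authority collision policy would need.
composite_unique = try_two_authorities("UNIQUE (canonical_address, authority)")
```

Under the global clause the second insert is rejected; under the composite clause both rows are accepted. That is the gap between the current constraint and the P0-1 collision policy.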
4.3 Authority distinction — ABSENT
P0-1 §7 requires an authority text column with values {enacted, draft, runtime} (Đ24 Step 1 ratified vocabulary, cross-law with Đ0-G).
Production has no authority column. The closest existing field is lifecycle_status (FK to tac_lu_lifecycle_vocab(code)), with observed value draft_only. This is a different vocabulary serving a different concern (publication lifecycle vs birth-gate authority).
4.4 Supersession / alias model — ABSENT
P0-1 §4.1 requires superseded_by_unit_id + supersedes_unit_id FKs and §4.2 proposes a canonical_address_alias companion table for rename history. Neither exists in production.
The production model treats canonical_address as immutable and globally unique with no formal supersession or alias mechanism wired to the schema.
4.5 Format-version mechanism — ABSENT
P0-1 §4.1 introduces canonical_address_format_version for forward-compat migration of the format syntax. Production has no such column. The observed format (D38-DIEU28-S2-P1) is implicit v0; any P0-1 design must either adopt this v0 format as canonical or version-stamp existing rows during reconciliation.
4.6 Sister-table coupling — UNANTICIPATED BY P0-1
P0-1 §3 declared the design scope as augmenting tac_logical_unit only. In production, 7 OTHER tables already store canonical_address as duplicated text (98 + 13 + 0 + 44,726 + 0 + 0 + …). Any change to the canonical syntax, uniqueness rule, or format version would need a coordinated migration across all of them — chiefly event_outbox (44,726 rows).
This is not a backfill problem inside tac_logical_unit (86 rows). It is a system-wide vocabulary problem with 44,000+ rows of coupled data.
§5 — Risks
5.1 Risks if REUSED as-is (no changes to existing column)
R-reuse-1 format_mismatch_with_P0_design
severity: HIGH
description: P0-2 manifest_unit_block, P0-3 cut_change_set, P0-4 verify_result all cite canonical_address per row. If production format ("D38-DIEU28-S2-P1") is used unchanged, the entire P0 design's address-syntax assumption (Đ-prefix + §) must be revised. JSON-schemas, parser logic, citation discipline all anchor on the syntax.
mitigation: revise P0-1/P0-2/P0-4/P0-6 design files to adopt the production syntax as canonical.
R-reuse-2 authority_distinction_unsupported
severity: HIGH
description: P0-design's birth-gate authority (enacted/draft/runtime) cannot be derived from the existing schema. Multiple verification paths in P0-4 depend on the authority being known.
mitigation: add authority column in v0.2 (this is additive and safe); decide how to populate (default 'draft', backfill rule via lifecycle_status or identity_profile.jsonb introspection).
R-reuse-3 global_uniqueness_constraint_incompatible_with_collision_policy
severity: standard
description: P0-1 §7 allows multiple rows with same canonical_address differing by authority; the existing UNIQUE constraint forbids this. P0-1's collision policy §8 needs revision.
mitigation: adopt global-uniqueness policy and drop the multi-authority-same-address proposal; or augment the unique constraint with authority (UNIQUE (canonical_address, authority)) — a constraint change.
R-reuse-4 supersession_model_unimplemented
severity: standard
description: superseded_by_unit_id / supersedes_unit_id / canonical_address_alias do not exist in production. P0-design uses them as the rename pathway.
mitigation: defer to a separate v0.2 sub-item; alias storage decision (table vs JSONB) is P0-1 §9 item 4 — still open.
R-reuse-5 sister_table_drift
severity: HIGH (because of 44,726 event_outbox rows)
description: tac_logical_unit.canonical_address and event_outbox.canonical_address are coupled; if tac changes format, event_outbox rows referencing prior addresses become stale.
mitigation: any format change must be planned with a coordinated cross-table migration; reuse-as-is avoids this risk entirely.
5.2 Risks if ALTERED (e.g., change type/constraint/format on existing column)
R-alter-1 reader_writer_impact
severity: HIGH
description: public.fn_event_unread, the existing INSERT trigger that tests `COALESCE(NEW.canonical_address, '') = ''`, and any other code reading/writing this column on tac_logical_unit, information_unit, event_outbox, etc. would all need verification under the new shape.
mitigation: inventory all callers; v0.2 design must read the trigger and function source; do not alter without that inventory.
R-alter-2 large_table_rewrite
severity: HIGH
description: event_outbox has 44,726 rows; rewriting/migrating canonical_address across that table is a potentially long-running operation.
mitigation: scope alteration to tac_logical_unit only; or stage migration with a backfill window.
R-alter-3 unique_constraint_change
severity: standard
description: changing the unique constraint scope (e.g., to (canonical_address, authority)) requires DROP/REBUILD of the unique index; readers see a brief window without the unique guarantee. Concurrent inserts during the window could create duplicates.
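mitigation: do the constraint change inside a single transaction with an explicit table lock; or build the replacement with CREATE UNIQUE INDEX CONCURRENTLY, attach it via ADD CONSTRAINT ... UNIQUE USING INDEX, and only then DROP the old constraint.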
R-alter-4 application_code_in_flight
severity: standard
description: Nuxt, dot, agent-data application code reads canonical_address; deploying a schema change without a coordinated app-version handshake creates a window of incompatibility.
mitigation: any altering reconciliation must couple a schema change to an app deploy/handshake — out of scope of v0.2 P0 work; belongs to v0.3+.
5.3 Risks if IGNORED (proceed with v0.2 without reconciling)
R-ignore-1 double_truth_emerges
severity: HIGH
description: if v0.2 adds new address-related columns (e.g., canonical_address_v2, address_authority) without addressing existing canonical_address, the system ends up with two address sources of truth — readers must choose, citations diverge, retrieval is non-deterministic.
mitigation: do NOT add new address-named columns until reconciliation is decided.
R-ignore-2 manifest_envelope_unit_block_design_blocked
severity: HIGH
description: P0-2 (manifest) references canonical_address per row. Without a decision on which address shape v0.2 binds to, P0-2 cannot be designed correctly.
mitigation: reconciliation must complete BEFORE P0-2 dry-run authoring.
R-ignore-3 verify_round_trip_ambiguous
severity: HIGH
description: P0-4 axis-1 round-trip ordering uses canonical_address; ambiguous address vocabulary breaks the verify guarantee.
mitigation: reconciliation must complete BEFORE P0-4 design refresh.
R-ignore-4 governance_signal_misleading
severity: standard
description: decision_backlog entries, change-set payloads, and verify_result findings would cite an address shape that is not the canonical one used elsewhere in the system.
mitigation: address shape must be a single decided form before any CUT/VERIFY work begins.
§6 — Findings Carried Forward
F-DISC-1 production format syntax (D{doc}-DIEU{N}-{S|ROOT}[-P{N}][-{N}]) differs fundamentally from P0-1 design syntax (Đ{N}[ §{path}]); v0.2 must reconcile syntax before P0-2/P0-4 design refresh.
F-DISC-2 canonical_address already enforced text NOT NULL UNIQUE at DB level — stricter than P0-1 design proposed; P0-1 collision policy §8 must be revised.
F-DISC-3 authority column does not exist; lifecycle_status (FK vocab) serves a different concern; v0.2 must decide authority introduction strategy.
F-DISC-4 canonical_address is system-wide vocabulary spanning 7 production tables including event_outbox (44,726 rows); reconciliation scope is far larger than P0-1 imagined.
F-DISC-5 no supersession / no alias / no format_version columns present; P0-1 design treated them as additive — they remain additive, but must be specified in coordination with the existing globally-unique column.
F-DISC-6 identity_profile jsonb column on tac_logical_unit (GIN-indexed) may already store some address-related metadata; UNVERIFIED in this discovery; v0.2 design must inspect.
F-DISC-7 all 86 rows currently lifecycle_status='draft_only'; no published units exist; reconciliation may proceed under low-traffic conditions but app reader/writer audit still required.
F-DISC-8 no FK uses canonical_address as the target; cross-table references use tac_logical_unit.id (uuid). Address-text fields on sister tables are denormalized mirrors.
§7 — What This Discovery Does NOT Do
no_DDL_written: TRUE
no_SQL_mutation: TRUE
no_ALTER_TABLE: TRUE
no_INSERT_UPDATE_DELETE: TRUE
no_migration: TRUE
no_change_to_tac_logical_unit: TRUE
no_change_to_cutter_governance: TRUE
no_deploy: TRUE
no_CUT_VERIFY: TRUE
no_v0_2_design_advanced: TRUE (this file remains discovery-only; design happens in subsequent docs after option selection + GPT approval)
output_form: read_only_discovery_documentation
§8 — Cross-References
v0_1_P0_1_design_baseline:
knowledge/dev/laws/dieu44-trien-khai/migration-design/dot-iu-cutter-v0.1-p0-1-canonical-address-migration-design-2026-05-15.md
v0_1_production_handoff:
knowledge/dev/laws/dieu44-trien-khai/execution/dot-iu-cutter-v0.1-production-handoff-status-2026-05-15.md
v0_1_p0_schema_planning_package:
knowledge/dev/laws/dieu44-trien-khai/planning/dot-iu-cutter-v0.1-p0-schema-planning-package-2026-05-15.md
v0_2_scope_backlog:
knowledge/dev/laws/dieu44-trien-khai/planning/dot-iu-cutter-v0.2-scope-backlog-2026-05-15.md
v0_2_canonical_address_options:
knowledge/dev/laws/dieu44-trien-khai/v0.2-planning/dot-iu-cutter-v0.2-canonical-address-reconciliation-options-2026-05-15.md
v0_2_canonical_address_report:
knowledge/dev/laws/dieu44-trien-khai/v0.2-planning/dot-iu-cutter-v0.2-canonical-address-reconciliation-report-2026-05-15.md
production_pre_migration_schema_evidence (canonical_address pre-existing):
/opt/incomex/backups/dieu44_exec_2026-05-15/directus_schema_pre_20260515T141429Z.sql sha256 638307fd62d4b1aa087ce7f70f42112c4c6185a2e44d8144a1d859029515668a
End of discovery document.