dot-iu-cutter v0.2 — BR-2 identity_profile JSONB Discovery (2026-05-15)
document_path: knowledge/dev/laws/dieu44-trien-khai/v0.2-planning/dot-iu-cutter-v0.2-br-2-identity-profile-jsonb-discovery-2026-05-15.md
revision: r1
date: 2026-05-15
author: Agent (Claude Code CLI, Opus 4.7 1M)
phase: v0.2 planning — BR-2 read-only discovery
mutation_performed: false
§1 — Purpose
Resolve BR-2 (reconciliation report §4): inspect public.tac_logical_unit.identity_profile (jsonb, GIN-indexed) to determine whether the v0.1 P0-1 design's companion vocabulary (authority, canonical_address_format_version, etc.) already lives inside the jsonb. The output is purely a finding: is there a risk of double-storage if Phase α adds explicit columns?
§2 — Read-Only Method
queries_executed (all SELECT-only):
- DISTINCT jsonb_object_keys
- frequency per key
- value_type per key (jsonb_typeof)
- ILIKE filter for authority-like / format-version-like / address-like keys
- sample shapes (LIMIT 5 with jsonb_pretty)
no_mutation: TRUE
no_DDL: TRUE
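The report lists the probe types but not their SQL text. A minimal sketch of what such SELECT-only probes might look like follows, wrapped in Python so that a coarse read-only check can be run over them. The query strings, the `id` column, and the names `QUERIES`, `MUTATING`, and `is_read_only` are assumptions for illustration, not the statements actually executed; the keyword check is a sanity test, not a security guard.

```python
import re

# Hypothetical reconstructions of the probe types listed above (PostgreSQL).
# These are NOT the statements that were actually run for this report.
QUERIES = {
    "distinct_keys": """
        SELECT DISTINCT k
        FROM public.tac_logical_unit,
             jsonb_object_keys(identity_profile) AS k
    """,
    "frequency_per_key": """
        SELECT k, count(*) AS row_count
        FROM public.tac_logical_unit,
             jsonb_object_keys(identity_profile) AS k
        GROUP BY 1
        ORDER BY 1
    """,
    "value_type_per_key": """
        SELECT k, jsonb_typeof(identity_profile -> k) AS value_type, count(*) AS row_count
        FROM public.tac_logical_unit,
             jsonb_object_keys(identity_profile) AS k
        GROUP BY 1, 2
        ORDER BY 1
    """,
    "sample_shapes": """
        SELECT id, canonical_address, jsonb_pretty(identity_profile)
        FROM public.tac_logical_unit
        WHERE identity_profile <> '{}'::jsonb
        LIMIT 5
    """,
}

# Whole-word keywords that would indicate DML or DDL.
MUTATING = re.compile(
    r"\b(insert|update|delete|alter|drop|create|truncate|grant|revoke|merge)\b",
    re.IGNORECASE,
)

def is_read_only(sql: str) -> bool:
    """Coarse check: one statement, starts with SELECT, no mutating keyword."""
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:  # reject stacked statements
        return False
    return stripped.lower().startswith("select") and not MUTATING.search(stripped)
```

Running `all(is_read_only(q) for q in QUERIES.values())` before execution is one cheap way to back the `no_mutation` and `no_DDL` claims with a mechanical check.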
§3 — Distinct Top-Level Keys
total_distinct_top_level_keys: 3
keys:
- body_sha256
- canonical_address
- source_span
3.1 Frequency per key
body_sha256 → 27 rows (out of 86)
canonical_address → 27 rows
source_span → 27 rows
Implication: only 27 of 86 rows have a non-trivial identity_profile. The remaining 59 rows hold the empty object '{}': the column is NOT NULL, so the value cannot be NULL, and no top-level keys were found on those rows. The three keys are all-or-nothing: every populated row carries all three, and no row carries only one or two of them.
3.2 Value type per key
body_sha256 → string (27 rows)
canonical_address → string (27 rows)
source_span → object (27 rows; shape: {start_line:int, end_line:int})
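The frequency, value-type, and all-or-nothing findings above can be reproduced offline on decoded JSON. The sketch below is a Python analogue, not the SQL that was run; `jsonb_typeof` here is a hypothetical helper that approximates the PostgreSQL function of the same name, and the input list is illustrative only (the two sample shapes from this report plus three empty objects), not the production counts.

```python
from collections import Counter

def jsonb_typeof(value):
    """Approximate PostgreSQL's jsonb_typeof for an already-decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")

def profile_keys(profiles):
    """Return (rows per key, rows per (key, type), rows per distinct key set)."""
    freq, types, key_sets = Counter(), Counter(), Counter()
    for profile in profiles:
        key_sets[frozenset(profile)] += 1
        for key, value in profile.items():
            freq[key] += 1
            types[(key, jsonb_typeof(value))] += 1
    return freq, types, key_sets

# Illustrative input only; not the production row counts.
PROFILES = [
    {
        "body_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "source_span": {"end_line": 1, "start_line": 1},
        "canonical_address": "D38-DIEU28-ROOT",
    },
    {
        "body_sha256": "87c2850b9bd87854abd5bb9d57576f7fb01cb123fb9bc94888bfa76945e32704",
        "source_span": {"end_line": 8, "start_line": 2},
        "canonical_address": "D38-DIEU28-S0",
    },
    {}, {}, {},
]

freq, types, key_sets = profile_keys(PROFILES)

# All-or-nothing holds when the only key sets seen are "empty" and "every key".
all_or_nothing = set(key_sets) <= {frozenset(), frozenset(freq)}
```

Counting distinct key sets, rather than only per-key frequencies, is what makes the all-or-nothing claim checkable: equal per-key counts alone would not rule out rows carrying different subsets.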
§4 — Targeted Key Searches
4.1 Authority-like keys
ILIKE filters tested: '%authority%', '%enacted%', '%draft%', '%runtime%', '%birth%', '%gate%'
keys_found: NONE (0 rows)
4.2 Format-version-like keys
ILIKE filters tested: '%version%', '%format%', '%schema_v%'
keys_found: NONE (0 rows)
4.3 Address-like keys
ILIKE filters tested: '%address%', '%canonical%', '%path%', '%citation%', '%dieu%', '%section%'
keys_found:
canonical_address → 27 rows
Only canonical_address matches, and its value duplicates the canonical_address column (see §8).
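The three filter sets above can be replayed against the known key list without touching the database. The sketch below translates ILIKE patterns to regular expressions; `like_to_regex` and `ilike_any` are hypothetical helper names, and the translation assumes no escape characters in the patterns. One detail worth keeping in mind: in LIKE and ILIKE, `_` matches any single character, so `'%schema_v%'` is slightly broader than a literal substring search.

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern to a case-insensitive regex (ILIKE semantics).

    '%' matches any run of characters, '_' matches exactly one character.
    Backslash escapes are not handled; none of the patterns here use them.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def ilike_any(keys, patterns):
    """Return the keys that match at least one ILIKE pattern, in input order."""
    regexes = [like_to_regex(p) for p in patterns]
    return [k for k in keys if any(r.fullmatch(k) for r in regexes)]

KEYS = ["body_sha256", "canonical_address", "source_span"]

AUTHORITY = ["%authority%", "%enacted%", "%draft%", "%runtime%", "%birth%", "%gate%"]
FORMAT_VERSION = ["%version%", "%format%", "%schema_v%"]
ADDRESS = ["%address%", "%canonical%", "%path%", "%citation%", "%dieu%", "%section%"]

authority_hits = ilike_any(KEYS, AUTHORITY)
format_version_hits = ilike_any(KEYS, FORMAT_VERSION)
address_hits = ilike_any(KEYS, ADDRESS)
```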
§5 — Sample Shapes (verbatim)
// row id 09e5a5a5-… canonical_address column = "D38-DIEU28-ROOT"
{
"body_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"source_span": {
"end_line": 1,
"start_line": 1
},
"canonical_address": "D38-DIEU28-ROOT"
}
// row id 2bab00a6-… canonical_address column = "D38-DIEU28-S0"
{
"body_sha256": "87c2850b9bd87854abd5bb9d57576f7fb01cb123fb9bc94888bfa76945e32704",
"source_span": {
"end_line": 8,
"start_line": 2
},
"canonical_address": "D38-DIEU28-S0"
}
Observation: the first sample's body_sha256 value (e3b0c4…b7852b855) is the SHA-256 digest of the empty string, a well-known constant. Several "ROOT" or section-header rows store this value, which indicates that the row's body content is empty or heading-only.
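The empty-string observation is easy to confirm with the standard library. In the short sketch below, `EMPTY_SHA256` and `is_empty_body_hash` are hypothetical names introduced for illustration; the digests are the two values shown in the sample shapes.

```python
import hashlib

# SHA-256 digest of zero bytes, computed rather than hard-coded.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()

def is_empty_body_hash(body_sha256: str) -> bool:
    """True when a stored body_sha256 is the digest of an empty body."""
    return body_sha256.lower() == EMPTY_SHA256

root_row_hash = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
s0_row_hash = "87c2850b9bd87854abd5bb9d57576f7fb01cb123fb9bc94888bfa76945e32704"

root_is_empty = is_empty_body_hash(root_row_hash)
s0_is_empty = is_empty_body_hash(s0_row_hash)
```

A check like this could flag heading-only rows during later content-change analysis, since their hash carries no information about body text.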
§6 — Does Authority-Like Data Already Exist?
authority_already_stored_in_identity_profile: NO
authority_already_stored_elsewhere_on_tac_logical_unit: NO
nearest_existing_concept:
- lifecycle_status (text, FK to tac_lu_lifecycle_vocab)
- currently all 86 rows show value 'draft_only'
- SEMANTICALLY DISTINCT from P0-1 authority enum {enacted, draft, runtime}
- lifecycle_status governs publication lifecycle; authority governs Đ0-G birth-gate distinction
v0_2_phase_alpha_implication:
adding an explicit `authority` column to tac_logical_unit is SAFE from double-storage risk
(nothing to reconcile against existing jsonb; only thing to coordinate with is lifecycle_status — and they govern different concerns)
§7 — Does Format-Version-Like Data Already Exist?
format_version_already_stored_in_identity_profile: NO
format_version_already_stored_elsewhere_on_tac_logical_unit: NO
canonical_address_format_implied_by_data: D{doc}-DIEU{N}-{S|ROOT}[-P{n}][-{n}] (consistent across all 86 production rows)
v0_2_phase_alpha_implication:
adding an explicit `canonical_address_format_version` column to tac_logical_unit is SAFE from double-storage risk
(no existing field carries this concept)
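The implied format can be turned into a quick conformance check. The sketch below is one possible reading of `D{doc}-DIEU{N}-{S|ROOT}[-P{n}][-{n}]`, under two assumptions that the report does not state: every placeholder is a run of decimal digits, and `S` may carry digits directly after it, as in the sample address `D38-DIEU28-S0`. `ADDRESS_RE` and `parse_address` are hypothetical names, and the pattern should be checked against all 86 rows before anyone relies on it.

```python
import re

# One possible reading of the implied format; the digit assumptions are
# labeled in the lead-in and are not confirmed by the report.
ADDRESS_RE = re.compile(
    r"D(?P<doc>\d+)"          # D{doc}
    r"-DIEU(?P<dieu>\d+)"     # -DIEU{N}
    r"-(?P<unit>ROOT|S\d*)"   # -ROOT, or -S with optional digits (assumed)
    r"(?:-P(?P<p>\d+))?"      # optional -P{n}
    r"(?:-(?P<n>\d+))?"       # optional -{n}
)

def parse_address(address: str):
    """Return the named parts of a canonical address, or None if it does not fit."""
    match = ADDRESS_RE.fullmatch(address)
    return match.groupdict() if match else None

root_parts = parse_address("D38-DIEU28-ROOT")
s0_parts = parse_address("D38-DIEU28-S0")
```

The match is case-sensitive on purpose: the sample addresses are upper-case, and a stricter check surfaces deviations instead of hiding them.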
§8 — Does canonical_address-Like Data Already Exist Inside the JSONB?
canonical_address_already_stored_in_identity_profile: YES (duplicating the column value)
extent: 27 of 86 rows (the 27 rows whose identity_profile is non-empty)
match: 100% of those 27 cases — the jsonb value equals the column value verbatim
why_duplicated: unclear from inspection alone; likely an artifact of how those 27 rows were ingested (an importer that wrote both the column AND the jsonb mirror)
risk_of_double_storage: LOW — it is the SAME value, not a divergent value
followup_recommendation: cleanup is not blocking for v0.2; can be normalized in a separate cosmetic pass later (drop the jsonb key once readers are confirmed to use only the column)
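The 100% match claim corresponds to a simple comparison between the column and its jsonb mirror. A minimal sketch follows, assuming each row has been fetched as a dict with `id`, `canonical_address`, and a decoded `identity_profile`; `find_mirror_mismatches` and the placeholder row ids are hypothetical.

```python
def find_mirror_mismatches(rows):
    """Return ids of rows whose jsonb canonical_address differs from the column.

    Rows without the jsonb key (the empty-profile rows) are skipped, since
    there is no mirror to compare.
    """
    mismatches = []
    for row in rows:
        profile = row["identity_profile"]
        if "canonical_address" not in profile:
            continue
        if profile["canonical_address"] != row["canonical_address"]:
            mismatches.append(row["id"])
    return mismatches

# Illustrative rows with placeholder ids; the addresses come from the samples.
ROWS = [
    {"id": "row-a", "canonical_address": "D38-DIEU28-ROOT",
     "identity_profile": {"canonical_address": "D38-DIEU28-ROOT"}},
    {"id": "row-b", "canonical_address": "D38-DIEU28-S0",
     "identity_profile": {"canonical_address": "D38-DIEU28-S0"}},
    {"id": "row-c", "canonical_address": "D38-DIEU28-S0",
     "identity_profile": {}},
]

mismatches = find_mirror_mismatches(ROWS)
```

Re-running a check of this kind just before any later cleanup pass would confirm that the mirror is still identical at the time the jsonb key is dropped.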
§9 — Other Observations
body_sha256:
- 27 rows hold a sha256 string of the row's body content
- several rows hold the well-known empty-string sha256 (e3b0c4…)
- this is unrelated to canonical_address but is useful birth-gate evidence; the Phase β supersession design may want to leverage body_sha256 for content-change detection
source_span:
- 27 rows hold {start_line, end_line} int pairs pointing into a source document
- unrelated to canonical_address itself but useful for P0-2 manifest_unit_block design (source_span_start / source_span_end fields are already part of the v0.1 P0-2 design)
- v0.2 manifest_unit_block design CAN read from this source_span jsonb sub-object (if a per-row mirror is wanted) OR re-derive at MARK time
§10 — Recommendation for Phase α
recommendation_for_v0_2_phase_alpha:
  add_authority_column:
    safe: YES
    type: text
    nullable: YES initially (backfill required for 86 existing rows)
    constraint: CHECK or FK to a Đ24 vocabulary table (TBD; see BR-4)
    backfill_rule: derive from lifecycle_status or doc_code; design under BR-4
  add_canonical_address_format_version_column:
    safe: YES
    type: text (version string; note that the candidate default 'd38-v0' is not strict semver, so the exact format is to be settled under BR-5)
    nullable: NO with DEFAULT 'd38-v0' (or whatever Đ24 ratifies under BR-5)
    backfill_for_existing_rows: trivial; all 86 existing rows take the chosen default constant
  add_canonical_address_alias_table:
    safe: YES
    placement: tac schema (per P0-1 §3 recommendation) OR cutter_governance (per X-1 placement decision; revisit)
    no_conflict_with_jsonb: confirmed; no alias-related data lives in identity_profile
  do_NOT_add_columns_that_would_duplicate_jsonb_data:
    body_sha256_as_column: NO; already covered by jsonb. If Phase β needs a column, defer that decision
    source_span_as_columns: NO at the tac_logical_unit level (already in jsonb); the source_span_start/source_span_end fields of the P0-2 design belong to manifest_unit_block, not tac_logical_unit
  cosmetic_cleanup_NOT_blocking:
    consider dropping the `canonical_address` key from identity_profile in a separate cosmetic pass after confirming no reader depends on it; NOT part of Phase α
br_2_blocker_status: RESOLVED
followup_for_br_4: authority column type/vocabulary is TBD — depends on BR-4 (Đ0-G authority backfill rule design)
followup_for_br_5: canonical_address_format_version DEFAULT value depends on BR-5 (Đ24 ratification of production syntax as v1)
§11 — Hard Boundaries
no_DDL_written: TRUE
no_SQL_mutation: TRUE
no_ALTER_TABLE: TRUE
no_INSERT_UPDATE_DELETE: TRUE
no_migration: TRUE
no_design_authored: TRUE (only findings + recommendation; Phase α DDL design happens later under explicit prompt)
output_form: br_2_read_only_discovery
End of BR-2 discovery.