KB-5DE0

P10A-1 — D35 Segmentation Candidate (Read-Only Discovery, 2026-04-29)

Revision 1
Tags: p10a, dieu-35, dieu-38, segmentation, tac, read-only, s187

Date: 2026-04-29
Captured (UTC): 2026-04-29T04:02:21Z
Agent: Claude (Opus 4.7) executing P10A-1 dispatch
Mode: ZERO MUTATION. SELECT only on pg_catalog / information_schema; SELECT only on tac_*_vocab rows. No INSERT/UPDATE/DELETE/DDL touched production data.
DB verify: current_database = directus, current_user = directus


0. Outcome Snapshot

| # | PASS criterion | Result |
|---|---|---|
| 1 | Schema discovery (14 tac_* full cols + detailed FK/CHECK/trigger/privilege for 6 core/member + 7 vocabs) | ✅ PASS |
| 2 | Source snapshot (path, revision, SHA256, captured_at) | ✅ PASS |
| 3 | Batch marker identified | ✅ PASS: tac_publication.id (uuid, PK) |
| 4 | Execution-path discovery (privileges + trigger defs + enabled status + fn owner/secdef, SELECT-only) | ✅ PASS |
| 5 | Segmentation candidate JSON (17 units, all section_type ∈ vocab) | ✅ PASS |
| 6 | Report uploaded to KB | ✅ (this file) |

Overall: 6/6 PASS. STOP after upload.


1. Source Snapshot — Điều 35 v5.2 FINAL

| Field | Value |
|---|---|
| Path | knowledge/dev/laws/dieu35-dot-governance-law.md |
| Revision | 13 |
| Body bytes (UTF-8) | 39,938 |
| Body SHA256 | 4353ec6d453411a7c8e207658bbc4457d00f99747cba90551c8a4926894d2e5c |
| Captured at (UTC) | 2026-04-29T04:02:21Z |
| Method | GET /api/documents/<urlencoded-path>?full=true (Bearer AGENT_DATA_API_KEY) |
| Title | ĐIỀU 35: LUẬT QUẢN TRỊ DOT — v5.2 FINAL (BAN HÀNH 2026-04-18 S178 Fix 15) |
| Tags | law, dieu-35, dot, governance, enacted, v5.2, final, s178-fix15, s178-fix23, fix_repair_dot, ops-code |
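The size and digest recorded above can be re-checked whenever the source body is fetched again. The following is a minimal sketch, assuming the body is already available as raw UTF-8 bytes; the function name verify_snapshot and the stand-in sample text are illustrative and not part of the P10A-1 tooling.

```python
import hashlib


def verify_snapshot(body: bytes, expected_sha256: str, expected_bytes: int) -> bool:
    """Return True only when the body matches both the recorded size and digest."""
    if len(body) != expected_bytes:
        return False
    return hashlib.sha256(body).hexdigest() == expected_sha256.lower()


# Stand-in body for illustration only (not the real Điều 35 text).
sample = "ĐIỀU 35 sample body".encode("utf-8")
sample_digest = hashlib.sha256(sample).hexdigest()

print(verify_snapshot(sample, sample_digest, len(sample)))
```

For the real snapshot, the expected values would be the Body SHA256 and Body bytes recorded in this section.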

2. Schema Snapshot — 14 tac_* Tables (full columns)

Tables (14): tac_birth_gate_config, tac_change_set, tac_change_set_member, tac_cs_lifecycle_vocab, tac_logical_unit, tac_lu_lifecycle_vocab, tac_pub_lifecycle_vocab, tac_publication, tac_publication_member, tac_publication_type_vocab, tac_review_state_vocab, tac_section_type_vocab, tac_unit_version, tac_uv_lifecycle_vocab.

2.1 tac_logical_unit (13 cols)

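| col | type | nn | default | constraint |
|---|---|---|---|---|
| id | uuid | Y | gen_random_uuid() | |
| canonical_address | text | Y | | |
| doc_code | text | Y | | |
| parent_id | uuid | N | | |
| sort_order | int | Y | 0 | |
| section_type | text | Y | (none) | FK→tac_section_type_vocab |
| section_code | text | N | | |
| owner | text | Y | | |
| identity_profile | jsonb | Y | '{}' | |
| tier | text | N | | |
| lifecycle_status | text | Y | 'draft_only' | FK→tac_lu_lifecycle_vocab |
| created_at | timestamptz | Y | now() | |
| updated_at | timestamptz | Y | now() | |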

2.2 tac_unit_version (20 cols)

| col | type | nn | default | constraint |
|---|---|---|---|---|
| id | uuid | Y | gen_random_uuid() | |
| logical_unit_id | uuid | Y | | FK→tac_logical_unit |
| version_number | int | Y | 1 | CHECK >0 |
| title | text | Y | | |
| body | text | N | | |
| description | text | N | | |
| content_hash | text | N | | |
| lifecycle_status | text | Y | 'draft' | FK→tac_uv_lifecycle_vocab |
| review_state | text | Y | 'unreviewed' | FK→tac_review_state_vocab |
| length_flag | text | Y | 'normal' | CHECK in {normal,soft_limit,hard_limit} |
| length_exception_reason | text | N | | |
| content_profile | jsonb | Y | '{}' | |
| editor | text | N | | |
| provenance | text | Y | 'PROV-AI' | |
| vector_sync_status | text | Y | 'pending' | CHECK in {pending,synced,stale,error} |
| vector_synced_at | timestamptz | N | | |
| vector_chunk_count | int | Y | 0 | CHECK ≥0 |
| created_at | timestamptz | Y | now() | |
| updated_at | timestamptz | Y | now() | |
| enacted_at | timestamptz | N | (none) | |

UNIQUE(logical_unit_id, version_number).

2.3 tac_publication (15 cols)

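| col | type | nn | default | constraint |
|---|---|---|---|---|
| id | uuid | Y | gen_random_uuid() | |
| doc_code | text | Y | | |
| version | text | Y | | |
| publication_type | text | Y | | FK→tac_publication_type_vocab |
| name | text | Y | | |
| owner | text | Y | | |
| description | text | N | | |
| lifecycle_status | text | Y | 'proposed' | FK→tac_pub_lifecycle_vocab |
| enacted_at | timestamptz | N | | |
| council_score | numeric | N | | |
| approved_by | text | N | | |
| risk_tier | text | Y | 'medium' | CHECK in {low,medium,high,highest} |
| publication_profile | jsonb | Y | '{}' | |
| created_at | timestamptz | Y | now() | |
| updated_at | timestamptz | Y | now() | |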

UNIQUE(doc_code, version).

2.4 tac_publication_member (6 cols)

| col | type | nn | default | constraint |
|---|---|---|---|---|
| id | uuid | Y | gen_random_uuid() | |
| publication_id | uuid | Y | | FK→tac_publication |
| logical_unit_id | uuid | Y | | FK→tac_logical_unit |
| unit_version_id | uuid | Y | | FK→tac_unit_version |
| render_order | int | Y | 0 | CHECK ≥0 |
| created_at | timestamptz | Y | now() | |

UNIQUE(publication_id, logical_unit_id).

2.5 tac_change_set (10 cols) [excluded as P10A-2 target per dispatch]

cols: id (uuid, PK), publication_id (uuid, FK→tac_publication, nullable), scope_description (text NN), lifecycle_status (text NN, FK→tac_cs_lifecycle_vocab, default 'draft'), apr_ref (text), owner (text NN), created_at, updated_at, submitted_at, enacted_at.

2.6 tac_change_set_member (8 cols) [excluded]

cols: id (uuid, PK), change_set_id (uuid, FK→tac_change_set, NN), logical_unit_id (uuid, FK→tac_logical_unit, NN), change_type (text NN; CHECK in {create,new_version,retire,structural}), old_version_id (uuid, FK→tac_unit_version), new_version_id (uuid, FK→tac_unit_version), snapshot_data (jsonb), created_at. UNIQUE(change_set_id, logical_unit_id).

2.7 Vocab tables (cols summary)

tac_section_type_vocab (PK code; 17 rows; cols: code, name, description, lifecycle_status default 'active' CHECK in {active,deprecated,retired}, owner, soft_limit_words default 500, hard_limit_words default 1500, description_required default true, body_required default true, created_at, updated_at; CHECK soft>0, CHECK hard>soft).

tac_publication_type_vocab (PK code; 10 rows; cols include lifecycle_status, default_risk_tier with CHECK in {low,medium,high,highest}).

Lifecycle vocabs (tac_lu_lifecycle_vocab 3 rows, tac_uv_lifecycle_vocab 4 rows, tac_pub_lifecycle_vocab 4 rows, tac_cs_lifecycle_vocab 7 rows, tac_review_state_vocab 5 rows): cols code (PK), name NN, description, sort_order int NN default 0 (CHECK ≥0), created_at, updated_at.

tac_birth_gate_config (PK checker_id; cols mode default 'block' CHECK in {block,warn}, enabled bool default true, rationale, created_at, updated_at).

2.8 Vocab values (vocab data)

tac_section_type_vocab (17): heading, article, paragraph, definition, principle, rationale, process, technical_spec, governance_process, checklist, instruction_block, reference_mapping, matrix, invariant_list, open_decision_list, appendix, changelog. (All lifecycle_status='active'.)

tac_publication_type_vocab (10): law (highest), policy (high), sop (high), constitution (highest), knowledge (medium), design_note (medium), report (low), memo (low), draft (low), working (low).

tac_lu_lifecycle_vocab (3): active, draft_only, retired.

tac_uv_lifecycle_vocab (4): draft, enacted, superseded, retired.

tac_pub_lifecycle_vocab (4): proposed, enacted, superseded, retired.

tac_review_state_vocab (5): unreviewed, in_review, review_passed, review_failed, needs_re_review.

tac_cs_lifecycle_vocab (7): draft, submitted, review_passed, approval_passed, enacted, rejected, withdrawn.

2.9 Constraints (FK/CHECK/UNIQUE/PK) — 6 core/member tables

Captured inline in §2.1–§2.6. All FKs point to vocab tables or parent entities; there are no cross-schema links. Every vocab FK references the vocab table's code primary key.

2.10 Triggers on tac_* tables (catalog only, not fired)

| Table | Trigger | Timing | Events | Function | Enabled |
|---|---|---|---|---|---|
| tac_logical_unit | trg_tac_birth_gate_lu | BEFORE | INSERT, UPDATE | fn_tac_birth_gate_lu | O (enabled) |
| tac_unit_version | trg_tac_birth_gate_uv | BEFORE | INSERT | fn_tac_birth_gate_uv | O |
| tac_unit_version | trg_tac_uv_compute_derived | BEFORE | INSERT, UPDATE | fn_tac_uv_compute_derived | O |
| tac_unit_version | trg_tac_enacted_immut | BEFORE | UPDATE, DELETE | fn_tac_enacted_immut | O |
| tac_publication_member | trg_tac_pm_consistency | BEFORE | INSERT, UPDATE | fn_tac_pm_consistency | O |
| tac_publication_member | trg_tac_pm_enacted_lock | BEFORE | INSERT, UPDATE, DELETE | fn_tac_pm_enacted_lock | O |

Trigger functions (owner / SECURITY DEFINER):

  • fn_tac_birth_gate_lu — owner=directus, secdef=true
  • fn_tac_birth_gate_uv — owner=directus, secdef=true
  • fn_tac_uv_compute_derived — owner=directus, secdef=true
  • fn_tac_enacted_immut — owner=directus, secdef=true
  • fn_tac_pm_consistency — owner=directus, secdef=true
  • fn_tac_pm_enacted_lock — owner=directus, secdef=true
  • fn_tac_log_checker_issue — owner=workflow_admin, secdef=true (helper, no direct trigger binding)

Trigger firing intentionally NOT verified in this read-only step.


3. Execution-Path Discovery (read-only)

Privileges (SELECT-only catalog query, information_schema.role_table_grants): role directus holds every table-level privilege (SELECT, INSERT, UPDATE, DELETE, REFERENCES, TRIGGER, TRUNCATE) on all 14 tac_* tables. Roles workflow_admin and incomex do not appear in role_table_grants for tac_* (no grants found via the queried filter).

Implication for P10A-2 (NOT executed here): Whichever role does the inserts must possess INSERT on tac_logical_unit, tac_unit_version, tac_publication, tac_publication_member. Catalog confirms directus does. Note from dispatch R4#4: the read-only verification used directus; this does NOT certify that directus is the appropriate execution role for P10A-2 — that decision is out of scope here.

Trigger surface that will fire on P10A-2 inserts (catalog-derived, not tested):

  • INSERT into tac_logical_unit → fn_tac_birth_gate_lu (BEFORE)
  • INSERT into tac_unit_version → fn_tac_birth_gate_uv + fn_tac_uv_compute_derived (BEFORE)
  • INSERT into tac_publication_member → fn_tac_pm_consistency + fn_tac_pm_enacted_lock (BEFORE)

P10A-2 must satisfy these gates a priori (e.g. body_required / description_required per tac_section_type_vocab row, length_flag, content_hash if computed by fn_tac_uv_compute_derived).


4. Batch Marker (Rollback Anchor)

Chosen marker: tac_publication.id (uuid, primary key, default gen_random_uuid()).

Rationale: every downstream row created in P10A-2 chains back to one publication row via FK:

  • tac_publication_member.publication_id → tac_publication.id
  • tac_unit_version.logical_unit_id → tac_logical_unit.id (LU rows joined to publication via publication_member).

Rollback recipe (NOT executed here):

-- Given chosen pub_id (uuid); run as one transaction so a blocked delete undoes the whole attempt:
BEGIN;
DELETE FROM tac_publication_member WHERE publication_id = :pub_id;
-- Then delete versions + LUs created in this batch (tracked via session list):
DELETE FROM tac_unit_version  WHERE id = ANY(:uv_ids);
DELETE FROM tac_logical_unit  WHERE id = ANY(:lu_ids);
DELETE FROM tac_publication   WHERE id = :pub_id;
COMMIT;

Caveat: trg_tac_pm_enacted_lock and trg_tac_enacted_immut will block deletion once lifecycle_status='enacted'. P10A-2 must keep candidate rows in proposed/draft until verification completes; otherwise the rollback path is blocked.
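The child-first delete order of the recipe can be exercised against a throwaway database. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL, with a reduced schema that keeps only the keys and foreign keys; the row ids are made up, the function name rollback_batch is hypothetical, and none of the tac_* triggers are modelled.

```python
import sqlite3

# Reduced stand-in schema: primary keys and foreign keys only (assumption).
SCHEMA = """
CREATE TABLE tac_publication (id TEXT PRIMARY KEY);
CREATE TABLE tac_logical_unit (
    id TEXT PRIMARY KEY,
    parent_id TEXT REFERENCES tac_logical_unit(id));
CREATE TABLE tac_unit_version (
    id TEXT PRIMARY KEY,
    logical_unit_id TEXT NOT NULL REFERENCES tac_logical_unit(id));
CREATE TABLE tac_publication_member (
    publication_id TEXT NOT NULL REFERENCES tac_publication(id),
    logical_unit_id TEXT NOT NULL REFERENCES tac_logical_unit(id),
    unit_version_id TEXT NOT NULL REFERENCES tac_unit_version(id));
"""


def rollback_batch(conn, pub_id, uv_ids, lu_ids):
    """Delete one batch child-first; any failure rolls the whole attempt back."""
    def marks(ids):
        return ",".join("?" * len(ids))

    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM tac_publication_member WHERE publication_id = ?", (pub_id,))
        conn.execute(
            f"DELETE FROM tac_unit_version WHERE id IN ({marks(uv_ids)})", uv_ids)
        conn.execute(
            f"DELETE FROM tac_logical_unit WHERE id IN ({marks(lu_ids)})", lu_ids)
        conn.execute("DELETE FROM tac_publication WHERE id = ?", (pub_id,))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(SCHEMA)
with conn:
    conn.execute("INSERT INTO tac_publication VALUES ('pub-1')")
    conn.execute("INSERT INTO tac_logical_unit VALUES ('lu-root', NULL)")
    conn.execute("INSERT INTO tac_logical_unit VALUES ('lu-s1', 'lu-root')")
    conn.execute("INSERT INTO tac_unit_version VALUES ('uv-root', 'lu-root')")
    conn.execute("INSERT INTO tac_unit_version VALUES ('uv-s1', 'lu-s1')")
    conn.execute("INSERT INTO tac_publication_member VALUES ('pub-1', 'lu-root', 'uv-root')")
    conn.execute("INSERT INTO tac_publication_member VALUES ('pub-1', 'lu-s1', 'uv-s1')")

rollback_batch(conn, "pub-1", ["uv-root", "uv-s1"], ["lu-root", "lu-s1"])
```

Tracking uv_ids and lu_ids at insert time matters because tac_unit_version and tac_logical_unit carry no publication_id column; once the member rows are gone, nothing else ties them to the batch.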


5. Segmentation Candidate (JSON, 17 units)

C1A rules SR-1..SR-7 were applied following the heading hierarchy of the source. All section_type values are present in tac_section_type_vocab (verified in §2.8). body_excerpt is capped at 100 chars; body_sha256 is sha256(body_bytes), where body is the section content excluding the heading line.
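The per-unit fields described above can be derived as follows. This is a simplified sketch, not the /tmp/seg.py script: it assumes ATX-style markdown headings, splits at a single heading level, drops any text before the first matching heading, and does not implement the SR-1..SR-7 rules.

```python
import hashlib
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")  # assumption: ATX markdown headings


def segment(text: str, level: int = 2) -> list:
    """Split text at headings of one level; each body excludes its heading line."""
    sections = []  # list of (title, [body lines])
    for line in text.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) == level:
            sections.append((match.group(2).strip(), []))
        elif sections:
            # Deeper headings and ordinary lines stay inside the current body.
            sections[-1][1].append(line)

    units = []
    for order, (title, lines) in enumerate(sections, start=1):
        body = "\n".join(lines).strip("\n")
        raw = body.encode("utf-8")
        units.append({
            "title": title,
            "sort_order": order,
            "body_bytes": len(raw),                      # UTF-8 byte length
            "body_sha256": hashlib.sha256(raw).hexdigest(),
            "body_excerpt": body[:100],                  # capped at 100 chars
        })
    return units


doc = "# Title\nintro\n## §1. A\nalpha\n### sub\nbeta\n## §2. B\ngamma\n"
for unit in segment(doc):
    print(unit["sort_order"], unit["title"], unit["body_bytes"])
```

Hashing the encoded bytes rather than the decoded string keeps body_bytes and body_sha256 consistent with each other for non-ASCII text.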

Summary table:

| sort | unit_key | parent_key | section_type | title (trunc) | body_bytes |
|---|---|---|---|---|---|
| -1 | dieu35.root | | article | ĐIỀU 35: LUẬT QUẢN TRỊ DOT — v5.2 FINAL (synthetic root) | 39938 |
| 0 | dieu35.preamble | | heading | preamble (title block + meta) | 1397 |
| 1 | dieu35.s1 | dieu35.root | process | §1. MỤC TIÊU | 1240 |
| 2 | dieu35.s2 | dieu35.root | process | §2. PHẠM VI | 1776 |
| 3 | dieu35.s3 | dieu35.root | process | §3. DOT 2 CẤP | 1215 |
| 4 | dieu35.s4 | dieu35.root | process | §4. SCHEMA | 11147 |
| 5 | dieu35.s5 | dieu35.root | process | §5. QUY TRÌNH TẠO DOT MỚI — 8 BƯỚC | 3094 |
| 6 | dieu35.s6 | dieu35.root | process | §6. VÒNG ĐỜI DOT (v5.2 mở rộng) | 3912 |
| 7 | dieu35.s7 | dieu35.root | process | §7. ĐO LƯỜNG | 190 |
| 8 | dieu35.s8 | dieu35.root | process | §8. DOT TỰ QUẢN TRỊ — 4 CẶP | 4500 |
| 9 | dieu35.s9 | dieu35.root | process | §9. BOOTSTRAP — 4 BLOCKS | 3628 |
| 10 | dieu35.s10 | dieu35.root | process | §10. SUCCESS METRICS | 833 |
| 11 | dieu35.s11 | dieu35.root | process | §11. RETROFIT CLAUSE | 2177 |
| 12 | dieu35.s12 | dieu35.root | process | §12. (Đã bỏ) | 180 |
| 13 | dieu35.appendix_a | dieu35.root | appendix | PHỤ LỤC A — 24 Domain Seed | 1172 |
| 14 | dieu35.changelog | dieu35.root | changelog | CHANGELOG | 1964 |
| 15 | dieu35.post_merge_todo | dieu35.root | checklist | GHI CHÚ BAN HÀNH (post-merge TODO) | 896 |

5.1 Notes on segmentation choices

  • Root unit is synthetic (sort_order=-1) so children can share a stable parent_key. Whether the canonical row is created at P10A-2 depends on §6 render-plan decision (see §6.2).
  • §4 SCHEMA (11.1 KB) is intentionally kept as a single unit at this candidate stage; it exceeds soft_limit_words for process. Two paths for P10A-2:
    • (a) split into dieu35.s4.1, dieu35.s4.1.1, dieu35.s4.2, dieu35.s4.3, dieu35.s4.4 (recommended); section_type technical_spec for SQL-heavy children.
    • (b) keep as one and set length_flag='hard_limit' with length_exception_reason. Only if (a) blocked.
  • §6 (3.9 KB) is borderline; preferred split: 6.1 (heading), 6.2 (governance_process), 6.3–6.5 (process/checklist), 6.6 (principle), 6.7 (paragraph).
  • §7 is intentionally tiny (most content delegated to v5.0 baseline); kept as process.
  • §12 (Đã bỏ) retained as process for traceability; render plan may flag as tombstone.
  • PHỤ LỤC A is a domain reference table → vocab match appendix chosen over reference_mapping because it's a declarative seed list.
  • CHANGELOG maps directly to changelog vocab.
  • GHI CHÚ BAN HÀNH maps to checklist (vocab) — items are bracketed [ ]/ actions.

5.2 Full JSON (canonical)

The complete candidate JSON (with body_excerpt, body_sha256, body_bytes, sort_order, source_heading per unit) is 9,057 bytes. Its envelope is shown here with the units array elided; all 17 units appear in the per-unit hash table further down:

{
  "doc_code": "DIEU-35",
  "version": "v5.2",
  "revision": 13,
  "source_path": "knowledge/dev/laws/dieu35-dot-governance-law.md",
  "source_sha256": "4353ec6d453411a7c8e207658bbc4457d00f99747cba90551c8a4926894d2e5c",
  "source_bytes": 39938,
  "unit_count": 17,
  "units": [ /* 17 unit objects, each with unit_key, parent_key, section_type, title,
                body_excerpt (≤100 chars), body_sha256 (hex), body_bytes, sort_order, source_heading */ ]
}

Full per-unit objects, including each body_sha256 digest, are in the temp file /tmp/d35-segments.json on the VPS. Per-unit hash table:

| unit_key | body_sha256 |
|---|---|
| dieu35.root | 4353ec6d453411a7c8e207658bbc4457d00f99747cba90551c8a4926894d2e5c |
| dieu35.preamble | da1b6bce745bd6d7778d01f6927d88b600f94ac7a9413b90190779ff6a11d992 |
| dieu35.s1 | 9385c7cd2bb7f592aa955aaa4795adaa084f29771e66b94271e5acf801961d39 |
| dieu35.s2 | 8367833d17c356798c6b76b3b06d3d1a824f5d2b205fdf011414f359a48486cd |
| dieu35.s3 | 494dcc03cf95b306fb988cc1e0ad523e83c55d7c431337bd37e8fb7d5bbb5a1b |
| dieu35.s4 | 6dfaa690f14a6e8cabfe43bf8ac50ac6c396b2052102ecc24eb21f890b0e1c6e |
| dieu35.s5 | adebffafbbfd1247aa44897a765129016f9b9caf01c0e0f5179c5df08832ae08 |
| dieu35.s6 | 9b6e917a45438f1bdb69015fd4b6b6138c455b3b556416f0c83142615a53e806 |
| dieu35.s7 | c04c760d96e6fe4a21ccf5656b713b9e199c7b23e2888f3d6bdee329347afcdb |
| dieu35.s8 | e345838bf2aa0d8e81a06dcc9780cc45d387c58772bb343b81d1f365d395c426 |
| dieu35.s9 | 4c812b46ce65f031d33b0cbd4367575fd6c486462cb1f675d8347fd42c5071d9 |
| dieu35.s10 | b119ff9abd50b4e0fa96c9d56eb29290c83b7e2cb971fc22f1b1752c7b505348 |
| dieu35.s11 | f5c543a7812cc64aa449a8b5e08e3fcc0583c574714a08b90236fdc61e4bc332 |
| dieu35.s12 | d19f2141a04b6f6eb9eed6e0bb32f166ec6cf47a39dc88224c977cd85be10b81 |
| dieu35.appendix_a | cf7e9e5f0d7f007d51d04ee74db0728cca83ceac247b98d08fc4a208c82e7401 |
| dieu35.changelog | 5733bc20276b1658a1a5ce03e00800e294c9f7279f85edd5d71305d66c0cc4e0 |
| dieu35.post_merge_todo | f99f4cd89b6fbefd15bea8fe23b1f6685b1d46069311879a174808bc644fa875 |

Vocab sanity: every section_type used (article, heading, process, appendix, changelog, checklist) is present in tac_section_type_vocab with lifecycle_status='active'. ✅
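The vocab sanity check amounts to a set difference. A minimal sketch, with the 17 vocab codes copied from §2.8 and the six used types copied from the line above; the function name unknown_section_types is illustrative.

```python
# The 17 codes listed for tac_section_type_vocab in §2.8.
SECTION_TYPE_VOCAB = {
    "heading", "article", "paragraph", "definition", "principle", "rationale",
    "process", "technical_spec", "governance_process", "checklist",
    "instruction_block", "reference_mapping", "matrix", "invariant_list",
    "open_decision_list", "appendix", "changelog",
}

# section_type values used by the 17 candidate units.
USED_SECTION_TYPES = {"article", "heading", "process", "appendix", "changelog", "checklist"}


def unknown_section_types(used, vocab):
    """Return the used section types that are missing from the vocab, sorted."""
    return sorted(set(used) - set(vocab))


print(unknown_section_types(USED_SECTION_TYPES, SECTION_TYPE_VOCAB))  # → []
```

An empty list means every used value would satisfy the FK from tac_logical_unit.section_type to the vocab table.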


6. Render Plan Draft (for P10A-2 review — NOT executed)

6.1 Order of operations

  1. Insert tac_publication row — doc_code='DIEU-35', version='v5.2', publication_type='law', name='ĐIỀU 35: LUẬT QUẢN TRỊ DOT — v5.2 FINAL', risk_tier='highest', lifecycle_status='proposed', publication_profile contains source_sha256 + revision + captured_at. Capture pub_id (= batch marker).
  2. Insert tac_logical_unit rows — one per unit_key, parent_id resolved via unit_key → id map after insert (two-pass or use deterministic uuid). canonical_address pattern: dieu-35/{unit_key}.
  3. Insert tac_unit_version rows — one per LU, version_number=1, body=full segment body, content_hash=body_sha256, lifecycle_status='draft', review_state='unreviewed'. fn_tac_uv_compute_derived may overwrite content_hash/length_flag.
  4. Insert tac_publication_member rows — one per (pub_id, lu_id, uv_id), render_order = candidate sort_order+offset to keep ≥0.
  5. Verify row counts = 1 pub + 17 LUs + 17 UVs + 17 PMs (or 16+root variant TBD).
  6. STOP — no enact: keep publication.lifecycle_status='proposed' and uv.lifecycle_status='draft'; no transition to enacted in this batch.
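Steps 2 and 4 leave two details open: how parent_id is resolved and which offset keeps render_order non-negative. One possible approach, sketched under assumptions, is to derive ids deterministically with uuid5 so the unit_key → id map exists before any insert, and to shift sort_order by the negated minimum. The namespace value and the function name plan_rows are hypothetical; the dieu-35/{unit_key} address pattern is the one given in step 2.

```python
import uuid

# Hypothetical namespace: any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "dieu-35")


def plan_rows(units):
    """units: dicts with unit_key, parent_key (or None) and sort_order."""
    ids = {
        u["unit_key"]: uuid.uuid5(NAMESPACE, f"dieu-35/{u['unit_key']}")
        for u in units
    }
    # Shift so the smallest sort_order maps to 0 (render_order has CHECK >= 0).
    offset = max(0, -min(u["sort_order"] for u in units))
    rows = []
    for u in sorted(units, key=lambda item: item["sort_order"]):
        rows.append({
            "id": ids[u["unit_key"]],
            "canonical_address": f"dieu-35/{u['unit_key']}",
            "parent_id": ids[u["parent_key"]] if u["parent_key"] else None,
            "render_order": u["sort_order"] + offset,
        })
    return rows


candidate = [
    {"unit_key": "dieu35.s1", "parent_key": "dieu35.root", "sort_order": 1},
    {"unit_key": "dieu35.root", "parent_key": None, "sort_order": -1},
]
for row in plan_rows(candidate):
    print(row["render_order"], row["canonical_address"], row["parent_id"])
```

With deterministic ids a single pass is enough, and re-running the plan yields the same ids; the two-pass alternative would instead read generated ids back after the insert.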

6.2 Open decisions (route to GPT for P10A-2 sign-off)

  • D1: Synthetic root — keep dieu35.root (article) or drop it? (Most laws have an implicit doc-level row; without it, all 16 §-units lose a parent.) Recommend KEEP, mapped to section_type='article' so that parent_id chain ends at the article.
  • D2: §4 splitting — split now in candidate (option (a) above) before P10A-2, or accept oversized segment and split in v5.3? Recommend split before P10A-2 to satisfy soft_limit and to enable per-subsection vector chunking later.
  • D3: Heading-only vs body-bearing units. tac_section_type_vocab.heading has body_required=false. Use heading for §s where the only content is sub-sections (none in current pass). For now no unit uses heading except dieu35.preamble whose body is the title-meta block.
  • D4: Description backfill. description_required=true for all process rows. P10A-2 must auto-generate or APR-backfill description for each UV (per Đ43 description governance). Source candidate description = first paragraph of body, capped to vocab soft_limit_words for descriptions.
  • D5: §12 tombstone — keep as process row with body=2-line note, or convert to paragraph/changelog annotation?
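For D4, the "first paragraph of body, capped" rule could look like the sketch below. It assumes paragraphs are separated by blank lines and takes the cap as a parameter, because the word limit for descriptions is not stated in this report; the function name candidate_description is illustrative.

```python
def candidate_description(body: str, max_words: int) -> str:
    """Return the first non-empty paragraph of body, truncated to max_words words."""
    for paragraph in body.split("\n\n"):
        words = paragraph.split()
        if words:
            return " ".join(words[:max_words])
    return ""  # empty body: no candidate, so the description must be backfilled


print(candidate_description("\n\nfirst para here\nmore\n\nsecond", 3))  # → first para here
```

An empty result is worth surfacing before insert, since a blank description on a row with description_required=true is expected to be rejected.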

6.3 Trigger pre-flight checklist

  • body and description non-empty for all section_types where body_required=true and description_required=true.
  • length_flag left at default; fn_tac_uv_compute_derived will compute. If body exceeds hard_limit_words, expect EXCEPTION → split first (D2).
  • content_hash either pre-set to body_sha256 or left to derived trigger (verify which).
  • No row inserted with lifecycle_status='enacted' (otherwise immut + lock triggers fire).
  • Verify tac_birth_gate_config for fn_tac_birth_gate_lu / fn_tac_birth_gate_uv — current default mode='block'. If block-mode requires fields we haven't computed (e.g. tier), pre-flight will fail. → Inspect tac_birth_gate_config rows in P10A-2 step 0.
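Most of this checklist can be run client-side before any INSERT. The sketch below is only an approximation: the defaults 500 and 1500 come from tac_section_type_vocab in §2.7, but counting words by whitespace and using strict greater-than comparisons are assumptions, since the bodies of fn_tac_birth_gate_uv and fn_tac_uv_compute_derived were not inspected here. SectionRule, length_flag and preflight are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class SectionRule:
    """Per-section_type rule; defaults mirror the vocab column defaults in §2.7."""
    body_required: bool = True
    description_required: bool = True
    soft_limit_words: int = 500
    hard_limit_words: int = 1500


def length_flag(body: str, rule: SectionRule) -> str:
    """Assumed rule: whitespace word count compared with the two limits."""
    words = len(body.split())
    if words > rule.hard_limit_words:
        return "hard_limit"
    if words > rule.soft_limit_words:
        return "soft_limit"
    return "normal"


def preflight(unit: dict, rule: SectionRule) -> list:
    """Return a list of problems; an empty list means the unit looks insertable."""
    problems = []
    body = (unit.get("body") or "").strip()
    description = (unit.get("description") or "").strip()
    if rule.body_required and not body:
        problems.append("body is required")
    if rule.description_required and not description:
        problems.append("description is required")
    if unit.get("lifecycle_status") == "enacted":
        problems.append("rows must not be inserted as enacted")
    if length_flag(body, rule) == "hard_limit" and not unit.get("length_exception_reason"):
        problems.append("body exceeds hard_limit_words: split or set length_exception_reason")
    return problems


rule = SectionRule()
print(preflight({"body": "text", "description": "summary", "lifecycle_status": "draft"}, rule))
print(preflight({"body": "", "description": ""}, rule))
```

A client-side pass like this cannot replace the triggers; it only reduces the chance that the first real INSERT fails partway through the batch.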

6.4 Rollback plan

Use the recipe in §4. Keep all rows in mutable lifecycle state until P10A-2 verify completes; only flip to enacted after a separate APR.


7. Đ41 / VPS Code Hygiene

  • No repo files were created or modified by P10A-1.
  • Temp artifacts (read-only output, kept for audit, NOT committed):
    • /tmp/seg.py on VPS — segmentation script.
    • /tmp/d35-segments.json on VPS — full candidate JSON (9057 bytes).
    • /tmp/segments.json on local — mirror of the above.
  • No git add / git commit performed. Per dispatch §9 fallback: only temp files; canonical output uploaded to KB → no git commit required.

8. STOP

P10A-1 ends at this upload. No further mutation, no P10A-2 actions. Hand off to GPT for sign-off on the open decisions in §6.2 before P10A-2 dispatch.


P10A-1 report | S187 | 2026-04-29 | Read-only | 6/6 PASS