KB-1A59

23-P3D4C0 — New Piece + Document Batch Notification Design

Revision 1

Date: 2026-05-08
Prompt: knowledge/dev/laws/dieu44-trien-khai/prompts/23-p3d4c0-new-piece-document-batch-notification-design-prompt.md (rev2)
Scope: Design / read-only inventory only. NO implementation. NO mutation.
Author: Claude Opus 4.7 (1M)


Executive Summary

Birth notifications (new IU / new piece) are NOT emitted by the current P3D2 runtime — fn_iu_notif_version skips version_seq <= 1, and there is no birth notification trigger. This birth_notification_gap is CONFIRMED.

The schema currently has no stable source_document or import_batch identifier. parent_or_container_ref exists on information_unit but is unused (0/12 rows populated). canonical_address is the only unique text key, but it is convention-based rather than a foreign key. Document-rollup grouping therefore cannot be done reliably today without a minimal schema extension.

The pg_cron extension is NOT installed on the Postgres container. Any PG-native debounce architecture must therefore include a pg_cron install as a prerequisite. The alternative, a manual systemd timer, is rejected: it would be an external scheduler and would violate the PG-native boundary.

P3D2 trigger hot-path review = PASS. All three notification trigger functions are O(1) single INSERT (with at most one PK-indexed sub-SELECT for canonical_address resolution). No COUNT, JOIN, GROUP BY, or aggregation. No heavy work found.

Recommendation: DESIGN_STAGING_OUTBOX_FIRST — build a 3-layer staging→worker→durable architecture. Stable grouping requires schema addition (source_document_ref/import_batch_ref) — design only in P3D4C0; implementation belongs to P3D4C1.

Directus exposure status: PAUSE_UNTIL_BIRTH_BATCH_DESIGN — board claims of completeness must be deferred until birth/batch is implemented or the gap is explicitly disclosed in the surfaced view.


A. PG Inventory Evidence — 8 Design Questions

Q1. Birth identification

Authoritative tables/columns:

  • information_unit.id — IU identity (uuid, gen_random_uuid())
  • information_unit.created_at, created_by — birth timestamp + actor
  • unit_version.version_seq — unique constraint (unit_id, version_seq); version_seq=1 is the birth row
  • Birth triggers on information_unit: trg_birth_information_unit, trg_iu_birth_gate_layer1 (BEFORE INSERT), trg_iu_birth_gate_layer2 (AFTER INSERT/UPDATE DEFERRABLE)
  • 12 IUs in DB, all have unit_version.version_seq=1 row (one per IU) → birth always materializes UV seq=1

unit_version does have created_by (Variant A confirmed by P3D2 report).
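The 12/12 birth-row evidence can be re-checked with a read-only query. This is a minimal sketch that uses only columns named in this section (information_unit.id, unit_version.unit_id, unit_version.version_seq); the output aliases are illustrative.

```sql
-- Read-only check: how many IUs have a birth row (version_seq = 1)?
-- The unique constraint (unit_id, version_seq) guarantees at most one match per IU,
-- so the LEFT JOIN cannot duplicate information_unit rows.
SELECT count(*)                                        AS iu_total,
       count(*) FILTER (WHERE uv.unit_id IS NOT NULL)  AS iu_with_birth_row
FROM information_unit iu
LEFT JOIN unit_version uv
       ON uv.unit_id = iu.id
      AND uv.version_seq = 1;
```

If the inventory still holds, both counts are equal. A lower iu_with_birth_row would mean some IUs never materialized a seq=1 row, which would weaken the case for a UV-based birth hook.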

Q2. Source document / import batch identifier

information_unit columns (full list):

id, canonical_address, unit_kind, lifecycle_status, content_anchor_ref,
version_anchor_ref, owner_ref, parent_or_container_ref, conformance_status,
identity_profile (jsonb), created_at, updated_at, created_by, updated_by,
deleted_at, sort_order
  • No source_document_id, import_batch_id, import_job_id, parent_document_id.
  • parent_or_container_ref exists but is NULL for all 12 IUs in DB.
  • identity_profile JSONB samples contain only {title, owner_lookup_ref, primary_section_type_ref} — no batch/source key.
  • canonical_address is unique text (e.g. pilot.p3.p1.20260506-070623.cbb29036) — convention-based, not a stable foreign key.

Verdict:

  • source_document_identifier=NOT_FOUND
  • import_batch_identifier=NOT_FOUND
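The NOT_FOUND verdicts rest on two observations that can be reproduced read-only. The queries below are a sketch using only the columns listed above; they assume identity_profile holds JSON objects (or NULL) in every row.

```sql
-- How many IUs carry a container reference at all?
SELECT count(*)                       AS iu_total,
       count(parent_or_container_ref) AS with_container_ref
FROM information_unit;

-- Which top-level keys appear in identity_profile, and on how many IUs?
-- A batch/source key would show up here if any import path had written one.
SELECT k        AS profile_key,
       count(*) AS iu_count
FROM information_unit iu
CROSS JOIN LATERAL jsonb_object_keys(iu.identity_profile) AS k
GROUP BY k
ORDER BY k;
```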

Recommendation: Add minimal nullable columns to information_unit:

  • source_document_ref text (nullable; opaque token from import job, e.g., URI/hash/job key)
  • import_batch_ref text (nullable; one batch may produce many docs → pieces)

Both populated by the creation gateway (fn_iu_save / future ingestion path) when known. Absent → piece-level event with gap disclosed.
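The proposed schema addition could look like the following. This is a design-only sketch for P3D4C1, not DDL applied in this pack; the column names are the ones proposed above.

```sql
-- DESIGN ONLY, do not apply in P3D4C0.
-- Adding nullable columns without a default is a catalog-only change in PostgreSQL,
-- so existing rows are not rewritten.
ALTER TABLE information_unit
  ADD COLUMN source_document_ref text NULL,  -- opaque token from the import job (URI / hash / job key)
  ADD COLUMN import_batch_ref    text NULL;  -- one batch may produce many documents, hence many pieces
```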

Q3. Three scenarios — discrimination logic

| Scenario | Discriminator | Detected by worker |
|---|---|---|
| 1 piece added to existing document | source_document_ref exists & matches an existing IU set; OR no source_ref but unit_kind indicates piece-of-document | piece-level new_piece_created |
| Many pieces from one new document, simultaneously | shared source_document_ref and/or import_batch_ref within debounce window; count ≥ threshold | document-level document_imported rollup |
| Many unrelated new pieces | distinct source_document_ref (or all NULL) | one piece-level event each |

Without a stable key, scenarios 2 and 3 are indistinguishable. That is the current state.

Q4. Event taxonomy proposal

Existing (P3D2): comment_added, draft_created, version_applied.

Add (P3D4C1+):

  • new_piece_created — single new IU not part of a multi-piece document (or piece added later to existing doc)
  • document_imported — rollup event when N≥threshold pieces share source_document_ref within debounce window

Defer / out-of-scope here (later packs may add): document_sliced_created, explicit birth_announced. P3D4C0 keeps surface minimal.

recommended_event_taxonomy=new_piece_created,document_imported,comment_added,draft_created,version_applied

Q5. Debounce mechanism (PG-native)

Evaluation:

  • pg_cron + staging table — PREFERRED. Append-only staging table on hot path; pg_cron job runs every 90–120s, reads pending, groups by stable key, inserts durable events, marks pending processed.
    • Prerequisite: pg_cron extension is not currently installed. P3D4C1 must add CREATE EXTENSION pg_cron (superuser) and configure cron.database_name=directus.
  • LISTEN/NOTIFY — optional supplement only (signal worker to wake up sooner). Not the main mechanism in Phase 1 — would require a long-running listener process (new tool / service), violating "no new tool".
  • Advisory locks — used inside the worker to ensure only one cron tick runs at a time (pg_try_advisory_lock). Not a debounce mechanism on its own.
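The pg_cron prerequisite and the single-flight guard can be sketched as follows. This is design-only and rests on assumptions: the job name and the advisory-lock key are hypothetical, and the named-job form of cron.schedule requires a pg_cron version that supports job names. fn_iu_notification_worker_tick is the worker function named in the next-pack scope.

```sql
-- postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_cron'
--   cron.database_name       = 'directus'

-- Run once as superuser in the database named by cron.database_name.
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Schedule the worker every 2 minutes. The job name is hypothetical.
SELECT cron.schedule(
  'iu-notification-worker',
  '*/2 * * * *',
  $$SELECT fn_iu_notification_worker_tick()$$
);

-- Inside the worker: take a transaction-level advisory lock and exit early
-- when another tick already holds it. The key 4423 is an arbitrary constant.
SELECT pg_try_advisory_xact_lock(4423) AS got_lock;
```

A transaction-level lock is released automatically at commit or rollback, so a crashed tick cannot leave the lock held.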

debounce_mechanism_recommended=PG_CRON_STAGING

Debounce window stored in a config table (e.g., extend existing dot_config with key notification.debounce_seconds), default 90; bounds [60, 300]. debounce_window_configurable=true.
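Reading the window with its bounds applied might look like this. The layout of dot_config is not shown in this document, so the sketch uses an inline stand-in with hypothetical key/value text columns; only the key name, default, and bounds come from the text above.

```sql
-- Stand-in for dot_config; the (key, value) column layout is an assumption.
WITH dot_config_sample(key, value) AS (
  VALUES ('notification.debounce_seconds',      '45'),
         ('notification.batch_piece_threshold', '2')
)
SELECT COALESCE(
         -- Clamp the configured value into [60, 300].
         (SELECT LEAST(GREATEST(value::integer, 60), 300)
            FROM dot_config_sample
           WHERE key = 'notification.debounce_seconds'),
         -- Fall back to the default when the key is absent.
         90
       ) AS debounce_seconds;
```

With the sample value 45 the query returns 60, because the configured value is below the lower bound. Clamping at read time keeps a bad config edit from turning the worker into a hot loop or an hours-long delay.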

Q6. Anti-spam / grouping rule

  • N pieces sharing stable key within window AND N ≥ notification.batch_piece_threshold (default 2) → 1 document_imported event with payload {piece_count, sample_unit_ids[5]}.
  • A single piece with a stable key (N=1), or any piece with no stable key → 1 new_piece_created event per piece.
  • Late-arriving piece for an already-rolled-up document → 1 piece-level new_piece_created with payload referencing the prior rollup (best effort).
  • Creator implicit self-read applies at every level — already enforced by fn_iu_notification_board self-exclusion logic (T12 PASS in P3D2).
  • batch_piece_threshold_configurable=true (config key notification.batch_piece_threshold, default 2, bounds [2, 50]).

recommended_grouping_strategy=HYBRID (piece-level by default; document rollup when stable key + count threshold met).
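The HYBRID rule can be illustrated on inline sample rows, without touching any table. This is a sketch under stated assumptions: the sample unit ids and keys are made up, the threshold is fixed at the default of 2, and the debounce window is ignored (all rows are treated as falling inside one window).

```sql
WITH pending(unit_id, source_document_ref, import_batch_ref) AS (
  VALUES ('u1', 'doc-A', NULL),
         ('u2', 'doc-A', NULL),
         ('u3', NULL,    NULL),
         ('u4', NULL,    'batch-7')
),
keyed AS (
  -- Stable group key: document ref first, batch ref as fallback.
  SELECT unit_id,
         COALESCE(source_document_ref, import_batch_ref) AS group_key
  FROM pending
),
sized AS (
  -- Only rows with a key can form a group; NULL keys never group.
  SELECT group_key,
         count(*)                            AS piece_count,
         array_agg(unit_id ORDER BY unit_id) AS unit_ids
  FROM keyed
  WHERE group_key IS NOT NULL
  GROUP BY group_key
)
-- Rollup: one event per group that meets the threshold, with up to 5 sample ids.
SELECT 'document_imported' AS event_type,
       group_key,
       piece_count,
       unit_ids[1:5]       AS sample_unit_ids
FROM sized
WHERE piece_count >= 2
UNION ALL
-- Piece-level: every row whose group is missing or below the threshold.
SELECT 'new_piece_created', k.group_key, 1, ARRAY[k.unit_id]
FROM keyed k
LEFT JOIN sized s ON s.group_key = k.group_key
WHERE s.piece_count IS NULL OR s.piece_count < 2
ORDER BY event_type, group_key;
```

On this sample, u1 and u2 collapse into one document_imported row for doc-A, while u3 (no key) and u4 (a batch of one) each produce their own new_piece_created row.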

Q7. New piece hook — least invasive

Options analyzed:

| Option | Hot-path cost | Coverage | Bypass risk |
|---|---|---|---|
| AFTER INSERT trigger on information_unit | O(1) | All IU births | Birth-gate fires before; safe |
| AFTER INSERT trigger on unit_version WHERE version_seq=1 | O(1) (already wired) | All IU births (seq=1 always exists per evidence) | None. UV is the canonical "this exists" marker |
| Hook into fn_iu_save | O(1) but inside function body | Only fn_iu_save callers; misses any future direct creation gateway | Yes: a direct INSERT bypasses it |
| Hook in creation gateway (fn_iu_create*) | O(1) | Only gateway path; misses re-introduction or migration imports | Yes |

Recommended: UV_SEQ1_TRIGGER — extend the existing trg_aa_iu_notif_version family with a sibling trigger function (or branch) that, on version_seq=1, appends to the new staging table instead of skipping. Reasons:

  1. Already on the right relation (UV is where version_applied already routes).
  2. Birth row (version_seq=1) is universal evidence (12/12 IUs have it).
  3. Cannot be bypassed by direct information_unit INSERT — the IU is unusable without a UV.
  4. Matches Variant A (unit_version.created_by present, used as actor).

recommended_hook=UV_SEQ1_TRIGGER

Q8. P3D2 trigger hot-path review

Source captured from PG (live):

-- fn_iu_notif_draft (UV draft):
IF NEW.draft_status != 'open' THEN RETURN NEW; END IF;
INSERT INTO iu_notification_event(...)
VALUES('draft_created','review',NEW.unit_id,NEW.canonical_address,
       NEW.id,NEW.created_by,'trg_aa_iu_notif_draft')
ON CONFLICT DO NOTHING;

-- fn_iu_notif_version:
IF NEW.version_seq <= 1 THEN RETURN NEW; END IF;
INSERT INTO iu_notification_event(...)
VALUES('version_applied','update',NEW.unit_id,
       (SELECT canonical_address FROM information_unit WHERE id=NEW.unit_id),
       NEW.id,NEW.created_by,'trg_aa_iu_notif_version')
ON CONFLICT DO NOTHING;

-- fn_iu_notif_comment:
IF NEW.comment_kind='system' OR NEW.author_type='system' THEN RETURN NEW; END IF;
INSERT INTO iu_notification_event(...)
VALUES('comment_added','comment',NEW.unit_id,
       (SELECT canonical_address FROM information_unit WHERE id=NEW.unit_id),
       NEW.id,NEW.author_ref,'trg_aa_iu_notif_comment')
ON CONFLICT DO NOTHING;

Per-trigger analysis:

  • 1 INSERT into a single, well-indexed table (iu_notification_event).
  • fn_iu_notif_version and fn_iu_notif_comment issue 1 single-row, PK-indexed sub-SELECT to fetch canonical_address. This is O(1) (B-tree primary key lookup, ~µs).
  • No COUNT / SUM / GROUP BY / aggregation. No multi-row JOIN. No JSON build.
  • ON CONFLICT DO NOTHING against partial unique index uq_notif_event_type_ref.

p3d2_trigger_hot_path_review=PASS
p3d2_trigger_heavy_work_found=false
hot_path_joins=0 (single-row PK sub-select, not a join)
hot_path_aggregations=0
hot_path_expected_complexity=O(1)

Optional minor optimisation (not in scope): pass canonical_address via a dedicated parameter to drop the sub-SELECT — defer to future pack only if perf telemetry indicates need.


B. Hot Path Architecture (3 Layers)

[Layer 1 — Staging / Hot path — O(1) only]
unit_version INSERT (seq=1)        →  trg → INSERT iu_notification_pending(...)
unit_version INSERT (seq>=2)       →  trg → existing event INSERT (no change)
unit_edit_draft INSERT             →  trg → existing event INSERT (no change)
unit_edit_comment INSERT           →  trg → existing event INSERT (no change)

[Layer 2 — Worker — pg_cron, every 90-120s, advisory-locked]
SELECT pending rows (FOR UPDATE SKIP LOCKED)
GROUP BY COALESCE(source_document_ref, import_batch_ref)
For each group:
  IF group has stable key AND count >= threshold:
    INSERT 1 document_imported event
  ELSE:
    INSERT N new_piece_created events
DELETE / mark processed pending rows

[Layer 3 — Durable / Read]
iu_notification_event (existing, P3D2)
iu_notification_read  (existing, P3D2)
v_iu_notification_board (Directus exposure — DEFERRED)

| Layer | Table | Hot path? | Computation here? |
|---|---|---|---|
| Staging | iu_notification_pending (NEW, design only) | YES | NONE (append-only) |
| Durable | iu_notification_event (existing) | NO | Worker writes |
| Read | view v_iu_notification_board (future) | NO | Read-time only |
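The Layer 2 claim step ("SELECT pending rows FOR UPDATE SKIP LOCKED", then mark processed) could be written as a single statement. This is a sketch that assumes the proposed iu_notification_pending table exists with the columns described in this section. The durable INSERT is left out on purpose, because the column list of iu_notification_event is not shown in this document.

```sql
-- DESIGN ONLY. Assumes the proposed iu_notification_pending staging table.
BEGIN;

WITH claimed AS (
  -- Lock unprocessed rows; rows locked by a concurrent session are skipped.
  SELECT id
  FROM iu_notification_pending
  WHERE processed_at IS NULL
  ORDER BY created_at
  FOR UPDATE SKIP LOCKED
)
UPDATE iu_notification_pending p
   SET processed_at = now()
  FROM claimed c
 WHERE p.id = c.id
RETURNING p.unit_id,
          p.ref_id,
          p.actor_ref,
          COALESCE(p.source_document_ref, p.import_batch_ref) AS group_key;

-- The returned rows feed the grouping step and the durable INSERTs inside this
-- same transaction. A failure before COMMIT also rolls back processed_at,
-- so the rows stay pending for the next tick.
COMMIT;
```

Inside a PL/pgSQL worker function the explicit BEGIN/COMMIT would be dropped, since the function body already runs in the caller's transaction.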

Staging table proposed shape (design only, no DDL applied):

CREATE TABLE iu_notification_pending (
  id                  uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  event_kind          text NOT NULL,        -- 'piece_birth' (only kind in Phase 1)
  unit_id             uuid NOT NULL,
  canonical_address   text NOT NULL,
  source_document_ref text NULL,
  import_batch_ref    text NULL,
  actor_ref           text NOT NULL,
  ref_id              uuid NOT NULL,        -- the unit_version id (seq=1)
  created_at          timestamptz NOT NULL DEFAULT now(),
  processed_at        timestamptz NULL      -- NULL until the worker has handled the row
);
-- Partial indexes: unprocessed rows for the worker scan, keyed rows for grouping.
CREATE INDEX ON iu_notification_pending (processed_at) WHERE processed_at IS NULL;
CREATE INDEX ON iu_notification_pending (source_document_ref) WHERE source_document_ref IS NOT NULL;

C. Debounce + Grouping Mechanism

  • Window: notification.debounce_seconds (default 90).
  • Cadence: pg_cron */2 * * * * (every 2 min). The 120 s interval is slightly longer than the default 90 s window, which is intended to prevent missed batches.
  • Stable group key: COALESCE(source_document_ref, import_batch_ref) (text). Rows with NULL key never group.
  • Threshold: notification.batch_piece_threshold (default 2) per group within window.
  • Worker idempotence: relies on uq_notif_event_type_ref partial unique index + ON CONFLICT DO NOTHING; for rollup events, ref_id = first piece UV id of the group (or hash of sorted UV ids).
  • Observability counters (Layer 2 worker logs): pending_count_pre, pending_count_post, groups_emitted, pieces_emitted, duration_ms, error_count. Stored in iu_notification_worker_log (design only).
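The two candidate ref_id derivations for a rollup event (first piece UV id, or a hash of the sorted UV ids) can be compared on inline sample rows. This is a sketch: the UUIDs, timestamps, and group key are made up, and casting an md5 digest to uuid is one possible way to get a deterministic 128-bit id, not a choice the design has made.

```sql
WITH pending(ref_id, group_key, created_at) AS (
  VALUES ('00000000-0000-0000-0000-000000000002'::uuid, 'doc-A', '2026-05-08 10:00:05+00'::timestamptz),
         ('00000000-0000-0000-0000-000000000001'::uuid, 'doc-A', '2026-05-08 10:00:09+00'::timestamptz),
         ('00000000-0000-0000-0000-000000000003'::uuid, 'doc-A', '2026-05-08 10:00:07+00'::timestamptz)
)
SELECT group_key,
       -- Option 1: UV id of the earliest piece in the group (ties broken by id).
       (array_agg(ref_id ORDER BY created_at, ref_id))[1]       AS ref_id_first_piece,
       -- Option 2: deterministic id derived from all UV ids, sorted.
       md5(string_agg(ref_id::text, ',' ORDER BY ref_id))::uuid AS ref_id_sorted_hash
FROM pending
GROUP BY group_key;
```

Here ref_id_first_piece is the id ending in ...0002, the earliest row. The trade-off: the first-piece id stays the same when late pieces join the group, so ON CONFLICT DO NOTHING suppresses a second rollup, whereas the sorted-hash id changes whenever group membership changes.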

D. Event Taxonomy Proposal

Add to chk_notif_event_type and chk_notif_event_type_stream:

| event_type | event_stream | Source | Notes |
|---|---|---|---|
| new_piece_created | update | worker | NEW |
| document_imported | update | worker | NEW (rollup) |
| comment_added | comment | trigger | existing |
| draft_created | review | trigger | existing |
| version_applied | update | trigger | existing |

Stream mapping uses the existing update stream — no new stream is needed (keeps board filter logic stable).
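Extending the two constraints might look like the following. This is a design-only sketch resting on assumptions that must be checked against the live definitions first: that both constraints live on iu_notification_event, that the constrained columns are named event_type and event_stream, and that the existing bodies are plain value lists.

```sql
-- DESIGN ONLY, do not apply in P3D4C0.
BEGIN;

ALTER TABLE iu_notification_event
  DROP CONSTRAINT chk_notif_event_type,
  ADD  CONSTRAINT chk_notif_event_type CHECK (
    event_type IN ('comment_added', 'draft_created', 'version_applied',
                   'new_piece_created', 'document_imported')
  );

ALTER TABLE iu_notification_event
  DROP CONSTRAINT chk_notif_event_type_stream,
  ADD  CONSTRAINT chk_notif_event_type_stream CHECK (
    (event_type, event_stream) IN (
      ('comment_added',     'comment'),
      ('draft_created',     'review'),
      ('version_applied',   'update'),
      ('new_piece_created', 'update'),
      ('document_imported', 'update')
    )
  );

COMMIT;
```

Doing the drop and the add in one ALTER TABLE per constraint, inside one transaction, avoids a window in which the table is unconstrained.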


E. Hook Recommendation

recommended_hook=UV_SEQ1_TRIGGER

Concrete shape (DESIGN ONLY, do not apply):

CREATE FUNCTION fn_iu_notif_birth() RETURNS trigger ... AS $$
BEGIN
  IF NEW.version_seq <> 1 THEN RETURN NEW; END IF;
  INSERT INTO iu_notification_pending
    (event_kind, unit_id, canonical_address,
     source_document_ref, import_batch_ref,
     actor_ref, ref_id)
  SELECT 'piece_birth', NEW.unit_id, iu.canonical_address,
         iu.source_document_ref,    -- pending column add
         iu.import_batch_ref,        -- pending column add
         NEW.created_by, NEW.id
  FROM information_unit iu WHERE iu.id = NEW.unit_id;
  RETURN NEW;
END $$;

CREATE TRIGGER trg_aa_iu_notif_birth AFTER INSERT ON unit_version
FOR EACH ROW WHEN (NEW.version_seq = 1)
EXECUTE FUNCTION fn_iu_notif_birth();

This is O(1) (single-row PK lookup + single INSERT), matches existing pattern, and respects creator implicit self-read at the board layer (no change to read policy).

creator_implicit_self_read_applies=true — already implemented by fn_iu_notification_board; piece/rollup events inherit it because creator = actor_ref.


F. P3D2 Trigger Review

Detailed analysis above (Q8). Summary:

  • All 3 functions are SECURITY DEFINER, search_path pinned, owner=directus, ACL revoked from PUBLIC.
  • Hot path is single INSERT + (optional) single PK sub-SELECT.
  • No mutations recommended in P3D4C0.
  • Future opt: pass canonical_address from caller to drop sub-SELECT (deferred).

G. Future Observability Design

To be implemented by P3D4C1+ (NOT in this pack):

  • iu_notification_worker_log(run_at, pending_pre, pending_post, groups_emitted, pieces_emitted, duration_ms, error_count, error_text).
  • Quiet period: cron tick exits early when pending count = 0 AND time since last emission > notification.quiet_period_seconds (default 300).
  • Rollback: worker is fully recoverable — pending rows untouched on error; ON CONFLICT DO NOTHING idempotent on event side.
  • Health check: a Directus admin board view summarising last 24h of worker logs.
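The worker log and the 24-hour health summary could be sketched like this. The column names come from the list above; the column types, NOT NULL choices, and defaults are assumptions.

```sql
-- DESIGN ONLY. Types and defaults are assumptions.
CREATE TABLE iu_notification_worker_log (
  run_at         timestamptz NOT NULL DEFAULT now(),
  pending_pre    integer     NOT NULL,            -- pending rows before the tick
  pending_post   integer     NOT NULL,            -- pending rows after the tick
  groups_emitted integer     NOT NULL DEFAULT 0,  -- document_imported events
  pieces_emitted integer     NOT NULL DEFAULT 0,  -- new_piece_created events
  duration_ms    integer     NOT NULL,
  error_count    integer     NOT NULL DEFAULT 0,
  error_text     text        NULL
);

-- Health summary over the last 24 hours, suitable as the body of a board view.
SELECT count(*)            AS runs,
       sum(groups_emitted) AS groups_emitted,
       sum(pieces_emitted) AS pieces_emitted,
       sum(error_count)    AS errors,
       max(duration_ms)    AS max_duration_ms,
       max(run_at)         AS last_run_at
FROM iu_notification_worker_log
WHERE run_at >= now() - interval '24 hours';
```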

H. Recommendation

DESIGN_STAGING_OUTBOX_FIRST

Justification:

  • Stable key is missing; piece-only or timing-only grouping would be wrong.
  • Birth gap is real and confirmed (P3D2 explicitly skips seq≤1).
  • Schema addition is minimal (2 nullable text columns + 1 staging table + 1 trigger + 1 cron job + 2 config keys).
  • pg_cron install is a one-time prerequisite — must be sequenced as Step 0 of the next pack.
  • All work fits within hot-path O(1) discipline.

Rejected alternatives:

  • EXTEND_RUNTIME_BEFORE_DIRECTUS_EXPOSURE: would require stable key already to exist — it doesn't.
  • EXPOSE_CURRENT_RUNTIME_WITH_GAP_DISCLOSED: acceptable as a temporary fallback but loses the "user-facing board complete" claim and creates UX expectation mismatch (newly imported documents would silently not notify).
  • DEFER_BIRTH_BATCH_UNTIL_SOURCE_SCHEMA_READY: the source schema (the Article 38 / Điều 38 publication tables) is outside the runtime concern of Article 44 (Điều 44); we should not block on it. Add minimal source/batch ref columns directly on information_unit instead.

I. Directus Exposure Status

p3d4c_directus_exposure_status=PAUSE_UNTIL_BIRTH_BATCH_DESIGN

The user-facing notification board cannot be claimed complete while birth + batch grouping are missing. Directus collection exposure (P3D4C original direction) is paused until P3D4C1 implements staging+worker, OR until the user explicitly accepts gap-disclosed exposure (would then change status to ALLOW_GAP_AWARE_CURRENT_RUNTIME).


Next Required Pack

P3D4C1_STAGING_OUTBOX_AND_WORKER_IMPLEMENTATION

Scope:

  1. Step 0 — install pg_cron extension (admin task).
  2. Add nullable source_document_ref, import_batch_ref columns on information_unit.
  3. Create iu_notification_pending staging table + indexes.
  4. Create fn_iu_notif_birth + trg_aa_iu_notif_birth (UV seq=1).
  5. Extend chk_notif_event_type* to include new_piece_created, document_imported.
  6. Worker function fn_iu_notification_worker_tick() + pg_cron schedule.
  7. Two config keys in dot_config: notification.debounce_seconds, notification.batch_piece_threshold.
  8. Pilot run + cleanup; P3D2 and P3D1 hashes must be preserved.
  9. Followed by P3D4C2 — Directus board exposure (resumes the paused work).

P3D4C0 design only | Hot path O(1) | Staging+pg_cron architecture | birth gap CONFIRMED | source/batch NOT_FOUND → schema add proposed | pg_cron prerequisite flagged | NO mutation

knowledge/dev/laws/dieu44-trien-khai/design/23-p3d4c0-new-piece-document-batch-notification-design.md