KB-7ADC

09 — DP7 Partitioning / Retention / Archive

10 min read Revision 1
design-packdieu-45dp7partitioningretentionarchiveanti-bloatdesign-only

09 — DP7 — Partitioning / Retention / Archive

DESIGN-ONLY. Cites Điều 45 §15 (risk), §20.1 transitional clause, Điều 36 NVSZ, Điều 35 audit. No DDL, no DML.


§1. Goal

Plan the lifetime curve for every queue substrate table so that:

  1. The hot tier stays small enough for partial-index reads to be sub-ms.
  2. History is preserved long enough for D9 audit / D30 regression / D31 watchdog (90d minimum on runs, 365d on DLQ).
  3. Staging cleanup at 15d (Điều 36) is preserved.
  4. No table grows unbounded.
  5. Partitioning is introduced only when justified by volume — not pre-optimised.

§2. Current state — volume snapshot

Table Live rows (2026-05-26 re-survey) Daily rate observed Days to 1M Days to 5M
event_outbox 133,778 ~16k/day (system/issue_opened dominated) ~54 ~324
event_read 133,439 ~16k/day ~54 ~324
event_pending 0 spiky n/a n/a
iu_route_dead_letter 0 rare very long very long
iu_route_worker_cursor 1 const never never
iu_sql_event_route 1 rare never never
dot_iu_runtime_lease 0 bursty small steady-state n/a
iu_auto_instantiate_event_log 33 rare very long very long
iu_core.iu_staging_record (per cutting cadence) small n/a n/a
iu_core.iu_staging_payload (3× staging_record) small n/a n/a
dot_iu_command_run (per DOT call) tens/day very long very long
job_queue (proposed) 0 initially small depends on Phase 5+ adoption depends
job_dead_letter (proposed) 0 rare very long very long
queue_heartbeat (proposed) 0 const ~10 rows never never

Reading: event_outbox is the only table on a clear trajectory toward partitioning. Everything else is small-cardinality at the rates observed.


§3. Proposed retention contract

Table Hot retention Cold archive Hard delete Compatibility note
event_outbox 90 days hot (partitioned monthly) 365 days archive none (audit) partition only after ≥ 5M rows
event_read Follows event_outbox partitions follows follows FK preserved per partition
event_pending 7 days hot none (transient by design) yes after worker drains + 7d
iu_route_dead_letter 365 days hot none yes after 365d (resolved only) resolution must be set before delete
job_queue (proposed) terminal states succeeded/cancelled cleaned after 30 days → status cleaned none (FK preserved 90d) yes after 90d preserves audit window
job_dead_letter (proposed) 365 days hot none yes after 365d (resolved only) mirrors iu_route_dead_letter
dot_iu_command_run 90 days hot none yes after 90d (Điều 35 audit baseline) per Điều 35 to confirm
iu_core.iu_staging_record per Điều 36 (15d expire → 30d archive → clean) per Điều 36 per Điều 36 UNTOUCHED — preserves existing law
iu_core.iu_staging_payload same same same
iu_auto_instantiate_event_log 365 days none yes after 365d (idempotency preserved by event re-ratification)
iu_lifecycle_log append-only, never deleted (system law) n/a never per existing law
iu_vector_sync_point reflects qdrant; not transactional n/a never unchanged
queue_heartbeat continuous-mirror; no history n/a rows mutate in place trail goes via heartbeat events

All cleanup operations land as job_kind='retention_sweep_<table>' jobs in job_queue, gated by iu_core.retention_enabled (already live, currently false) plus a new per-table queue.retention.<table>.enabled flag.


§4. Proposed partitioning model (event_outbox)

NON-EXECUTABLE DESIGN SKETCH — DO NOT APPLY

-- Trigger: when row count crosses 5_000_000, the table is converted to partitioned.
-- Migration sketch (separate pack, separate Council ratification):

CREATE TABLE public.event_outbox_partitioned (LIKE event_outbox INCLUDING ALL)
  PARTITION BY RANGE (created_at);

CREATE TABLE public.event_outbox_p_2026_05 PARTITION OF event_outbox_partitioned
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');
-- ... one partition per month going forward; backfill partitions for historical data
-- Swap-in via attach-then-rename pattern.

-- event_read FK preservation: same partitioning by (event_id) → (created_at) routing.

Important: this is one of the most invasive changes in the roadmap and lives in a dedicated pack with explicit downtime/online plan — not in Phase 1 of doc 14.

Archive scheme:

NON-EXECUTABLE DESIGN SKETCH — DO NOT APPLY

CREATE SCHEMA archive;
CREATE TABLE archive.event_outbox_yyyymm (LIKE public.event_outbox INCLUDING ALL);

-- Cold-storage job:
function fn_event_outbox_archive_sweep(p_cutoff_month text)
  -- INSERT INTO archive.event_outbox_yyyymm SELECT ... FROM event_outbox WHERE ...;
  -- DELETE FROM event_outbox WHERE created_at < cutoff_month::date;
  -- Emits system/event_outbox_archived event.

Archive schema is for forensics; not normally queried.


§5. Lifecycle / status

Per-row retention status added by sweepers:

  • job_queue.status='cleaned' (terminal; row no longer claimed)
  • After hard-delete window: row removed.
  • iu_route_dead_letter.resolved_at set; after 365d row removed.

§6. Indexes / performance

Pre-partitioning (current state) — no change needed.

Post-partitioning — partition-pruning queries on created_at are O(partition_size) not O(table_size). Indexes per partition:

NON-EXECUTABLE DESIGN SKETCH — DO NOT APPLY (per-partition)

CREATE INDEX ON event_outbox_p_yyyy_mm (event_domain, event_type, event_stream);
CREATE INDEX ON event_outbox_p_yyyy_mm (correlation_id) WHERE correlation_id IS NOT NULL;
CREATE INDEX ON event_outbox_p_yyyy_mm (event_subject_table, event_subject_ref) WHERE event_subject_ref IS NOT NULL;

For job_queue — partial indexes from DP2 are sufficient until succeeded row count > 1M; consider monthly partitioning only then.


§7. Security / governance

  • Retention sweep jobs require workflow_admin.
  • Archive schema GRANT pattern: read-only analytics_reader role; no write.
  • Hard-delete operations are audit-emitted (system/<table>_hard_deleted event).

§8. Rollback / disable

  • Per-table flag: queue.retention.<table>.enabled = false → sweep returns 0 rows.
  • Master flag: iu_core.retention_enabled = false (already live, currently false).
  • Partition rollback: detach partition + rename = O(seconds); no data loss.

§9. Healthcheck / observability

NON-EXECUTABLE DESIGN SKETCH — DO NOT APPLY

view v_queue_retention_pressure AS
  SELECT 'event_outbox'::text AS tbl, count(*) AS rows,
         min(created_at) AS oldest, max(created_at) AS newest
    FROM event_outbox
  UNION ALL
  SELECT 'job_queue', count(*), min(enqueued_at), max(enqueued_at) FROM job_queue
  UNION ALL
  SELECT 'iu_route_dead_letter', count(*), min(first_failed_at), max(first_failed_at)
    FROM iu_route_dead_letter
  UNION ALL
  SELECT 'job_dead_letter', count(*), min(first_failed_at), max(first_failed_at)
    FROM job_dead_letter;

function fn_retention_health() RETURNS jsonb
  -- returns sizes + age + 'partition_recommended':bool for event_outbox

§10. Compatibility with Điều 45 v1.0

Clause Compliance
§20.1 transitional substrate preserved ✅ no schema change to live tables at Phase 1
§20.4 worker silent gap is pre-existing, post-enactment violated ✅ retention sweep itself becomes a heartbeat-emitting executor
§15 risk register ✅ partitioning timed by 5M row threshold (matches survey §6 risk)
§14 NVSZ ✅ archive does not introduce vector path
§10 batch principles ✅ retention sweep uses batch with rollback discipline

§11. Implementation prerequisites

  • DP2 (job_queue) for sweep-as-job pattern.
  • DP3 (lease) so sweeps don't double-fire.
  • DP4 (heartbeat) so sweep executor is observable.
  • A event_outbox row count alert (added to v_queue_retention_pressure) before partitioning is required.

§12. Open questions

# Question Routed to
DP7-Q1 Approve partition threshold at 5M rows (Council from survey §6)? Council
DP7-Q2 Approve job_queue cleaned after 30d, hard-delete after 90d? Council
DP7-Q3 Approve DLQ 365d retention (vs 180d / never)? Council
DP7-Q4 Approve archive schema name archive. (vs cold. / attic.)? Council
DP7-Q5 Should dot_iu_command_run 90d retention be re-confirmed in Điều 35 amendment? Council + D35
DP7-Q6 Should hard-delete emit a system/*_hard_deleted event? Council vocab gate. Council

§13. Self-test

self_test:
  cites_dieu45_section: §15,§20.1,§20.4,§14,§10
  defines_status_lifecycle_compatible_with_§6_7: yes (job_queue 'cleaned' state)
  defines_idempotency_key: n/a (sweep is idempotent by design)
  defines_retry_dlq: defers to DP3 substrate (sweeps are jobs)
  defines_lease: inherits DP3 (sweep jobs lease)
  defines_observability_view: v_queue_retention_pressure
  defines_dot_config_disable_flag: queue.retention.<table>.enabled, iu_core.retention_enabled
  defines_executor_set_compatible_with_§11_5: yes (sweep executor = PG_worker)
  no_vector_in_transient: yes
  signal_not_data: yes (archive holds same rows; no new payload class)
  pg_sot: yes
  rollback_concept: per-table flag + master flag + detach-partition
  no_pg_cron_dependency_phase_1: yes
  no_pg_18_dependency: yes
  no_mutation_authored: yes

DP7 design. No mutation. Authored 2026-05-26 by Claude Opus 4.7 (1M).

Back to Knowledge Hub knowledge/dev/laws/dieu44-trien-khai/v0.6-dieu45-full-queue-orchestration-design-pack/09-DP7-partition-retention-archive.md