05 — PostgreSQL 16.13 vs 18 Assessment for Queue Work
05 — PostgreSQL 16.13 vs 18 Assessment
Date: 2026-05-26 | Scope: Should we upgrade from PG 16.13 to PG 18 now, later, or not for the queue work? Mode: Research only. No upgrade executed. No upgrade plan written.
§1. Live state
version: PostgreSQL 16.13 (Debian 16.13-1.pgdg13+1)
extensions: btree_gist, pgcrypto, plpgsql, postgres_fdw
pg_cron: NOT installed
LISTEN/NOTIFY: in use (kb_vector_sync channel, one daemon)
PG 16.13 was the latest 16.x at the time of last upgrade (16.x is in support until November 2028). PG 18 was released September 2025 per official PostgreSQL release notes.
Sources:
- https://www.postgresql.org/about/news/postgresql-18-released-3142/
- https://www.postgresql.org/docs/release/18.0/
- https://www.postgresql.org/docs/18/release-18.html
§2. PG 18 features relevant to a queue workload
| Feature | Relevance to queue | Net value |
|---|---|---|
Asynchronous I/O (AIO) — concurrent I/O via io_method (worker / io_uring / sync); up to 3× on seq scans, bitmap heap scans, vacuum |
Worker draining event_pending does small targeted reads; AIO biggest win is bulk scan. Queue workload doesn't bulk-scan. |
Low. Nice to have for VACUUM on a high-churn event_outbox. |
uuidv7() — timestamp-ordered UUIDs |
event_outbox.id, event_pending.id are currently gen_random_uuid() (v4 random). Timestamp-ordered IDs improve B-tree locality on (created_at, id) style indexes. |
Medium. Real benefit at >10M-row scale. Not blocking. |
| OLD/NEW in RETURNING for INSERT/UPDATE/DELETE/MERGE | A worker could UPDATE event_pending SET processed_at=now() WHERE id=... RETURNING OLD.error_count, NEW.error_count — useful for atomic retry accounting. |
Medium. Removes the need for SELECT-then-UPDATE-then-SELECT in some worker patterns. |
| Virtual generated columns (default) | Could express severity_color, lane_label on event views without storage. |
Low. Cosmetic. |
Temporal constraints (PRIMARY KEY ... WITHOUT OVERLAPS, FOREIGN KEY ... PERIOD) |
Useful for lease tables (dot_iu_runtime_lease is essentially time-bounded); could let the DB enforce "only one active lease per name". |
Medium. But current CHECK + UNIQUE patterns already cover it. |
| Skip scan on multicol B-tree | Frees indexes from forcing the leading prefix. Could simplify (event_domain, event_stream, created_at DESC) index usage. |
Low. Plan-shape detail. |
Faster pg_upgrade (--jobs, --swap, planner-stats carry-over) |
Reduces downtime if/when we eventually upgrade. | Operational — applies to the upgrade itself, not the workload. |
Missing from official release notes that we care about:
- No new pg_cron-equivalent built-in. Scheduler is still extension-only.
- No LISTEN/NOTIFY changes that affect us.
- No new advisory-lock mechanism.
§3. PG 18 breaking changes / upgrade risks
From the official release notes:
| Change | Impact | Mitigation |
|---|---|---|
| Page checksums enabled by default for new initdb | New clusters get checksums; upgrades preserve current setting unless explicitly flipped. | Use pg_upgrade --no-data-checksums for like-for-like, OR enable checksums (costs CPU + I/O). Decide before upgrading. |
| MD5 password auth deprecated | Future removal. Authentication scheme migration to SCRAM required. | Inventory pg_hba.conf + reset/rehash any md5 roles. Low effort but mandatory. |
| Full-text search collation provider change | Was always libc; now uses cluster default. | Must REINDEX all FTS and pg_trgm indexes after upgrade. We have neither in active use, but verify. |
| Wire protocol 3.2 | First new protocol since 7.4. libpq still defaults to 3.0 — clients must opt-in. | No action unless we choose to opt-in. |
| Extension compatibility | All extensions must be PG18-built. Our extensions: btree_gist, pgcrypto, plpgsql, postgres_fdw. All ship in core / are well-maintained. | Confirm Debian package availability for 18. |
| pg_cron PG 18 compatibility | pg_cron is third-party. Per latest pg_cron releases, PG 18 support exists in pg_cron 1.6+ but not verified on our package source in this survey. | If pg_cron is going to be part of the queue design, confirm pgdg-18 pg_cron package availability before committing. |
§4. Cost / benefit for queue work specifically
| If we upgrade NOW (before queue design) | If we STAY on 16.13 |
|---|---|
| Get uuidv7 for new event_outbox PKs (locality win at scale) | Use gen_random_uuid (v4); no real loss below 10M rows |
| Get OLD/NEW RETURNING for cleaner worker code | Use SELECT-FOR-UPDATE + UPDATE; works fine |
| Get AIO performance ceiling lift | Workload isn't I/O-bound |
| Take on orthogonal upgrade risk (checksums decision, MD5→SCRAM, FTS reindex, protocol 3.2 client matrix) bundled with queue design | None of these touch the design |
| pg_cron PG18 packaging risk if pg_cron is in queue scope | pg_cron PG16 packaging well-known; the install itself is the risk, not the version |
| Major-version upgrade test cycle (full schema dump + restore on staging) | Zero upgrade testing required |
The queue design does not depend on any PG 18-only feature. Every queue invariant can be expressed in PG 16.13 with current substrate. The upgrade is a separate engineering project that should be sequenced after the queue is stable, not bundled with its design.
§5. Answers to the mission's pg_upgrade_questions
current: PostgreSQL_16_13
target_candidate: PostgreSQL_18
should_upgrade_now_for_queue: needs_separate_macro
# No PG 18 feature is required for the queue design.
# Bundling the upgrade adds orthogonal risk (checksums, MD5, FTS, protocol) into a design phase.
# PG 18 readiness deserves its own macro after queue design stabilizes.
reason:
benefits:
- uuidv7 for better B-tree locality on time-ordered IDs (at >10M rows)
- OLD/NEW in RETURNING simplifies worker retry accounting
- AIO improves vacuum/scan throughput on event_outbox over time
- faster pg_upgrade jobs+swap for next major upgrade
- temporal constraints could simplify lease modeling
risks:
- page-checksum default flip — must decide intentionally
- MD5 password deprecation — auth migration required
- FTS / pg_trgm reindex post-upgrade (we have neither active; confirm)
- wire protocol 3.2 client matrix (libpq still defaults to 3.0; opt-in)
- pg_cron PG 18 packaging availability not yet confirmed on our pgdg source
- any extension must be PG18-built (core extensions are; verify postgres_fdw on Debian-18)
blocking_compatibility_questions:
- Is pg_cron 1.6+ available in our Debian pgdg-18 channel?
- Are all production-bound clients (Directus, Hermes, Agent Data daemon) wire-protocol 3.0-only?
- Does Directus 11.x support PG 18? (verify)
- Do we have a fully reproducible staging environment for pg_upgrade dry-run?
safe_next_step: stay_on_16_13_for_design
# Plan a PG 18 upgrade readiness macro as a separate work item,
# to fire AFTER queue design stabilizes and BEFORE any production volume spike.
§6. Recommendation
Stay on PostgreSQL 16.13 for the queue design. Queue invariants are PG-version-agnostic.
When queue design is stable + ratified, run a separate "PG 18 upgrade readiness" macro that:
- Validates pg_cron PG 18 packaging on our Debian source.
- Validates Directus + Hermes + Agent Data daemon compatibility (wire protocol).
- Validates extension compatibility (postgres_fdw, etc.).
- Inventories MD5 → SCRAM conversion candidates.
- Decides page-checksum policy.
- Builds a pg_upgrade dry-run plan with timing + rollback.
- Runs the upgrade in a staging clone first.
That macro is separately governed; it is NOT a prerequisite for the queue design.
Sources: