Postgres 19 lands its async IO. The pgvector lead has thoughts.
Postgres 19 (released April 25) ships the long-awaited asynchronous IO API. The pgvector and pg_search teams are the first to land integrations. One is good. The other is what the next year is for.
Postgres 19 shipped on April 25 with three features that will be in the conversation for the next year: parallel B-tree builds at scale (the implementation does what its 2023 design proposal promised), the new EXPLAIN ANALYZE buffer-attribution view (the kind of feature that earns an ovation from people who have run a hot standby in anger), and the asynchronous IO API. Async IO is the one people have been waiting for since the 2018 Tomas Vondra patch.
The first integrations have landed in the extension layer. pgvector's 0.9.0 release uses the new IO API for HNSW prefetch on graph traversal. pg_search's 0.4.0 release uses it for parallel posting-list scans. The pgvector integration is good. The pg_search integration is what next year is for.
pgvector 0.9.0
The HNSW graph-traversal prefetch was always the part of vector search that wanted async IO. On a graph with millions of nodes, a single query touches a different page on every hop, and the latency of those page reads dominates. The 0.9.0 implementation queues a configurable number of prefetches ahead (default: 16), and in the benchmark cases the result is a roughly 3x improvement in cold-cache HNSW search latency. Two benchmark details matter. The improvement disappears when the vectors are already in cache, so it's a tail-latency story, not an average-latency one. And it requires the new effective_io_concurrency_async tunable to be set above the (still low) default.
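The prefetch-ahead idea can be sketched in a few lines. This is a minimal illustration, not pgvector's code: a thread pool stands in for the async IO API, `read_page` and `scan_with_prefetch` are hypothetical names, and the sketch assumes the upcoming page IDs are already known (for example, from a node's neighbor list), which a real graph traversal only learns hop by hop.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
import time


def read_page(page_id, latency=0.005):
    """Hypothetical stand-in for a cold page read: sleep, then return fake contents."""
    time.sleep(latency)
    return f"page-{page_id}"


def scan_with_prefetch(page_ids, depth=16, read=read_page):
    """Yield (page_id, page) in order, keeping up to `depth` reads in flight."""
    with ThreadPoolExecutor(max_workers=depth) as pool:
        pending = deque()
        ids = iter(page_ids)

        # Prime the window: issue the first `depth` reads before consuming any.
        for pid in ids:
            pending.append((pid, pool.submit(read, pid)))
            if len(pending) >= depth:
                break

        while pending:
            pid, future = pending.popleft()
            page = future.result()  # blocks only if this read has not finished yet

            # Refill the slot that just freed up before handing the page back.
            nxt = next(ids, None)
            if nxt is not None:
                pending.append((nxt, pool.submit(read, nxt)))

            yield pid, page


if __name__ == "__main__":
    for depth in (1, 16):
        start = time.perf_counter()
        pages = list(scan_with_prefetch(range(64), depth=depth))
        elapsed = time.perf_counter() - start
        print(f"depth={depth}: {len(pages)} pages in {elapsed:.3f}s")
```

With every read simulated as cold, a deeper window overlaps the waits. If `read` returned immediately, as it would for cached pages, the window would buy nothing, which matches the cache caveat above.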
pg_search 0.4.0
pg_search's implementation is what we would call, charitably, a v0. The parallel posting-list scan has the right shape, but the threading model fights with Postgres's background-worker architecture in a way that surfaces under heavy concurrent search load. The benchmark numbers are good (a 1.8x improvement on a single query), but they have not yet been reproduced in production. The maintainer acknowledged the issue on the GitHub discussion thread the day the 0.4.0 release dropped, with the kind of measured response that makes you trust the team to land the fix.
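For readers who have not met the term, the general shape of a parallel posting-list scan looks roughly like this. It is a generic sketch, not pg_search's implementation: `scan_segment` and `parallel_scan` are hypothetical names, a plain thread pool stands in for whatever worker model the extension uses, and the filter (membership in a `wanted` set) is an assumption chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(postings, lo, hi, wanted):
    """Return the doc IDs in postings[lo:hi] that also appear in `wanted`."""
    return [doc for doc in postings[lo:hi] if doc in wanted]


def parallel_scan(postings, wanted, workers=4):
    """Split a sorted posting list into contiguous segments and scan them concurrently.

    Results are concatenated in segment order, so the output stays sorted.
    """
    if not postings:
        return []

    step = -(-len(postings) // workers)  # ceiling division: segment length
    bounds = [
        (lo, min(lo + step, len(postings)))
        for lo in range(0, len(postings), step)
    ]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `bounds`, whichever segment finishes first.
        parts = pool.map(
            lambda b: scan_segment(postings, b[0], b[1], wanted), bounds
        )
        return [doc for part in parts for doc in part]


if __name__ == "__main__":
    postings = list(range(0, 100, 3))  # doc IDs containing some term
    wanted = set(range(0, 100, 2))     # doc IDs that pass another filter
    print(parallel_scan(postings, wanted))
```

Each query here creates its own pool of workers. Under many concurrent queries that multiplies quickly, which is one plausible way a design like this could run into trouble with a process-based server, though the article does not say that is the specific failure in pg_search.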
Both extensions are doing the work the database itself cannot do. That is the right division of labor, and it is the same one Postgres has been working out, in its slow and deliberate way, since the late 1980s. The async IO is here. The integrations are starting. The next year is when the wider extension ecosystem catches up.