Bassam Ismail
Engineering

Postgres Is the Only Backend You Need

9 min read

The bug I did not want was a task row committed in Postgres and a worker job that never made it to Redis. So when our agent first needed background work, I made the enqueue happen inside the same Postgres transaction that created the task. That instinct held up. We used the Postgres we already had, turning it into a postgres job queue, and after running it in anger I would make the same call again. This is why, and where the edges are.

The decision rule I now use is explicit: choose Postgres when the job is a durable side effect of existing relational state. Choose a broker when throughput, fanout, delayed delivery semantics, or independent scaling dominate. Everything below follows from that.

TL;DR

A postgres job queue works well when the workload is modest, transactional, and already lives next to Postgres data. Workers claim tasks with SELECT ... FOR UPDATE SKIP LOCKED, so concurrent workers avoid the same row without a separate broker. Enqueue, retries, crash recovery, scheduling, and observability all stay inside the database the app already runs.

The reflex, and why I skipped it

A broker is the default because a queue feels like a different kind of problem than a database. But adding one means a second system to deploy, monitor, secure, and keep consistent with your database. The classic failure is the dual-write: you commit a row in Postgres and then enqueue a job in Redis, and if the process dies between the two, you have a committed change with no job, or a job for a change that rolled back.

This runtime needs Postgres for projects, repos, context, and audit. So the question flips from "which broker" to "is Postgres actually good enough as the queue?" For this workload (agent tasks measured in seconds-to-minutes, a handful of concurrent workers, not millions of messages a second) the answer is yes, and you get properties a separate broker makes you work for.

Claiming a task in a postgres job queue

The claim path is one query. A worker claims the next eligible task inside a transaction:

const picked = await db.transaction(async (tx) => {
  await tx.execute(sql`SET LOCAL statement_timeout = '30s'`);
  const candidate = await tx.execute<{ id: string }>(sql`
    SELECT id FROM tasks
     WHERE status = 'pending'
       AND attempts < max_attempts
     ORDER BY priority DESC, created_at ASC
     LIMIT 1
     FOR UPDATE SKIP LOCKED
  `);
  const row = candidate.rows?.[0];
  if (!row) return null;
  // UPDATE the row to 'running', stamp worker_id + started_at, return it
});

The load-bearing words are FOR UPDATE SKIP LOCKED. FOR UPDATE locks the row this worker is about to claim. SKIP LOCKED tells every other worker's query to ignore rows that are already locked rather than blocking on them. Ten workers can run this exact query simultaneously and each gets a different row, with no coordination and no central dispatcher. PostgreSQL documents both row-locking behavior for SELECT ... FOR UPDATE and the SKIP LOCKED option in its SELECT reference.

One important caveat on the concurrency guarantee: SKIP LOCKED prevents two workers from claiming the same row at the same moment, but it does not make execution exactly-once. A worker that crashes after claiming but before completing leaves the row in running. Recovery re-queues it and another worker will execute it again. The real model is at-least-once execution, which means task handlers need to be idempotent.

The rest earns its place too. ORDER BY priority DESC, created_at ASC makes it a priority queue with FIFO within a priority. attempts < max_attempts bakes retry-budgeting into the claim itself, so an exhausted task is not eligible for re-claim. SET LOCAL statement_timeout = '30s' means a pathological claim cannot wedge a worker forever; it dies and the loop tries again.

A TASK'S LIFEenqueuetransactionalpendingclaimFOR UPDATE SKIP LOCKEDrunningworker_id + started_atdone / failedor attempts++ → pending[ one query does the claim; no broker ]

The database mechanics that make this work

The claim query above is only fast and safe when the table and indexes are shaped correctly. The details that actually bit me in practice:

Table shape. The tasks table needs at minimum: id, status (pending, running, done, failed), priority, created_at, attempts, max_attempts, worker_id, and started_at. The worker_id and started_at columns are how recovery identifies orphaned rows.

Partial index for the claim query. Without an index, every claim is a sequential scan over all tasks. A partial index covering only claimable rows keeps the claim path fast as the table grows:

CREATE INDEX tasks_claim_idx
    ON tasks (priority DESC, created_at ASC)
 WHERE status = 'pending'
   AND attempts < max_attempts;

Postgres can use this index to satisfy the WHERE clause and the ORDER BY in a single pass, turning the claim from a table scan into an index scan over a small working set.

Detecting orphaned rows. A running row is orphaned when its worker is gone. The recovery query uses started_at and worker_id: any row in running whose started_at is older than a reasonable timeout and whose worker_id is not in the set of currently active workers gets reset to pending with attempts incremented. The timeout threshold needs to be longer than your longest legitimate task, or you will re-queue work that is still in flight.

Advisory lock: transaction vs. session scope. For the scheduler gate I use pg_try_advisory_lock (session-scoped) rather than pg_try_advisory_xact_lock (transaction-scoped). Session-scoped locks persist for the life of the connection, which is what you want when the scheduler holds the lock across multiple enqueue transactions. If you use the transaction-scoped variant, the lock releases at the end of the first transaction and a second scheduler instance can slip in immediately after.

Autovacuum churn. Every claim cycle touches rows: pending to running to done or failed. Each of those updates leaves dead tuples. On a busy queue table, autovacuum runs frequently and the table bloats if it cannot keep up. In practice I saw this first with roughly a thousand tasks per hour across four workers: the default autovacuum_vacuum_scale_factor of 0.2 was too conservative and the tasks table crept up steadily until I dropped it to 0.01 on that table specifically. Two mitigations work together: partition the table by status so the done partition can be truncated or dropped rather than vacuumed, and tune autovacuum_vacuum_scale_factor downward on the tasks table to trigger vacuums earlier. This is the most common operational surprise with Postgres queues at moderate volume.

Starvation and fairness. ORDER BY priority DESC, created_at ASC means high-priority tasks are claimed before lower-priority ones. If high-priority tasks arrive faster than workers can drain them, low-priority tasks starve. For this workload that is acceptable, but if you need fairness across priority bands, you need a weighted-random claim or separate queues per priority level.

The properties you get

Because the queue is rows in the same database as the rest of the write, several things follow directly:

  • Transactional enqueue. Creating a task and the data it refers to commit together. There is no window where one exists without the other, so a task cannot be lost or orphaned at birth by a process crash between two separate writes.
  • Safe concurrency. SKIP LOCKED handles concurrent claim queries at the database level. Add workers and they self-distribute, subject to the at-least-once caveat above.
  • Crash recovery. If a worker dies mid-task, its row is stuck in running. On startup the process resets orphaned running rows back to pending and they get re-claimed. A crash costs a retry, not a lost task, provided handlers are idempotent.
  • Observability. "What is the queue doing?" is a SELECT. Depth, oldest pending, failure counts, what each worker is on right now: all ordinary SQL against a table you can already see, with no separate dashboard required.

That same bias toward keeping the moving parts visible shows up elsewhere in the stack, like Running background jobs on macOS with launchd and Building Press, Part 5: A static export that reads a database.

The scheduler, and not enqueuing twice

Recurring work (refreshing each project's context on a cadence) is a scheduler that wakes up and enqueues jobs. The hazard with more than one instance running is double-enqueue: two schedulers both decide it is time to refresh project X and both write a task. The fix is a Postgres advisory lock: before enqueuing a cycle, a scheduler tries to take a named advisory lock, and only the holder enqueues. The other instance's attempt fails fast and does nothing. No external coordinator, no distributed lock. PostgreSQL's pg_advisory_lock functions are the primitive behind that scheduler gate.

Where this stops being a good idea

This is not a general endorsement of Postgres-as-everything. SKIP LOCKED queues work well up to a point. At very high throughput the single table becomes a contention point, the constant claim-churn pressures autovacuum faster than the mitigations above can cover, and you will eventually want partitioning or a purpose-built broker. The honest boundary: this works because the workload is low-volume, latency-tolerant agent tasks sharing a database that has to exist anyway. Pushing tens of thousands of messages a second means you should buy the broker.

For a Slack agent running a few tasks a minute, the postgres job queue is one fewer moving part to operate, and the database you already had turns out to be sufficient.

FAQ

What does FOR UPDATE SKIP LOCKED actually do?

FOR UPDATE row-locks the task a worker is claiming; SKIP LOCKED makes every other worker's claim query ignore already-locked rows instead of waiting. Together they let many workers run the identical claim query at once and each take a different row, with no shared dispatcher.

How are tasks recovered if a worker crashes mid-run?

A crashed worker leaves its row stuck in running. On startup the process resets orphaned running rows (no live worker) back to pending so they get re-claimed. The cost of a crash is one retry, not a lost task.

Why advisory locks for the scheduler?

To stop two scheduler instances from both enqueuing the same recurring job. A scheduler takes a named Postgres advisory lock before a cycle; only the holder enqueues, the other no-ops. It's leader election without a separate coordination service.

When would Postgres stop being enough as a queue?

At high throughput. A single SKIP LOCKED table contends and stresses autovacuum under heavy churn, so tens-of-thousands-per-second workloads want partitioning or a real broker. For seconds-to-minutes agent tasks at low volume, it's comfortably sufficient.