I deleted 847 lines of Redis code last Tuesday. My tech lead deleted 1,200 lines the next day. Our junior developer deleted 2,100 lines by Friday.

None of us consulted each other.

We all just looked at the same benchmarks and had the same realization: we have been paying for infrastructure we no longer need.

This is not a theoretical blog post about database performance. This is the story of how PostgreSQL 18 destroyed every caching assumption I held sacred for the past six years of my career.

The Architecture I Defended For Three Years

Let me show you what our system looked like before everything changed:

┌─────────────────────────────────────────────────────────┐
│                      Application                         │
└─────────────────┬──────────────────────┬────────────────┘
                  │                      │
                  ▼                      ▼
         ┌───────────────┐      ┌───────────────┐
         │    Redis      │      │   Postgres    │
         │   (cache)     │      │   (storage)   │
         │   ~0.1ms      │      │   ~5-15ms     │
         └───────┬───────┘      └───────────────┘
                 │                      ▲
                 └──────────────────────┘
                   (invalidation hell)

I built this architecture. I defended it in design reviews. I wrote documentation explaining why it was the "right" approach. I mentored junior developers on proper cache invalidation strategies.

For three years, I maintained:

  • 847 lines of caching logic spread across 12 files
  • A separate managed Redis cluster costing $340 every month
  • Cache invalidation code that broke at least twice every quarter
  • Two completely different monitoring dashboards
  • Two completely different on-call runbooks
  • One massive headache that I convinced myself was "just part of the job"

I was proud of this complexity. It made me feel like a real engineer solving real problems.

Then September 2025 happened.

The Moment PostgreSQL 18 Broke My Brain

PostgreSQL 18 shipped with a feature called Asynchronous I/O. The release notes described it as "a new I/O subsystem that can improve performance of sequential scans, bitmap heap scans, vacuums, and other operations."

That description is technically accurate and emotionally dead.

Let me tell you what it actually means.

Old PostgreSQL worked like a polite librarian. You ask for a book. The librarian walks to the shelf, grabs it, walks back, hands it to you. Then you ask for the next book. The librarian walks to a different shelf, grabs it, walks back, hands it to you. One book at a time. Very orderly. Very slow.

New PostgreSQL works like a librarian who finally discovered they have two hands. You ask for ten books. The librarian grabs all ten at once and dumps them on your desk. Done.

The technical term is io_uring integration with the Linux kernel. The human term is: PostgreSQL finally learned to walk and chew gum at the same time.

The official benchmarks claimed 2–3x improvement on read-heavy workloads.

I did not believe them. Database vendors always exaggerate benchmarks. Every single one. It is practically a job requirement.

So I ran the numbers on our actual production queries.

The Numbers That Changed Everything

Our product dashboard has a query that loads user activity for the past 30 days. Nothing fancy. Just aggregating events by date for a single user. The kind of query that runs a million times a day in our system.

SELECT date, sum(events) 
FROM activity 
WHERE user_id = $1 
AND date > now() - interval '30 days'
GROUP BY date;

Here is what I measured:

PostgreSQL 17 Performance:

  • Cold cache: 12–18ms
  • Warm buffer: 4–6ms

PostgreSQL 18 Performance (with AIO enabled):

  • Cold cache: 6–8ms
  • Warm buffer: 1.8–2.5ms

I ran these tests four times because I did not trust my own results.
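
If you want to run the same comparison on your own queries, EXPLAIN with the ANALYZE and BUFFERS options is enough. The query below is the dashboard query from above with a literal user id substituted for the parameter; the id itself is a placeholder, use any real one from your data. Running it back to back gives you the warm-buffer number; cold numbers need a restart or an evicted cache, so treat those as harder to reproduce.

EXPLAIN (ANALYZE, BUFFERS)
SELECT date, sum(events)
FROM activity
WHERE user_id = 12345   -- placeholder id, use a real one
AND date > now() - interval '30 days'
GROUP BY date;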

The warm buffer number broke something in my understanding of database performance. Under 3 milliseconds for a query touching 30 days of aggregated data. No Redis involved. No caching layer. Just PostgreSQL doing what PostgreSQL was always supposed to do.

Now here is where the math gets uncomfortable.

Redis was giving us 0.1ms reads on cache hits. That sounds incredible until you factor in reality:

  • Our cache hit rate was 73 percent
  • Cache miss penalty: 15ms query plus write to Redis
  • Serialization overhead for complex objects: 0.3ms
  • Network round trip to managed Redis: 0.8ms
  • Cache invalidation bugs: priceless (and frequent)

The effective latency of a cache is the hit rate times the full hit path (network plus serialization plus the Redis read) plus the miss rate times the full miss path (the query plus the write back to Redis). When I ran that weighted average across our hits and misses, our "blazing fast" Redis cache was delivering approximately 4.2ms effective latency.

PostgreSQL 18 was delivering 2.1ms without any caching layer at all.

I sat with that number for a very long time.

The Configuration That Made It Real

Here is exactly what we changed in postgresql.conf:

# Enable async I/O with io_uring
io_method = 'io_uring'

# Let Postgres issue concurrent reads
effective_io_concurrency = 200
maintenance_io_concurrency = 50

# Allocate proper buffer space (we have 64GB RAM)
shared_buffers = 16GB

# Increase read-ahead combining
io_combine_limit = 512kB

That is it. Five configuration changes. No code modifications. No architectural rewrites. No six-month migration project.
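
If you try this, it is worth confirming the settings actually took effect before you trust any benchmark: shared_buffers and io_method only change after a full restart, not a reload. A minimal check from psql, using nothing but the standard settings named above:

SHOW io_method;
SHOW effective_io_concurrency;
SHOW io_combine_limit;

SELECT name, setting, pending_restart
FROM pg_settings
WHERE name IN ('io_method', 'shared_buffers', 'io_combine_limit');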

We deployed these settings on a Wednesday. By Thursday, our dashboard latency had dropped by 47 percent. By Friday, three of us were independently deleting Redis caching code.

Nobody coordinated this. We all just looked at the metrics and reached the same conclusion.

The Feature Nobody Talked About: Skip Scans

While everyone was focused on Asynchronous I/O, PostgreSQL 18 quietly shipped another feature that mattered even more for our specific workload: btree skip scans.

We have an orders table with a composite index on (user_id, status, created_at). Standard pattern. Nothing unusual.

We also have queries like this:

SELECT * FROM orders 
WHERE status = 'pending' 
AND created_at > '2025-01-01'
ORDER BY created_at DESC 
LIMIT 50;

Notice anything missing? There is no user_id filter.

In PostgreSQL 17, this query could not efficiently use our composite index. The database would scan the entire table or fall back to a less optimal access path. We solved this by caching the results in Redis with a 60-second TTL.

PostgreSQL 18 can now "skip" through the first column of the index. It performs a series of range scans for each distinct user_id value, effectively using the index even when the leading column is not specified in the WHERE clause.
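
For reference, here is the shape of the index and a way to see whether the planner actually uses it for this query. The index name is my own invention, the columns are the ones above, and whether you get a skip scan depends on how many distinct user_id values the planner expects to step through.

-- The existing composite index
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

-- Does the planner use it when user_id is missing from the WHERE clause?
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE status = 'pending'
AND created_at > '2025-01-01'
ORDER BY created_at DESC
LIMIT 50;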

Our order status queries dropped from 45ms to 8ms.

No cache required. No TTL management. No invalidation logic. Just better database engineering from the PostgreSQL team.

The Real Cost Of Unnecessary Caching

Last month I sat down and calculated what our Redis cache layer actually cost us over three years. Not just the infrastructure cost. The total cost.

Managed Redis Infrastructure: $340 per month multiplied by 36 months equals $12,240

Engineering Time On Cache-Related Bugs: I went through our issue tracker. We had 23 cache-related incidents over three years. Average resolution time was 7 hours. At a conservative $150 per hour billing rate, that comes to $24,150 in engineering time.

Feature Development Slowdown: Every new feature required cache invalidation strategy. Every schema change required cache key updates. Every deployment required cache warming consideration. I estimate this added 15 percent overhead to our development velocity. Over three years, with a team of four engineers, that represents thousands of hours.

Cognitive Load: This one is impossible to quantify. But every engineer on our team held cache invalidation logic in their head during every code review, every architectural decision, every debugging session. That mental overhead has a cost.

Conservative total: over $36,000 for infrastructure that PostgreSQL 18 made obsolete with a kernel feature and five configuration lines.

That is not just a technical lesson. That is a business lesson about the real cost of unnecessary complexity.

What Redis Still Does Better

I want to be very clear about something: I am not saying Redis is dead. I am not saying you should delete all your Redis code tomorrow. I am not writing an obituary for in-memory caching.

What I am saying is this: Redis as a read-through cache for PostgreSQL queries is increasingly optional for many workloads.

Redis still wins decisively in specific scenarios:

Session Storage: When you need guaranteed sub-millisecond response times with zero variance. User sessions need to be instantaneous. PostgreSQL, even with AIO, cannot match Redis on pure key-value lookup speed.

Rate Limiting: Atomic increment operations with automatic expiration. The INCR command with TTL is exactly what Redis was designed for. This is not a caching use case. This is a data structure use case.

Pub/Sub Messaging: Real-time event distribution across multiple consumers. PostgreSQL has NOTIFY, but Redis pub/sub is more mature and scales better for high-throughput messaging scenarios.

Leaderboards And Sorted Sets: Redis ZADD and ZRANGE operations on sorted sets are purpose-built for ranking scenarios. PostgreSQL can do this with window functions, but Redis does it more elegantly.
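
For comparison, the window function version looks roughly like this; the scores table and its player_id and score columns are hypothetical, not something from our schema.

SELECT player_id,
       score,
       rank() OVER (ORDER BY score DESC) AS position
FROM scores
ORDER BY score DESC
LIMIT 10;

It works, and an index on score keeps it reasonably fast, but Redis gives you incremental updates and rank lookups without writing any of it.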

Distributed Locks: The SETNX pattern for distributed locking is battle-tested and reliable. This is infrastructure, not caching.

We still run Redis in production. We use it for session storage and rate limiting. We just stopped using it as a query cache.

Our architecture now looks like this:

┌─────────────────────────────────────────────────────────┐
│                      Application                         │
└─────────────────┬──────────────────────┬────────────────┘
                  │                      │
                  ▼                      ▼
         ┌───────────────┐      ┌───────────────┐
         │    Redis      │      │  Postgres 18  │
         │  (sessions,   │      │  (everything  │
         │  rate limits) │      │   else)       │
         └───────────────┘      └───────────────┘

Simpler. Cheaper. Fewer incidents. Fewer dashboards. Fewer things to explain to new team members.

The Uncomfortable Question About Your Architecture

Here is what kept me awake the night after I deleted those 847 lines of code:

How many engineering decisions in my codebase exist because "that is how senior engineers do it" rather than "we measured and this is faster"?

I added Redis caching on day one of this project because that is what experienced developers do. We cache by default. We treat the database as slow until proven otherwise. We build complexity because complexity signals sophistication.

For three years, I maintained infrastructure that solved a problem I never actually measured.

The PostgreSQL team spent years building Asynchronous I/O. They rewrote fundamental parts of how the database communicates with storage. They integrated with cutting-edge Linux kernel features. They ran thousands of benchmarks and fixed hundreds of edge cases.

And all of that work made my 847 lines of caching code look like what it always was: premature optimization based on assumptions I never validated.

When This Will Not Work For You

I am not going to pretend this approach works for everyone. Let me be specific about the constraints.

Latency Requirements: If your P99 response time requirement is under 2 milliseconds, PostgreSQL 18 alone might not get you there. For true sub-millisecond guarantees, you still need in-memory solutions.

Traffic Volume: If you are serving millions of requests per second to identical hot keys, Redis is still the right tool. The connection overhead alone would overwhelm PostgreSQL at that scale.

Kernel Version: io_uring support requires Linux kernel 5.10 or newer. If you are running on older infrastructure, you will not get the full benefits of Asynchronous I/O.

Dataset Size: If your working dataset does not fit in RAM and you are I/O bound on spinning disks, async I/O helps but it is not magic. SSDs are still essential for this approach.

Cloud Provider Constraints: Some managed PostgreSQL offerings do not yet support io_uring configuration. Check with your provider before planning a migration.
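
One query tells you where you stand, managed or not. On anything older than 18, pg_settings simply has no io_method row, and on a managed instance the returned value shows what your provider actually allows.

SELECT version();

SELECT name, setting
FROM pg_settings
WHERE name IN ('io_method', 'effective_io_concurrency', 'io_combine_limit');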

Be honest with yourself about your actual requirements. Measure your actual latencies. Look at your actual hit rates.

The Question That Matters

Every codebase has decisions that were made for good reasons at the time but have outlived their usefulness. Every architecture has complexity that exists because someone assumed it was necessary rather than proving it was necessary.

PostgreSQL 18 forced me to confront one of those assumptions in my own work. The cache layer I defended in code reviews, documented in wikis, and explained to junior developers was solving a problem that stopped existing the moment the PostgreSQL team shipped better I/O handling.

I do not know what assumptions are hiding in your codebase. I do not know which complexity is earning its keep and which is legacy from a different era. I do not know if your Redis cache is essential or optional.

But I know that measuring is the only way to find out.

I deleted 847 lines of code last Tuesday because I finally measured.

Maybe it is time you measured too.