No alert. No angry Slack thread. No spike in latency.
The graphs moved a little. Then they settled lower than before.
That was the moment I realised something very simple.
Postgres 18 was doing a better job than an entire cache tier that I had spent years defending.
This is the story of how that happened, what changed in Postgres, and when you can safely make the same move without wrecking your p95.
The night Redis screamed louder than users
For years our stack looked like every "serious" backend diagram on the internet.
  Clients
     |
     v
+---------+       +--------+
|   API   +------>| Redis  |
+---------+       +--------+
     |                |
     v                v
  +----------------------+
  |      PostgreSQL      |
  +----------------------+

Read path:
- Check Redis
- If miss, hit Postgres
- Write back to Redis
It felt professional. We used the right buzzwords. We tuned TTLs and eviction policies. We gave talks about "caching strategy."
Then one evening during a sale event, Redis became the bottleneck.
- CPU at 90 percent
- Network at the limit
- Latency spikes that had nothing to do with the database
We scaled the cache tier horizontally.
We tweaked maxmemory and eviction.
The database sat there bored.
That was the first time the architecture felt wrong in my stomach. We were paying money and complexity to slow ourselves down.
What changed with Postgres 18
Until Postgres 18, my default instinct was "protect the database at all costs." Caches were shields.
Postgres 18 flipped that mental model for our workload.
The release brought three things that mattered directly to us:
- A new asynchronous I/O subsystem that lets Postgres queue and combine read operations, which delivered up to three times faster sequential and bitmap heap scans in many benchmarks (see the configuration sketch after this list)
- Smarter index usage, including skip scans on multicolumn B-tree indexes, so more queries actually use the indexes we already paid for with storage and maintenance
- Virtual generated columns, which let us keep some "derived" data inside Postgres without heavy write penalties
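The async I/O behaviour is driven by server settings rather than query changes. A minimal sketch of what we toggled in staging, assuming a Linux build with io_uring support (values are illustrative, not a recommendation):

-- Async I/O is controlled at the server level in Postgres 18.
ALTER SYSTEM SET io_method = 'io_uring';  -- 'worker' is the default; takes effect after a restart
ALTER SYSTEM SET io_workers = 4;          -- only used when io_method = 'worker'
SELECT pg_reload_conf();                  -- io_workers can be picked up on reload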
Together, that meant one very dangerous idea became realistic.
What if the database was fast enough to serve our hot paths directly, with no Redis in front at all?
The experiment that changed my mind
I did not start by deleting anything. I started with a dark launch.
We picked one of our most painful cached queries.
- Endpoint: customer dashboard
- Pattern: user opens dashboard several times a day
- Old behaviour: fetch profile, last orders, and account flags, then stuff the result in Redis for a few minutes
First, we simplified the read in Postgres 18.
We used a multicolumn index and let skip scans do the heavy work.
CREATE INDEX CONCURRENTLY idx_orders_customer_status_date
    ON orders (customer_id, status, created_at DESC);

-- Dashboard query (simplified)
SELECT o.id,
       o.total_amount,
       o.status,
       o.created_at
FROM orders o
WHERE o.customer_id = $1
  AND o.status IN ('PAID', 'SHIPPED')
ORDER BY o.created_at DESC
LIMIT 20;

Before Postgres 18, the planner often gave us a plan that felt half committed to the index.
With skip scans and the other planner changes in 18, the plan finally matched our intention.
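If you want to see the same shift on your own schema, the plan output is the quickest proof. A sketch of how to check it, using a literal id in place of $1:

EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total_amount, o.status, o.created_at
FROM orders o
WHERE o.customer_id = 12345
  AND o.status IN ('PAID', 'SHIPPED')
ORDER BY o.created_at DESC
LIMIT 20;
-- You want an Index Scan using idx_orders_customer_status_date here.
-- A Seq Scan plus an explicit Sort means the planner is ignoring the index.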
Then we turned on async I/O in a staging cluster and pointed a small percentage of traffic directly at Postgres, bypassing Redis.
We ran a simple head-to-head benchmark.
Workload: 1K concurrent users, 90 percent reads, 10 percent writes
Data: ~15M orders, ~2M customers
                     p95 latency   Cache hit rate   Notes
Redis + Postgres     72 ms         91%              Occasional cache stampede
Postgres 18 only     54 ms         100%             No external cache

The number that mattered to me was not the average. It was the fact that our p95 was lower without Redis than with it.
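Your numbers will differ, so treat ours as a shape, not a promise. A rough way to reproduce the read side with pgbench, using a hypothetical dashboard.sql script that holds the hot query:

-- dashboard.sql: one hot-path read with a random customer id
\set cid random(1, 2000000)
SELECT o.id, o.total_amount, o.status, o.created_at
FROM orders o
WHERE o.customer_id = :cid
  AND o.status IN ('PAID', 'SHIPPED')
ORDER BY o.created_at DESC
LIMIT 20;
-- Run with something like (1K clients assumes a pooler or a high max_connections):
--   pgbench -n -c 1000 -j 16 -T 300 -P 10 -f dashboard.sql yourdb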
Suddenly the question was no longer "Can the database handle it?" The question was "Why are we still carrying this cache tier?"
The day we drew a simpler diagram
We did not delete Redis in one step. We removed it in slices.
The diagram slowly changed into something that made more sense.
  Clients
     |
     v
+---------+
|   API   |
+---------+
     |
     v
+---------------+
|  PostgreSQL   |
+---------------+

Inside Postgres, we used features that older versions did not have:
- Virtual generated columns for a few heavy derived fields so that the read side became a single query instead of "query plus compute plus cache."
- Better JSON support, including JSON_TABLE (introduced in Postgres 17), for flexible reporting on semi-structured data without an extra cache-backed projection layer; a sketch follows this list.
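To make the JSON point concrete, here is a sketch of JSON_TABLE, assuming a hypothetical events table with a jsonb payload column:

-- Flatten a jsonb array into rows, no projection layer in front.
SELECT e.id, jt.sku, jt.qty
FROM events e,
     JSON_TABLE(
         e.payload, '$.items[*]'
         COLUMNS (
             sku TEXT    PATH '$.sku',
             qty INTEGER PATH '$.qty'
         )
     ) AS jt;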
Here is the virtual generated column that made a critical read cheaper, with the "caching" happening at the database level itself.
CREATE TABLE invoices (
    id           BIGSERIAL PRIMARY KEY,
    customer_id  BIGINT NOT NULL,
    amount_cents BIGINT NOT NULL,
    tax_rate     NUMERIC(4,2) NOT NULL,  -- a fraction, e.g. 0.08 for 8 percent
    total_cents  BIGINT GENERATED ALWAYS AS (
        ROUND(amount_cents * (1 + tax_rate))::BIGINT
    ) VIRTUAL  -- computed on read, the new default for generated columns in 18
);

-- Dashboard reads total_cents directly, no extra compute, no external cache.
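A quick sanity check on the invoices table above shows the derived value arriving with no trigger and no application code:

INSERT INTO invoices (customer_id, amount_cents, tax_rate)
VALUES (42, 10000, 0.08);

SELECT total_cents  -- 10800, computed when the row is read
FROM invoices
WHERE customer_id = 42;

Once we trusted the behaviour under load, we did something that felt scary and good at the same time.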
We pointed the dashboard traffic away from Redis in production. We watched the graphs.
- Redis QPS fell, then flatlined for that route
- Database read I/O went up, but CPU stayed healthy
- User-facing latency improved by a small but real amount
No fire. No pager.
Only one big question. Where else could we do this?
When you should not delete your cache tier
I am not going to pretend that everyone can remove Redis after a version upgrade. There are clear cases where a cache tier still earns its keep.
- Wildly spiky read patterns where even async I/O cannot keep the disks ahead of demand
- Global counters and write-heavy workloads that hammer the same rows
- Cross-service caching where the database is not the only source of truth
If you cannot survive a database restart without upsetting the business, a cache layer can still be a valuable safety belt.
The point is not "Redis is dead." The point is "Stop shipping cache tiers by habit when Postgres 18 might already be faster and simpler for your actual traffic."
What this means for you as a backend engineer
The biggest win from this journey was not a few milliseconds of latency.
The biggest win was cognitive.
- One source of truth for data and derived fields
- Fewer moving parts during incidents
- Fewer places where stale data could hide
From a career point of view, the important signal to your manager is simple.
You are not the person who blindly adds Redis because every blog diagram has an orange box in the middle.
You are the person who understands the database well enough to know when it can carry the load alone, and who can prove that with numbers.
Postgres 18 gave us the tools. The decision to delete a cache tier was still a human decision, based on real traffic, real benchmarks, and a willingness to simplify.
That is the kind of decision that gets remembered long after everyone forgets which version of Redis you were running.