Hello Everyone,

For those who don't know me, I do a lot of research on cache poisoning, and I've developed the HExHTTP script, among other things.

In this article, I'm going to share a little discovery I made while testing CPDoS. Maybe others have already noticed or tested it, but I did a bit of research on my own and didn't see anyone else mentioning it (sorry if some of you wanted to keep it to yourselves, haha).

The Discovery

In short, during my various technical tests of cache poisoning denial-of-service (CPDoS) attacks, I began testing different payloads, focusing primarily on unusual or misinterpreted headers.

But in the course of this process, I stumbled upon something unexpected. It wasn't a malformed header or an obvious edge case.

It was something much more common: the Accept header.

At one point, I sent a simple request like this:

GET /?cb=toto HTTP/1.1
Host: target.com
Accept: application/json

Instead of returning an HTML page, the server responded in JSON format.

At first glance, this seems normal: it's exactly what the header is supposed to do.

But upon closer inspection, I noticed that the JSON response was being cached, so anyone who then visited "/?cb=toto" received the page in JSON format, whatever their own Accept header said.

That's when it started to get interesting.

The Accept header is meant to express a client preference. In practice, however, many modern applications treat it as a decision trigger.

Frameworks like Laravel, Symfony, Express.js, or Django often adapt responses automatically:

  • HTML for browser-like requests
  • JSON for API-like requests

This means a single endpoint can have multiple representations, depending solely on headers.
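
To make the idea concrete, here is a minimal sketch of header-driven format switching. It is not how any of the frameworks above implement negotiation: `pick_format` and `handle_page` are hypothetical names, q-values are ignored, and the first recognized media type simply wins.

```python
def pick_format(accept_header: str) -> str:
    """Return 'json' or 'html' from a naive reading of the Accept header."""
    # Reduce "type/subtype;q=0.9, ..." to bare media types (q-values ignored).
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
        if part.strip()
    ]
    for media_type in media_types:
        if media_type == "application/json":
            return "json"
        if media_type in ("text/html", "*/*"):
            return "html"
    return "html"  # default when nothing recognizable was sent


def handle_page(accept_header: str) -> tuple[str, str]:
    """Hypothetical handler: one route, two representations."""
    if pick_format(accept_header) == "json":
        return "application/json", '{"title": "Home", "items": []}'
    return "text/html", "<html><body><h1>Home</h1></body></html>"


content_type, body = handle_page("application/json")
print(content_type)  # application/json
```

The same URL produces two different bodies, and nothing in the URL tells a cache which one it is holding.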

In practice, this led me to observe patterns like:

  • a standard user-facing HTML page
  • a raw JSON structure when switching Accept

In some cases, it felt like accessing an internal or undocumented API just by tweaking a header.

This alone is already useful for:

  • understanding backend structures
  • identifying differences in logic
  • spotting inconsistencies between frontend and API behavior

But the real issue appears when caching comes into play.

Modern infrastructures heavily rely on caching layers such as Cloudflare, Akamai, or Fastly.

These systems build cache keys based on request attributes, but not always all of them. Here's a simplified version of what can happen:

1. Attacker request

GET /page
Accept: application/json

The server returns JSON, which gets cached.

2. Victim request

GET /page
Accept: text/html

Instead of HTML, the cache serves… the JSON response.
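
The two requests above can be replayed against a toy model. This is only a simulation under simple assumptions: `origin` and `NaiveCache` are hypothetical stand-ins, and the cache key is the path alone, which is the misconfiguration being illustrated rather than the default behavior of any particular CDN.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[str, str]  # (content_type, body)


def origin(path: str, accept: str) -> Response:
    """Hypothetical origin that switches format on the Accept header."""
    if "application/json" in accept:
        return "application/json", '{"page": "%s"}' % path
    return "text/html", "<html><body>%s</body></html>" % path


class NaiveCache:
    """Toy cache whose key is the path only; Accept is ignored."""

    def __init__(self, backend: Callable[[str, str], Response]) -> None:
        self.backend = backend
        self.store: Dict[str, Response] = {}

    def get(self, path: str, accept: str) -> Response:
        if path not in self.store:      # cache miss: ask the origin
            self.store[path] = self.backend(path, accept)
        return self.store[path]         # cache hit: Accept is never consulted


cache = NaiveCache(origin)
attacker = cache.get("/page", "application/json")  # 1. seeds the entry
victim = cache.get("/page", "text/html")           # 2. receives the same entry
print(victim[0])  # application/json
```

The victim asked for HTML and still got the attacker's JSON, because the only thing the cache compared was the path.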

The result can be surprisingly disruptive:

  • broken rendering
  • frontend crashes
  • blank pages

And if the cache has a long TTL, every user gets affected.

What makes this particularly interesting is that the request is completely valid: no malformed payload is needed, and the behavior comes from expected HTTP features.

I also noticed that in some applications, it wasn't necessarily "JSON": other values such as "javascript" or "text/plain" were accepted as well.

Expanding the Approach

After noticing this, I started testing other headers with similar intent:

  • Accept-Language
  • Accept-Encoding
  • Content-Type (even in GET requests)
  • X-Requested-With
  • etc…

Some of these led to subtle but meaningful differences in how responses were generated or cached.
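
One way to organize this kind of testing is sketched below. It is an illustration, not the logic of any existing tool: `find_poisonable`, the `CANDIDATES` values, and the injected `fetch` callable are all assumptions, and `base_url` is assumed to have no query string of its own. Each candidate gets its own cache buster, the cache is seeded with the header, and the same URL is then requested without it.

```python
from typing import Callable, Dict, List, Tuple

# fetch(url, headers) -> (content_type, body). It is injected so the logic
# can be exercised against a fake before pointing it at a real target.
Fetch = Callable[[str, Dict[str, str]], Tuple[str, str]]

# Example values only; the headers come from the list above.
CANDIDATES: List[Tuple[str, str]] = [
    ("Accept", "application/json"),
    ("Accept", "text/plain"),
    ("Accept-Language", "fr"),
    ("Accept-Encoding", "identity"),
    ("Content-Type", "application/json"),
    ("X-Requested-With", "XMLHttpRequest"),
]


def find_poisonable(base_url: str, fetch: Fetch) -> List[Tuple[str, str]]:
    """Return the (header, value) pairs whose effect survives a clean replay."""
    hits = []
    for index, (name, value) in enumerate(CANDIDATES):
        # A fresh cache buster per probe keeps probes isolated from each
        # other and from the URLs real users request.
        baseline = fetch(f"{base_url}?cb=base{index}", {})
        probe_url = f"{base_url}?cb=probe{index}"
        fetch(probe_url, {name: value})   # try to seed the cache
        replay = fetch(probe_url, {})     # same URL, no special header
        # Exact comparison is a simplification: pages with dynamic content
        # would need a looser check (Content-Type, length, key markers).
        if replay != baseline:
            hits.append((name, value))
    return hits
```

A header only counts as a hit when the clean replay differs from the baseline, which separates "the application varies on this header" from "the cache also stored the variant".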

Root Cause

The core issue comes from a mismatch:

  • the application varies responses based on headers
  • the cache does not always take those headers into account

This misalignment creates an opportunity to poison cached responses.

Mitigations

To prevent this class of issues:

  • include relevant headers in the cache key
  • use proper response headers such as Vary: Accept
  • avoid serving multiple formats from the same endpoint
  • clearly separate frontend and API routes
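
The first two points amount to the same thing: the cache key must contain every request header the application varies on. A minimal sketch of such a key follows. `cache_key` is a hypothetical helper, and lowercasing header values is a simplification; real caches apply their own normalization, and support for Vary differs between providers, so their documentation is worth checking.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, Tuple[Tuple[str, str], ...]]


def cache_key(path: str, request_headers: Dict[str, str],
              vary: Iterable[str]) -> Key:
    """Build a key from the path plus every request header named in Vary."""
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {name.lower(): value.strip().lower()
               for name, value in request_headers.items()}
    varied = tuple(sorted(
        (name.lower(), lowered.get(name.lower(), "")) for name in vary
    ))
    return path, varied


json_key = cache_key("/page", {"Accept": "application/json"}, ["Accept"])
html_key = cache_key("/page", {"Accept": "text/html"}, ["Accept"])
print(json_key != html_key)  # True
```

With Accept in the key, the attacker's JSON and the victim's HTML land in separate cache entries. The cost is more entries per URL, since browsers send many different Accept values.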

Conclusion

What I find interesting about this discovery is how simple it is.

It didn't come from a complex exploit chain or a theoretical edge case, just from testing and observing how applications behave.

And it highlights something important:

standard web mechanisms can become powerful attack vectors when different layers don't handle them consistently.