Every CVE Unlocks Another One

Most people read a CVE as an ending. A bug was found, a patch shipped, the door is closed, move on.

I read them as the opposite. A published security advisory is one of the most generous documents in software. It names the vulnerable file, explains the root cause in plain language, and shows you the exact fix. And very often, without meaning to, it tells you where to look next.

That is the double edge of a CVE. The same advisory that protects users is a map for whoever reads it next, and that reader can be a defender, a researcher, or an attacker. The information does not care which one you are.

I wrote earlier about two SSRF CVEs I found because two different projects shared one root cause. This post is the technique one level up: finding new vulnerabilities by mining the patches for old ones. The case study is a humble download manager called pyLoad, which handed me two CVEs for the price of reading about ten others: CVE-2026-42313 and CVE-2026-42312.

Earlier post: "How I Found My First Two SSRF CVEs: From zero security experience to two high-severity findings using Claude Code"

In this post:

Why a wall of CVEs is a signal, not a warning sign
The method: patch archaeology in three questions
The one pyLoad function that got patched, and patched, and patched
The two bugs the other fixes left in the margins
How these two changed the way I pick targets
Where the method runs out
Why patch archaeology keeps working
The real fix: root cause analysis

CVEs cluster, and that is a signal

A question I like to ask people who pick software by its security record: a project has zero published CVEs. Is it safe, or has nobody looked?

Usually it is the second one. A CVE is not evidence that a project is bad. It is evidence that somebody competent spent real time staring at it. Untouched code has zero CVEs for the same reason an unexplored cave has no map.

The useful part is that CVEs cluster. Once a project is on a researcher's radar, more people pile in and the count climbs fast.

pyLoad is a textbook cluster: an open-source download manager, the kind of self-hosted tool that runs on a home server or a small team's box with more than one user account. In 2026 it collected north of a dozen CVEs, from host header spoofing to a reconnect-script RCE to SSRF to command injection. A wall of them is not a "stay away" sign. It is an invitation.

The method: patch archaeology

When I open a cluster, I am not reading the advisories to learn what was fixed. I am reading them to learn how the maintainer thinks, and where their attention stops. Each advisory answers three things, and I turn each into a question of my own.

What was the actual root cause? Not the symptom. "SSRF in the download handler" is a symptom. "They hand-rolled an IP validator instead of using a library" is a cause. Causes repeat.
Did the fix address the class, or just this one instance? A fix that says "we validated this entire layer" closes a class. A fix that says "we added widget_x to the blocklist" closes one instance, and tells you a blocklist exists.
What siblings did the patch miss? If the bug was a missing check on parameter A, go find B, C, and D that reach the same dangerous place. The maintainer fixed the one a reporter handed them. The rest are still sitting there.

This is not my invention. Security people call it variant analysis: using one known bug as a seed to find its siblings. In a writeup for Semgrep, Eugene Lim splits it into three shapes worth keeping in your head: a variant (the same flawed pattern living somewhere else in the code), an insufficient patch (a fix that closed the reporter's exact case but not the root cause), and a regression (a later change quietly stripping out a mitigation so an old bug comes back).

pyLoad was the middle one, an insufficient patch, five times over. My SSRF post was the first shape, a variant: the same pattern showing up in a second codebase. The third, a regression, is the one that will eventually undo pyLoad's structural fix if nobody is watching.

The idea behind all three is the same: the cheapest bug to find is the one next to a bug somebody already found.

The way I actually run it is to take the root cause from the advisory, write a Semgrep rule that matches that exact pattern, then loosen the rule step by step to catch the variants the original wording does not. Semgrep floods you with hits, so I hand the results to Claude Code to triage, dropping the false positives and flagging the few that smell like the original bug.
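The tighten-then-loosen loop is easier to see in miniature. What follows is not Semgrep and not pyLoad code: it is a plain-Python stand-in built on the standard ast module, run against an invented sample in which write_option() and read_option() are hypothetical names. It starts from the exact option an advisory might name, then widens the rule twice.

```python
import ast

# Invented sample standing in for a target codebase. Not pyLoad source;
# write_option() and read_option() are hypothetical names.
SAMPLE = '''
def save_script(cfg, value):
    cfg.write_option("reconnect", "script", value)

def save_interval(cfg, value):
    cfg.write_option("reconnect", "interval", value)

def save_proxy_host(cfg, value):
    cfg.write_option("proxy", "host", value)

def show_proxy_host(cfg):
    return cfg.read_option("proxy", "host")
'''


def option_writes(source):
    """Return (section, option, line) for each write_option("...", "...", ...) call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr == "write_option"):
            continue
        # Keep only calls whose first two arguments are string literals.
        literals = [
            arg.value
            for arg in node.args[:2]
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str)
        ]
        if len(literals) == 2:
            hits.append((literals[0], literals[1], node.lineno))
    return sorted(hits, key=lambda hit: hit[2])


# Three rules, each looser than the last.
RULES = [
    ("exact option from the advisory", lambda s, o: (s, o) == ("reconnect", "script")),
    ("any option in the same section", lambda s, o: s == "reconnect"),
    ("any config write at all", lambda s, o: True),
]

HITS = option_writes(SAMPLE)
for label, rule in RULES:
    matched = [(s, o) for s, o, _line in HITS if rule(s, o)]
    print(f"{label}: {matched}")
```

Each loosening step adds noise along with the new hits, which is why the triage pass matters: the last rule matches every write, and only some of those reach anything dangerous.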

Call it patch archaeology: reading the dig notes of the person before you and noticing the corner they never excavated.

The one pyLoad function that would not stay fixed

First, the permission model, because the bug abuses it. pyLoad is multi-user. Accounts have roles, and roles carry permission bits. One of those bits is SETTINGS: the right to change the program's configuration.

In a multi-user deployment you hand SETTINGS to a non-admin so they can manage download and UI options without being a full admin. That is the whole point of the role.

The function that writes config, set_config_value(), only requires the SETTINGS bit. The catch is that some settings are far more dangerous than UI options: where downloads get written, what script runs on reconnect, which TLS certs the web UI uses.

So pyLoad adds a second gate, a hand-typed list called ADMIN_ONLY_CORE_OPTIONS. If an option is on the list, you need admin. If it is not, any SETTINGS user can change it. That list is the only thing standing between a non-admin and admin-level control of the config.

ADMIN_ONLY_CORE_OPTIONS = {
    ("general", "storage_folder"),
    ("log", "syslog_host"), ("log", "syslog_port"),
    ("proxy", "password"), ("proxy", "username"),
    ("reconnect", "script"),
    ("webui", "host"),
    ("webui", "ssl_certfile"), ("webui", "ssl_keyfile"), ("webui", "ssl_certchain"),
    ("webui", "use_ssl"),
}

Read that twice, because the whole cluster lives in it. The security of this function depends on a human remembering to add every dangerous option to a list, by hand, forever. They did not.
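To make the shape of that dependency concrete, here is a minimal sketch of the two-gate check as I have described it. It is a reconstruction for illustration, not pyLoad's actual code: may_change() and its parameters are hypothetical, and I am assuming an admin passes both gates.

```python
# The list as quoted above.
ADMIN_ONLY_CORE_OPTIONS = {
    ("general", "storage_folder"),
    ("log", "syslog_host"), ("log", "syslog_port"),
    ("proxy", "password"), ("proxy", "username"),
    ("reconnect", "script"),
    ("webui", "host"),
    ("webui", "ssl_certfile"), ("webui", "ssl_keyfile"), ("webui", "ssl_certchain"),
    ("webui", "use_ssl"),
}


def may_change(is_admin: bool, has_settings: bool, section: str, option: str) -> bool:
    """Hypothetical reconstruction of the check described in the text."""
    if is_admin:
        return True  # assumption: admins pass both gates
    if not has_settings:
        return False  # first gate: the SETTINGS permission bit
    # Second gate: a non-admin is refused only for options somebody listed.
    return (section, option) not in ADMIN_ONLY_CORE_OPTIONS


# A listed option is refused for a SETTINGS-only user.
print(may_change(False, True, "reconnect", "script"))    # False
# Any option nobody remembered to list is allowed by default.
print(may_change(False, True, "example", "new_option"))  # True
```

The last line of the function is the whole problem: the default answer is yes, so every new sensitive option is exposed until someone remembers to list it.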

The CVE history reads like a slow-motion confession: one fix adds storage_folder, another the reconnect script, another the SSL cert options, each one the same sentence with a different noun: "we forgot to put option X on the list". Nobody asked the obvious question. If you keep forgetting options, the list is the bug.

That gap is what I went looking in.

The two CVEs hiding in the margins

I went through every option that reaches something dangerous and checked it against the list. Two were not there. In both, the attacker is an ordinary SETTINGS user with no admin rights, doing nothing but changing settings they are technically allowed to touch.

CVE-2026-42313: proxy hijack (High, 8.3)

The list protected proxy.username and proxy.password, but left proxy.enabled, proxy.host, proxy.port, and proxy.type open. So a SETTINGS user can point those four at a server they own and flip proxying on.

From that moment every outbound request pyLoad makes (downloads, update checks, captcha fetches) tunnels through the attacker, who reads the URLs, cookies, and auth tokens inside them and can rewrite the responses, feeding pyLoad a poisoned download or a malicious "update".

The detail I love: gating only the proxy credentials is worse than useless. They exist so an admin can authenticate to a trusted corporate proxy. But when the attacker chooses the proxy, the attacker is the proxy, and does not need pyLoad's password to reach their own server.

The fix guarded the lock and left the door off its hinges. Security people call this a confused deputy. I call it gating the receipt and ignoring the cash.

You can read the full breakdown on the official NVD advisory page for CVE-2026-42313.

CVE-2026-42312: TLS verification bypass (Medium)

The option ssl_verify was also missing. The same SETTINGS user can switch it off, and pyLoad stops checking TLS certificates entirely, accepting a forged cert for any site it fetches.

On its own that needs an attacker already sitting on the network path between pyLoad and its targets. But chain it with the proxy hijack and the attacker becomes that path: turn on the attacker's proxy, turn off cert verification, and a full man-in-the-middle now works against HTTPS too, forged certs waved through for every hostname. Two settings a non-admin should never have reached, combined into total interception of encrypted traffic.

Same function, same root cause as the fixes before mine, same fix shape. I did not find a clever new bug class. I found the eleventh and twelfth entries in a list the project had been extending one CVE at a time, and I found them by reading the first ten.
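The check behind both findings is small enough to sketch. The candidate list below is my own audit notes, not anything pyLoad ships, and the text above never names the section ssl_verify lives in, so "general" is an assumption. One option that is on the list is included as a control.

```python
# The list as quoted earlier in the post.
ADMIN_ONLY_CORE_OPTIONS = {
    ("general", "storage_folder"),
    ("log", "syslog_host"), ("log", "syslog_port"),
    ("proxy", "password"), ("proxy", "username"),
    ("reconnect", "script"),
    ("webui", "host"),
    ("webui", "ssl_certfile"), ("webui", "ssl_keyfile"), ("webui", "ssl_certchain"),
    ("webui", "use_ssl"),
}

# Options that reach something dangerous, collected by hand while reading the code.
CANDIDATES = [
    ("proxy", "enabled"),
    ("proxy", "host"),
    ("proxy", "port"),
    ("proxy", "type"),
    ("general", "ssl_verify"),  # section name is an assumption
    ("proxy", "password"),      # control: this one is gated
]

# Anything not on the list is writable by a SETTINGS user.
ungated = [opt for opt in CANDIDATES if opt not in ADMIN_ONLY_CORE_OPTIONS]
for section, option in ungated:
    print(f"{section}.{option} is writable with SETTINGS alone")
```

The hard part is not the set difference. It is building the candidate list, which means reading what each option actually does when it changes.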

The real fix is structural: stop maintaining the list by hand and move an admin_only flag into the config schema, so every option declares its own sensitivity where it is defined. You cannot leave an option off a list that does not exist.

You can read the full breakdown on the official NVD advisory page for CVE-2026-42312.
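One way the structural fix described above could look, as a rough sketch rather than pyLoad's real schema: the Option class, the sample entries, and their defaults are all hypothetical. The point is that admin_only has no default value, so an option cannot be defined without stating its sensitivity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    section: str
    name: str
    default: object
    admin_only: bool  # no default value: every definition has to state it


# Hypothetical schema entries; the defaults are placeholders.
SCHEMA = {
    (opt.section, opt.name): opt
    for opt in [
        Option("proxy", "host", "", admin_only=True),
        Option("proxy", "password", "", admin_only=True),
        Option("example", "ui_option", "default", admin_only=False),
    ]
}


def may_change(is_admin: bool, section: str, name: str) -> bool:
    """Assumes the caller already holds SETTINGS; unknown options are refused."""
    opt = SCHEMA.get((section, name))
    if opt is None:
        return False
    return is_admin or not opt.admin_only


# Forgetting the flag now fails where the option is defined,
# not months later in an advisory.
try:
    Option("proxy", "port", 0)  # sensitivity not declared
except TypeError as exc:
    print("rejected at definition time:", type(exc).__name__)
```

A stricter variant would default admin_only to True so that forgetting fails closed. Making the field required is the choice I prefer here, because it forces the decision to be made on purpose.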

How these two changed the way I pick targets

For my first CVEs I did the obvious thing: pick an interesting tool, read its code, look for a soft spot. It works, but it is slow and it is a lottery. pyLoad changed two things.

First, I stopped starting from a repo and started from a patch. A project that already has CVEs has done three favors for me before I arrive: it proved the code has the kind of bug I hunt, it showed me how the maintainer thinks about fixes, and it handed me root causes to find variants of. A patched repo is a repo with a map in the lobby.

Second, a project with CVEs is a project where someone will actually answer. This sounds cynical and it is simply true. The hardest part of disclosure on the long tail is not finding the bug, it is getting a human to respond.

I have sent careful reports with working PoCs into the void on random repos and gotten nothing back: no advisory, no reply, no CVE. A project that already publishes advisories has a maintainer who picks up the phone, a Security tab that is wired up, and a process that ends in a credited CVE. The existing advisories are a signal about the bug and about the person.

pyLoad is the within-a-repo version of this: one project, one function, a string of fixes, and the variants they missed. The same logic works across repos, which is what my SSRF post was, one root cause found first in one project and then in a second that wrote almost the identical broken function. Either way you start from the same place: a bug somebody already found, written down in public.

Where this method runs out

I oversold it for most of this post, so here is the honest other side.

It is derivative by design. You only ever find variants of bugs someone already found, never a new class. The genuinely novel work still takes fresh eyes on untouched code. This is the cleanup crew, not the explorer.

It is a race, and the map is public. The advisory that pointed me at set_config_value() points everyone else there too. Clusters get mined out: once someone does the full variant pass, or the maintainer ships a structural fix, the easy siblings are gone, and a late report comes back marked duplicate.

It rewards shallow bugs. The variants you can spot in a diff are the pattern-matchable ones, a missing entry or one more filter bypass. Deep logic flaws and multi-step chains leave no clean signature in a patch, so the method walks right past them.

It needs a trail, and good projects hide it. Plenty of advisories say nothing useful ("fixed a security issue in 1.2.3"), and a project that fixed its first bug structurally leaves no cluster to mine. Worse, some maintainers bury the real fix inside a large refactor commit or keep the advisory deliberately vague, exactly to deny variant and n-day hunters the map. The better-run the project, the less this works.

Why patch archaeology keeps working

pyLoad is not an outlier. It is just unusually honest, because the bug and the fix are each a single line in a list, so you can watch the pattern play out in the open. Most projects patch the same way. You just cannot see it as clearly.

When a maintainer gets a security report, the incentive is to ship the smallest change that stops this one reporter's case from reproducing, then close the issue. It is faster, easier to review, and makes the problem go away today. It also leaves every sibling of the bug exactly where it was, and publishes an advisory pointing at the neighborhood.

As Lim points out, a researcher is usually chasing one working exploit, not cataloguing every variant, so they report the one that fires and move on. The siblings get abandoned at both ends, by a finder who was not looking for them and a maintainer who did not widen the fix.

So incomplete fixes are routine, the predictable output of how security bugs get patched. The CVE feed is a standing supply of half-closed bugs.

And this is about to get worse, because we are starting to let AI do the patching. Hand a model an advisory and tell it to fix the bug, and it is very good at exactly the wrong thing: it reads the one failing case, writes the one line that makes it stop, and closes the issue. It rarely steps back to ask whether the other proxy options, or the next config flag, share the same flaw. It was asked to fix a bug, not to redesign a blocklist.

Worse, the patch looks thorough. It arrives with a confident explanation and a passing test, which is exactly what gets a tired maintainer to hit merge, and an incomplete fix that reads like a complete one is more dangerous because nobody goes back to check it. AI is great at the narrow, pattern-shaped work of matching a known bug, and weak at the design judgment of killing its whole class, so autofix industrializes the half-fix: more advisories, shipped faster, each one a fresh map. The supply patch archaeology feeds on does not shrink. It grows.

The five maintainers who patched set_config_value() before me each made it a little safer, and each left the next variant sitting there for the sixth person to read about. I was the sixth.

The real fix: root cause analysis

So before you close your next security bug, ask the uncomfortable question. Did I fix the class, or did I just add an entry to a list?

The answer that actually works is root cause analysis, and the practical version is simple. Treat the report as one symptom, find the underlying pattern, then go looking for every other place that pattern lives in your code before you ship.

That is the variant analysis a researcher runs against you, turned inward. If the pyLoad maintainers had done it after the first missed option, they would have stopped adding nouns to the list and asked why the list is hand-maintained at all, which is the question that leads straight to the structural fix from earlier. Run the variant pass on your own patch and there is nothing left in the margins for me to find.

Here is the honest limit. Root cause analysis only protects the repo it runs in. It kills in-repo variants and insufficient patches, because one team owns all the code and can sweep it. It does nothing for the cross-repo case. When the same root cause lives in a different project, as the hand-rolled IP filter from my SSRF post did, no single maintainer can see it, because nobody owns the whole ecosystem. That class only closes when the root cause becomes shared knowledge: a writeup, a Semgrep rule people actually run, a library that replaces the thing everyone keeps getting wrong. Until then, cross-repo variants are nobody's job, which is why they keep becoming somebody's CVE.

There is a level in between, though, and it is the one I spend my days on. Between a single repo and the whole open ecosystem sits your organization, where someone does own all the code. Full disclosure, this is one of the problems we are working on at Apiiro: treating a finding as a pattern to chase across an organization, not a single bug to close. The open-source long tail stays a free-for-all. Your own org does not have to be.

When you see a project with a wall of CVEs, how do you read it: "avoid this", or "someone is finally looking, and there is more in there"? I have come around hard to the second one.
