Your Software Has a Supply Chain — Most Teams Have No Idea How Long

The xz-utils backdoor, SolarWinds, Log4Shell, and event-stream all rhyme. The defenses are simpler than the news cycle suggests.

On March 29, 2024, a Microsoft engineer named Andres Freund noticed that SSH logins on his Debian system were 500 milliseconds slower than they should be. He could have ignored it. Instead, he traced the latency through liblzma, through the xz-utils compression library, and into a backdoor that had been planted by a maintainer who spent two and a half years earning the project's trust before inserting it.

CVE-2024-3094 was rated CVSS 10.0. It targeted OpenSSH. The attacker, operating under the name "Jia Tan," had been a contributor since October 2021 and rose to co-maintainer status with the help of sock-puppet accounts that pressured the original maintainer with feature requests and stability complaints. The backdoor shipped in xz-utils 5.6.0 in February 2024 and, in revised form, in 5.6.1 in March. Freund caught it within weeks of release. If he hadn't, it would have rolled into the stable releases of Debian, Ubuntu, Fedora, and most other major Linux distributions.

This is the high-water mark of software supply chain attacks. It is not the first, and it will not be the last.

The Pattern

Every major software supply chain incident in the last decade has fit one of three patterns. Recognizing the pattern is the first step toward defending against it.

Pattern 1: Dependency confusion

The attack: register a public package with the same name as one a target uses from a private or alternate registry. Package managers prefer higher version numbers, so the malicious public version gets installed instead of the private one.

Real incident: In December 2022, an attacker registered a package named torchtriton on PyPI, shadowing the torchtriton dependency that PyTorch shipped from its own nightly index. Anyone who installed the PyTorch nightly build via pip on Linux between December 25 and December 30 got the malicious version. The package ran a binary that read /etc/passwd, /etc/hosts, the first thousand files in the user's home directory, and the contents of ~/.gitconfig and ~/.ssh/*, then exfiltrated them to an attacker-controlled domain. The malicious package was downloaded 2,717 times before the issue was contained, 2,500 of those on December 26 alone.

This is the same class of attack as Alex Birsan's 2021 research, in which similar techniques reached internal builds at Apple, Microsoft, PayPal, and 32 other companies through the public npm and PyPI registries.
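The version-preference behavior that dependency confusion exploits can be sketched in a few lines. This is a simplified model, not any real package manager's resolver; the package name, versions, and index labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple[int, ...]  # e.g. (1, 4, 0)
    index: str                # which registry offered this candidate


def naive_resolve(candidates: list[Candidate]) -> Candidate:
    """Pick the highest version, ignoring which index it came from."""
    return max(candidates, key=lambda c: c.version)


def scoped_resolve(candidates: list[Candidate], trusted_index: str) -> Candidate:
    """Only consider candidates from the index this package is expected to come from."""
    trusted = [c for c in candidates if c.index == trusted_index]
    if not trusted:
        raise LookupError("no candidate from the trusted index")
    return max(trusted, key=lambda c: c.version)


# Hypothetical package name, versions, and index labels, for illustration only.
offers = [
    Candidate("internal-utils", (1, 4, 0), "private"),
    Candidate("internal-utils", (99, 0, 0), "public"),  # attacker-registered
]

print(naive_resolve(offers).index)              # public
print(scoped_resolve(offers, "private").index)  # private
```

The defensive version does nothing clever: it just refuses to let a registry the package was never supposed to come from take part in the choice.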

Pattern 2: Compromised maintainer

The attack: take over an existing package that already has trust, then ship malicious code as a routine update.

Real incidents:

event-stream (November 2018). A popular npm package downloaded roughly 2 million times per week. The original maintainer, no longer interested in the project, transferred it to a stranger ("right9ctrl") who quickly added a malicious dependency targeting the Copay bitcoin wallet. The malicious code ran in production for more than two months before disclosure.
xz-utils (CVE-2024-3094, March 2024). Described above. A 2.5-year social engineering campaign, sock-puppet harassment of the original maintainer, gradual privilege escalation, and finally a sophisticated IFUNC-based backdoor hidden in test fixtures and build scripts. The target was chosen deliberately: a compression library that nearly every Linux distribution depends on and that is transitively linked into OpenSSH.
tj-actions/changed-files (March 2025). A popular GitHub Action used by tens of thousands of repositories was compromised. The malicious version dumped CI secrets to public logs. The blast radius was every repo that referenced the action by a mutable tag rather than pinning it to a commit SHA.

Pattern 3: Compromised build system

The attack: don't bother with the source code. Compromise the build pipeline so that the binary that ships is different from the source that exists.

Real incident: SolarWinds Orion (December 2020). Nation-state actors compromised the build system of SolarWinds and injected the SUNBURST malware into the Orion network management product as part of legitimate signed builds. Roughly 18,000 organizations installed the trojanized version through normal software updates. The source code was clean. The build was not. Among the confirmed victims: parts of the U.S. Treasury, Commerce, State, and Homeland Security departments.

This is the most dangerous pattern because every conventional verification check — code review, source scanning, signed releases — still passes. The fix requires reproducible builds and build provenance attestations, which most projects still don't have.

What Works

After a decade of these incidents, there is a hierarchy of mitigations ranked by what actually moves the needle:

1. Pin everything by hash, not by version

Versions are mutable. A package==1.2.3 line in your requirements file does not guarantee you get the same package next month. Maintainers can re-publish, registries can be compromised, and even legitimate "patch" releases can introduce changes you didn't audit.

Hashes are immutable. If you pin package==1.2.3 --hash=sha256:abc..., your package manager will refuse to install anything that doesn't match. The torchtriton attack would have been ineffective against a hash-pinned environment.

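bash

# Python: refuse any package whose hash is not listed in requirements.txt
pip install --require-hashes -r requirements.txt
# Node: commit package-lock.json, then install exactly what it records
npm ci
# Go: commit go.sum; downloads are verified against it
go mod download
# Rust: commit Cargo.lock, then fail the build if it is out of date
cargo build --locked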

Every modern language has this capability. Most teams don't use it.
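What a hash pin actually checks can be sketched as follows. This is a minimal illustration of the idea, not pip's implementation; the artifact bytes are stand-ins for a downloaded wheel or tarball.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> None:
    """Raise if the downloaded bytes do not match the pinned digest."""
    actual = sha256_hex(data)
    if actual != pinned_sha256:
        raise ValueError(f"hash mismatch: expected {pinned_sha256}, got {actual}")


# Hypothetical artifact contents standing in for a downloaded package file.
audited = b"contents of the release you reviewed"
pinned = sha256_hex(audited)  # recorded in the lockfile at review time

verify_artifact(audited, pinned)  # passes silently

tampered = b"contents of a release published under the same version"
try:
    verify_artifact(tampered, pinned)
except ValueError:
    print("refused to install: digest does not match the pin")
```

The version string never enters the check. Only the bytes do, which is why a swapped artifact fails even when its name and version look identical.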

2. Pin GitHub Actions and CI dependencies to commit SHAs

The tj-actions compromise hit repositories that had written:

yaml

- uses: tj-actions/changed-files@v44   # mutable tag

It did not hit repositories that had written:

yaml

- uses: tj-actions/changed-files@a284dc1814e4fe5fc6df59c4f1cbc7b1c7f0b3c2   # immutable SHA

The fix takes five minutes per repo. Dependabot will even open PRs to keep the SHAs updated. The reason most teams haven't done it: tags read like versions and feel safer than they are.
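Finding the unpinned references is easy to automate. The sketch below scans workflow text line by line with a regular expression rather than a YAML parser, so treat it as a rough first pass; the function name and sample workflow are illustrative.

```python
import re

# Matches "uses: owner/repo[/path]@ref", with an optional leading "- ".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for each reference not pinned to a full commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        action, ref = match.groups()
        if action.startswith("docker://"):
            continue  # container image references follow different rules
        if not FULL_SHA_RE.match(ref):
            findings.append((lineno, action, ref))
    return findings


sample = """\
steps:
  - uses: tj-actions/changed-files@v44
  - uses: tj-actions/changed-files@a284dc1814e4fe5fc6df59c4f1cbc7b1c7f0b3c2
"""

for lineno, action, ref in unpinned_actions(sample):
    print(f"line {lineno}: {action}@{ref} is not pinned to a commit SHA")
```

Run over every file in .github/workflows, a check like this can fail CI whenever someone adds a tag-based reference.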

3. Treat build systems as production infrastructure

Your CI/CD pipeline has access to all of your secrets and all of your code, and it is the last system that signs your binaries before customers run them. It is the most security-sensitive system in your company, and most teams treat it as a place to dump Bash scripts.

Concrete steps that close most of the SolarWinds attack surface:

Run builds in ephemeral, single-use containers — no persistent build agents
Restrict the network egress from build hosts (an Orion-style attacker needs to exfiltrate compiled binaries)
Sign release artifacts from a separate signing infrastructure with hardware-backed keys
Generate SBOM (Software Bill of Materials) and SLSA provenance attestations as part of every release
Make builds reproducible: if the same source produces a byte-different binary, something between source and artifact is unaccountable
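The reproducibility check in the last step can be sketched as a digest comparison between two independent builds of the same commit. The directory layout and file names below are hypothetical, and a real pipeline would compare the outputs of two separate build hosts.

```python
import hashlib
import tempfile
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_builds(build_a: Path, build_b: Path) -> list[str]:
    """Return relative paths that are missing from one build or differ in content."""
    a, b = digest_tree(build_a), digest_tree(build_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Demonstration with two hypothetical build output directories.
with tempfile.TemporaryDirectory() as tmp:
    first, second = Path(tmp, "build1"), Path(tmp, "build2")
    for out in (first, second):
        out.mkdir()
        (out / "app.bin").write_bytes(b"identical compiled bytes")
    (first / "stamp.txt").write_text("built 2024-01-01")
    (second / "stamp.txt").write_text("built 2024-01-02")  # embedded timestamp differs

    print(diff_builds(first, second))  # ['stamp.txt']
```

An empty result is the goal. Anything else is either benign nondeterminism, such as embedded timestamps, that should be removed from the build, or something that needs an explanation.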

4. Sandbox third-party code at runtime, not just at build time

If your application loads user-supplied plugins, third-party model files, untrusted serialized data, or executes any code your team did not author, sandbox it. Process isolation, container boundaries, restricted filesystems, no network access by default. The attack still runs — but it runs in a place where it cannot reach your AWS credentials or your SSH key.

This is not paranoid. Most modern frameworks have a deserialization-leads-to-RCE class of vulnerability somewhere in them. Treat untrusted input as untrusted code, even when the language insists it is data.
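One narrow layer of this can be sketched with the standard library alone: run the untrusted code in a child process that gets a scratch directory and an allowlisted environment instead of inheriting yours. This is deliberately not a full sandbox. The child can still read the filesystem and open network connections unless a container, seccomp profile, or similar boundary stops it. The variable name DEMO_SECRET and the allowlist are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Environment variables the child may inherit; everything else is dropped.
ENV_ALLOWLIST = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def run_untrusted(code: str, timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child process with a scrubbed environment and a scratch cwd."""
    env = {key: os.environ[key] for key in ENV_ALLOWLIST if key in os.environ}
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores PYTHON* variables
            env=env,
            cwd=scratch,
            capture_output=True,
            text=True,
            timeout=timeout,
        )


# A hypothetical credential in the parent's environment.
os.environ["DEMO_SECRET"] = "hypothetical-credential"

probe = "import os; print(os.environ.get('DEMO_SECRET'))"
result = run_untrusted(probe)
print(result.stdout.strip())  # None
```

The probe still runs, but the credential it went looking for was never handed to it. Real isolation adds filesystem and network restrictions on top of this.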

5. Read your dependency tree before you add to it

This is the unglamorous one. Most teams have no idea what they actually depend on. A standard React app pulls in 800+ transitive npm packages. A standard Python ML project pulls in 200+. Each is a maintainer who could be compromised, a build system that could be subverted, a registry that could be abused.

Tools that help: pipdeptree, npm ls, cargo tree, osv-scanner, trivy fs. Run them. Read the output. Be uncomfortable.

You probably do not need to audit all 800 packages. But you should know which 20 do something privileged — execute shell commands, write to filesystems, make outbound network calls, parse untrusted input — and you should pay closer attention to those.
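A first pass at finding those privileged packages can be automated. The sketch below parses Python source and reports imports of modules that imply shell, network, or deserialization capability. The module list is an illustrative assumption rather than a complete policy, and static import scanning will miss dynamic imports and native code.

```python
import ast

# Standard-library modules that suggest privileged behavior.
# This list is an illustrative assumption, not an exhaustive policy.
PRIVILEGED_MODULES = {"subprocess", "socket", "ctypes", "pickle", "urllib", "http"}


def privileged_imports(source: str) -> set[str]:
    """Return the privileged top-level modules imported anywhere in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in PRIVILEGED_MODULES:
                found.add(top_level)
    return found


sample = """
import json
import subprocess
from urllib.request import urlopen
"""

print(sorted(privileged_imports(sample)))  # ['subprocess', 'urllib']
```

Run over each dependency's installed source files, this gives a rough shortlist of packages that deserve the closer read.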

Why This Keeps Happening

Every incident gets a postmortem. Every postmortem reaches the same conclusion: the tools to prevent it existed, the team didn't use them, and the next team won't either.

The reason is not negligence. It is that supply chain security has the worst possible economic profile for software teams:

The cost is concrete and immediate (CI gets slower, dependency updates get harder, developers complain)
The benefit is statistical and deferred (you might be the team that doesn't get owned)

This is the same profile as backups, monitoring, and disaster recovery — categories where most teams underinvest until the first major incident. Supply chain security is going through that same arc, except the attacks scale across the entire ecosystem instead of hitting one company at a time.

The teams that have done the work — pinned by hash, locked their build systems down, generated SBOMs, audited their dependency trees — are not the teams that got lucky. They are the teams that stopped treating the supply chain as someone else's problem.

The xz-utils backdoor was caught because one engineer cared about a 500-millisecond SSH login delay. The next one might not be.

Further reading: NIST's Secure Software Development Framework (SSDF), the SLSA (Supply-chain Levels for Software Artifacts) framework, and Andres Freund's original disclosure of CVE-2024-3094.
