5 Key Insights from Project Glasswing’s Initial Findings

Project Glasswing Findings | Claude Mythos Preview | AI Cybersecurity 2026

Read here for FREE

In one month, an AI model found over 10,000 high or critical severity security flaws across software that billions of people use every day.

That is not a prediction. That already happened. And understanding what it means requires stepping back from the headline number and looking at what the findings actually show.

Project Glasswing is Anthropic's coordinated effort to find and fix software vulnerabilities before bad actors can. The model doing the scanning is Claude Mythos Preview, an unreleased frontier model that Anthropic has restricted from public access. About 50 partner organizations including Microsoft, Apple, Google, Mozilla, Cloudflare, and others were given access, along with the task of scanning their own codebases and the open source software that underpins them.

Here are five things the initial results actually tell us.

1. The Numbers Are Hard to Dismiss

Over 10,000 high or critical severity vulnerabilities found in roughly a month. 6,202 were flagged as high or critical across more than 1,000 open source projects. Of those, 1,752 were independently assessed by six external security firms. Over 90% turned out to be real.

To put that in perspective: Cloudflare alone found 2,000 bugs, 400 of which were rated high or critical. Their team described the false positive rate as better than human testers.

Mozilla used Mythos to test Firefox and found 271 vulnerabilities. That is more than ten times what a previous Claude model found in an earlier version of the same browser.

Palo Alto Networks said the model accomplished the equivalent of a year's worth of penetration testing in under three weeks.

These are specific claims from companies with their own security teams. They have every reason to be skeptical.

2. It Does Not Just Find Bugs. It Chains Them.

This is the part that tends to get buried under the raw counts.

Finding a single bug in a codebase is useful. Finding four separate bugs and then combining them into a single exploit chain that bypasses both a browser's rendering sandbox and the operating system's own defenses is something different entirely.

That is what Mythos Preview demonstrated. Anthropic's red team documentation describes engineers with no formal security background asking the model to find remote code execution vulnerabilities overnight. They woke up to a working exploit.

One specific case: the model found a critical flaw in the wolfSSL cryptography library (now assigned CVE-2026–5194) and then built an exploit that could forge digital certificates. In practice, that means a fake bank website could appear completely legitimate to a user's browser. The vulnerability has since been patched.

The line between finding a bug and weaponizing it is now very thin.

A 27-year-old denial-of-service bug in OpenBSD. A 16-year-old out-of-bounds write in an FFmpeg video codec. A FreeBSD remote code execution flaw that granted full root access without authentication, assigned CVE-2026–4747, patched within 36 hours of discovery. These are not edge cases. These are foundational pieces of software that have been reviewed by human experts for decades.

3. Finding Bugs Is No Longer the Hard Part

This is probably the most underreported insight from the initial findings.

Before Glasswing, the main constraint in software security was time and human capacity. Finding a serious vulnerability in a mature codebase could take weeks. Now it takes hours. Or overnight.

The bottleneck has shifted entirely.

As of late May 2026:

530 high or critical severity vulnerabilities had been disclosed to open source maintainers
Only 75 had been patched
The average patch takes two weeks to develop and deploy
Some maintainers have asked Anthropic to slow down because the volume of incoming reports is overwhelming their capacity

That last point is worth sitting with. Anthropic built a private coordination portal for Glasswing partners to share findings and accelerate upstream patches. It helped. But it did not solve the underlying problem, which is that the organizations responsible for maintaining software that runs critical infrastructure are often small teams of volunteers running on limited budgets and no SLAs.

The discovery gap is closed. The remediation gap is very much open.

4. Most of the Findings Are Not Being Made Public Yet

Anthropic is operating under a 90-day coordinated vulnerability disclosure policy. That means findings stay private until patches are deployed, to prevent attackers from exploiting known flaws before a fix is available.

That is responsible and reasonable. It also means the public picture is incomplete.

As of the initial update, Anthropic said it planned to disclose another 827 high or critical severity bugs on top of the 530 already shared. With over 10,000 total findings and most still within their disclosure window, what is currently visible represents a small fraction of what was actually found.

This matters because the public CVE count, the advisories, and the patch notes only tell part of the story. When the 90-day windows close and full technical reports start coming out, the actual scale of the findings will likely look larger than the current numbers suggest.

5. The Model Is Not Available to the Public and That Is Not an Accident

Anthropic has been explicit about why.

Claude Mythos Preview is capable enough that Anthropic itself has said no existing safeguards are strong enough to prevent it from being misused if released publicly. A researcher with no security background used it to produce a working remote code execution exploit in a single session. That is not something you can put a few guardrails around and call it safe.

The current structure of Glasswing, a closed consortium of vetted organizations, coordinated disclosures, restricted access, is the product of that concern. It gives defenders a window to find and fix vulnerabilities before similar capabilities become available to everyone else.

Anthropic has said it plans to expand Glasswing to roughly 150 organizations across 15 or more countries, targeting power, water, healthcare, and communications infrastructure. Partners now include Okta, Samsung, SK Hynix, NATO, and the EU's cybersecurity agency ENISA. The company estimates a successful attack on these systems could affect over 100 million people.

Making Mythos-class models broadly available is still on the roadmap. But only after stronger safety measures are developed. For now, the access is controlled, and the disclosures are coordinated.

The thing that stands out across all five of these points is not that AI found a lot of bugs. It is that the infrastructure built around managing software vulnerabilities, the disclosure timelines, the patch cycles, the maintainer model, was designed for a world where finding bugs was the slow, expensive part. That constraint no longer exists. Everything downstream of discovery is now the constraint, and very little of it was built to move fast.

That is the actual story from Project Glasswing's first month.