
© 2026 Ayush Sharma. Built with care.


Project Glasswing and the Open-Weights Problem

Anthropic locked Claude Mythos behind a $100M vetted-access program. Meanwhile, DeepSeek V4 is on Hugging Face for anyone to download. The policy and the threat model point in different directions.

May 11, 2026 · 9 min read

Three weeks ago I tried to get access to Claude Mythos. I work on security tooling. I have concrete defensive use cases. I wrote a clear proposal. Anthropic said no.

My company is not on the list. The list has Amazon, Apple, Google, Microsoft, Nvidia, CrowdStrike, Palo Alto Networks, Broadcom, Cisco, JPMorgan Chase, the Linux Foundation, and roughly 40 others. Vetted partners in Project Glasswing. About $100 million in usage credits committed. A closed group authorized to access a model Anthropic describes as too dangerous for public release.

The day I got the rejection, DeepSeek V4 was sitting on Hugging Face. Anyone with a GPU and a Hugging Face account could download it. The V4 Flash API is public at $0.14 per million input tokens. No application process. No vetting. No $100 million prerequisite.

This is the problem with the current policy response to Mythos. Not that the access restriction is wrong. The model is genuinely alarming, and Anthropic's caution is real. The problem is that the restriction addresses a threat model that no longer maps cleanly to the actual threat landscape.

What Mythos can do

The capability at issue is specific. Mythos is not dangerous because it writes better marketing copy or passes the bar exam. It is dangerous because it can find software vulnerabilities at scale, faster and more thoroughly than any prior tool.

Anthropic published the preview announcement in early April. The numbers are concrete. Mythos found thousands of high-severity vulnerabilities during internal red-teaming, including critical flaws in every major operating system and web browser. The UK AI Security Institute ran it against a 32-step adversarial cyber range called "The Last Ones," a challenge designed to be unsolvable end-to-end by AI. Mythos completed it 3 times out of 10. No prior model had completed it at all.

That benchmark matters because "The Last Ones" requires chaining multi-step offensive actions: reconnaissance, privilege escalation, lateral movement, exfiltration. It is not a question-answering task with a cyber flavor. It is a simulation of how a real intrusion unfolds. A model that can solve it even 30% of the time changes the economics of offensive security in a way that is hard to overstate.
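To make "changes the economics" concrete, here is a back-of-the-envelope sketch. The 30% success rate is the reported benchmark figure; the per-attempt compute cost is my own illustrative assumption, not a measured number.

```python
# Back-of-the-envelope: expected attempts and cost for an autonomous agent
# with a 30% end-to-end success rate (the reported 3/10 benchmark figure).
# The per-attempt cost below is an ILLUSTRATIVE ASSUMPTION.

def expected_attempts(p_success: float) -> float:
    """Expected independent attempts until first success (geometric distribution)."""
    return 1.0 / p_success

def expected_cost(p_success: float, cost_per_attempt: float) -> float:
    """Expected spend to land one successful end-to-end campaign."""
    return expected_attempts(p_success) * cost_per_attempt

p = 0.3        # 3/10 completions on "The Last Ones"
cost = 50.0    # assumed compute cost per attempt, USD

print(f"expected attempts: {expected_attempts(p):.2f}")    # ~3.33
print(f"expected cost:     ${expected_cost(p, cost):.2f}")  # ~$166.67
```

Even if the real per-attempt cost is 100x my assumed figure, a successful autonomous campaign lands in the tens of thousands of dollars: noise for any of the adversaries discussed below.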

The researchers who tested it, per the Foreign Policy writeup, described it as crossing a threshold from "AI that assists human attackers" to "AI that can execute significant portions of a campaign autonomously." That is the capability Anthropic chose not to release publicly. I understand why.

The policy response

The White House took notice. By May 5, the Trump administration was actively considering mandatory pre-release federal review for frontier AI models, with Mythos cited as the proximate cause. The proposed process would run through the Center for AI Standards and Innovation at the Department of Commerce, creating a review window before any frontier model reaches the public.

Google, Microsoft, and xAI announced the same week that they would share unreleased models with the government voluntarily, framed as a partnership on cybersecurity risk. The policy machinery is moving in one direction: more oversight, earlier in the development cycle, for the major American labs.

Policy analyst Dean Ball called it accurately: the US has created an informal AI licensing regime without passing a law. Not through legislation but through norms, voluntary commitments, and regulatory pressure on the labs that operate transparently within American jurisdiction.

The logic of the regime is coherent on its own terms. If the most capable model in the world can find critical zero-days autonomously, you would prefer that the first wave of access goes to defenders, not attackers. The vetted-access model tries to enforce that preference. The government review process tries to validate the safety posture before broader release. These are not stupid ideas.

Where the model breaks

Here is the part I keep returning to.

On April 24, DeepSeek released V4. Open weights. Hugging Face. Simon Willison has a clear breakdown: the Pro variant benchmarks within a few points of GPT-5.5 on coding and reasoning tasks. Not at the ceiling, but close. The Flash variant costs $0.14 per million input tokens via API. There is no access application.
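To put that price in perspective, a rough sketch of what it costs to push an entire codebase through the API at the public rate. The $0.14 figure is from the post; the codebase size and tokens-per-line ratio are my own assumptions for illustration.

```python
# Rough input cost of feeding a codebase through the V4 Flash API at the
# public $0.14 per million input tokens rate. Lines-of-code and
# tokens-per-line figures are ASSUMPTIONS for illustration.

PRICE_PER_MILLION_INPUT_TOKENS = 0.14  # USD, public V4 Flash rate

def input_cost(lines_of_code: int, tokens_per_line: float = 10.0) -> float:
    """Dollar cost to read a codebase once as input tokens."""
    tokens = lines_of_code * tokens_per_line
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS

# An assumed 5M-line codebase, roughly the scale of a large
# open-source project, read once end to end:
print(f"${input_cost(5_000_000):.2f}")  # $7.00
```

Seven dollars, under these assumptions, to read every line of a major project. The access question stops being about cost entirely.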

DeepSeek V4 is not Mythos. But the question is not whether they are identical. The question is whether the marginal capability between V4 and Mythos is the deciding factor in who can execute a serious attack. For most attack scenarios, it is not. A capable, motivated adversary (a nation-state security agency, a well-funded criminal organization, a skilled independent researcher with the wrong intentions) does not need the top 0.5% of capability. They need the top 5%.

Project Glasswing restricts access to the top 0.5% for vetted partners. It does nothing about the top 5% that is already downloadable.

The threat model of the proposed policy assumes the primary risk is an American lab releasing a dangerous model to the public. That was a reasonable assumption two years ago, when the largest Western labs controlled the capability ceiling and open-weights models lagged by a year or more. The assumption does not hold in May 2026. The Chinese open-weights ecosystem (DeepSeek, Qwen, Kimi, MiniMax) has compressed that lag to roughly three to six months on most benchmarks. And the models are public.

A pre-release federal review process applied to American frontier labs will catch Google and Anthropic and OpenAI. It will not catch anything running on a downloaded checkpoint fine-tuned by someone outside US jurisdiction. The policy taxes the transparent, compliant actors while leaving untouched the actors who were never going to pay it.

Who gets left out

There is a secondary problem that gets less coverage. When Anthropic restricts Mythos to roughly 50 large organizations, it is not only restricting attackers. It is restricting defenders.

Security researchers at universities, independent firms, small boutique red teams, individual engineers doing defensive work on critical infrastructure: none of them have a path to Project Glasswing. The capability gap between what Mythos can find and what any of them can find using publicly available tooling is real. Some of those researchers are working on the same vulnerabilities Mythos would find automatically. They will be slower.

There is a reasonable response to this: you cannot give everyone access to a model that can find critical zero-days, because some of those people will use it offensively. True. The issue is that the vetting criteria for Glasswing were not drawn around "who will use this defensively." They were drawn around "who is a major technology company or institution." Those are correlated, but not the same thing.

JPMorgan Chase is on the list. A CISA-funded university security lab doing open research on critical infrastructure vulnerabilities is not. The organizational size criterion was probably chosen for tractability: large organizations are easier to vet and hold accountable. The security benefit of that choice is less clear.

What a coherent policy might look like

I do not have a clean answer. The capability Mythos represents is genuinely new and I would not pretend otherwise.

But a few things seem directionally correct.

The pre-release review process the White House is considering should acknowledge its scope explicitly. It will reduce risk from compliant Western labs. It will not reduce risk from open-weights models, from Chinese labs operating outside US jurisdiction, or from actors running fine-tuned checkpoints on private hardware. If it gets sold as a broad AI safety policy, it is overselling itself. If it gets sold as "we want to know what the major American labs are building before it ships," that is narrower and more defensible.

The Glasswing vetting criteria should be revisited. The barrier to entry should be about demonstrated defensive use, accountability for misuse, and technical capacity, not primarily about company size or brand recognition. That is harder to administer. It is more accurate.

The deeper problem is that "restrict the top model to vetted partners" is not a durable strategy when the capability is proliferating in open weights. The policy goal should be ensuring defenders get the capability first, not ensuring attackers do not get it at all. The second goal is probably already out of reach. Pursuing it as the primary goal misallocates the effort.

What I am uncertain about

Mythos may have capabilities that meaningfully exceed what V4 or anything else can do today, and those marginal capabilities may be the decisive ones for the worst-case scenarios. The "cyber range solved 3/10 times" benchmark is reported; I cannot validate its precise structure or what the counterfactual looks like with a near-frontier open-weights model. And the argument that even imperfect containment buys time for defensive deployment is not without merit. Buying time has value.

I am also uncertain about the pace of open-weights convergence. If Mythos represents a genuine capability cliff rather than a steep hill, the gap between it and V4 may be more meaningful than the benchmark numbers suggest. Autonomous end-to-end exploitation is a different beast from benchmark performance on coding tasks. It is possible that Anthropic has found a threshold that open-weights models will not cross for another year or two, and that threshold is the one that matters.

But even granting that: the policy does not track that threshold. It tracks organizational size and nationality. Those are not the same thing as the capability gap Anthropic is trying to contain. A policy designed to gate on capability would look different. It would need to be dynamic, updating as the open-weights frontier moves. It would need to acknowledge when the restriction no longer provides meaningful protection. The current proposal has none of that architecture.

What I am skeptical of is the framing that has emerged in the last two weeks: that the pre-release review process and the Glasswing model together constitute a meaningful security posture against the class of threat Mythos represents. They may delay the worst-case scenario by months. They do not address the structural issue, which is that frontier offensive AI capability is now available on Hugging Face.

The question defenders should be sitting with is not "will we get access to Mythos eventually?" It is: given that our adversaries may already have near-equivalent capability and we do not, what do we build?

That is a harder question. The current policy is not designed to answer it.


