Anthropic’s Mythos Test Shows Why AI Cyber Defense Is Becoming Classified Work

An Anthropic Mythos test with U.S. intelligence agencies reportedly found vulnerabilities in highly sensitive government systems within hours. The episode sharpens the policy problem around frontier AI: the same models that can help defenders fix critical software can also compress the timeline for attackers.

An Anthropic AI model identified vulnerabilities in highly sensitive U.S. government computer systems during a controlled test with intelligence agencies, according to an Associated Press report published by Federal News Network. The model, Anthropic’s Mythos, found the flaws within hours, though the official who described the exercise stressed that identifying vulnerabilities was not the same as exploiting them.

The test gives a more concrete shape to the argument that has surrounded Anthropic’s restricted cyber-capable models for weeks. Washington is not just debating abstract AI risk. It is trying to decide who should get access to models that can accelerate vulnerability discovery inside sensitive networks, how those models should be supervised, and whether restricting them weakens defenders more than it slows adversaries.

What The Test Reportedly Found

The AP report says Anthropic worked with U.S. intelligence agencies under Project Glasswing, the company’s defensive-security initiative for using Claude Mythos Preview against critical software. A U.S. official, speaking anonymously because of the sensitivity of the matter, said Mythos identified vulnerabilities in secure government systems within hours. The National Security Agency and Anthropic declined to comment to the AP.

Sen. Mark Warner had alluded to the exercise during a June 11 Senate Banking Committee hearing, saying the tool “broke into almost all of our classified systems” in hours rather than weeks and attributing that account to Gen. Joshua Rudd, who leads the NSA and U.S. Cyber Command. That phrasing is dramatic, but the later AP account is narrower: the model identified flaws during an authorized test, and the report does not establish that Mythos autonomously carried out real-world exploitation across classified networks.

That distinction matters because it separates two different risks. One is discovery speed: a model that can rapidly find weak points in hardened systems changes the economics of vulnerability research. The other is operational autonomy: a model that can chain discovery, exploit development, access, persistence, and lateral movement would raise a different and more severe set of deployment questions. The public reporting clearly supports the first concern. It presents the second as a policy fear, not as something this test confirmed.

Why Project Glasswing Matters

Anthropic announced Project Glasswing in April with partners including AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The company described Mythos Preview as an unreleased frontier model that had already found high-severity vulnerabilities in major operating systems, web browsers, and other important software. Anthropic committed up to $100 million in usage credits for the program and $4 million in donations to open-source security groups.

The technical point is not simply that Mythos can read code. Security teams already use static analysis, fuzzing, dependency scanners, and human review. The difference, according to companies testing these systems, is that cyber-focused frontier models can reason across several small flaws, write proof-of-concept code, and help determine whether a suspected bug is reachable enough to matter.

Cloudflare’s Project Glasswing write-up described Mythos Preview as a step change in exploit-chain construction and proof generation. The company said the model was more useful when placed inside a structured vulnerability-discovery harness rather than pointed at a repository as a generic coding agent. Its process split work into reconnaissance, narrow parallel hunts, validation, gap-filling, deduplication, reachability tracing, feedback, and structured reporting.

That workflow detail is the useful lesson for readers outside national-security circles. The strongest AI vulnerability work is not a chatbot session where someone asks a model to “find bugs.” It is closer to an automated research pipeline: many narrow tasks, separate agents checking each other’s claims, compile-and-run proof environments, reachability analysis, and human-controlled triage before fixes move into production.
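To make the pipeline idea concrete, here is a minimal Python sketch of just the final stages: deduplication, validation, and the hand-off to human triage. It is an illustration under stated assumptions, not Cloudflare's or Anthropic's implementation. The `Finding` fields and both function names are hypothetical, and the `proof_passed` and `reachable` flags are assumed to come from earlier stages (a sandboxed proof run and a reachability trace) that are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One suspected vulnerability reported by a narrow hunt (hypothetical schema)."""
    component: str      # file or service where the suspected bug lives
    bug_class: str      # e.g. "command-injection"
    proof_passed: bool  # assumed: a proof-of-concept reproduced in a sandbox
    reachable: bool     # assumed: tracing found a path from external input


def deduplicate(findings):
    """Collapse reports of the same bug class in the same component.

    When several hunts report the same issue, keep the report with the
    strongest evidence (proof first, then reachability).
    """
    best = {}
    for f in findings:
        key = (f.component, f.bug_class)
        current = best.get(key)
        evidence = (f.proof_passed, f.reachable)
        if current is None or evidence > (current.proof_passed, current.reachable):
            best[key] = f
    return list(best.values())


def triage_queue(findings):
    """Return only findings that are both proven and reachable.

    Everything returned here still goes to a human reviewer; nothing in
    this sketch applies a fix automatically.
    """
    return [f for f in deduplicate(findings) if f.proof_passed and f.reachable]


if __name__ == "__main__":
    reports = [
        Finding("auth/login.py", "command-injection", False, True),
        Finding("auth/login.py", "command-injection", True, True),
        Finding("billing/export.py", "deserialization", True, False),
    ]
    for finding in triage_queue(reports):
        print(finding.component, finding.bug_class)
```

The design choice worth noticing is that speculative reports never reach a person. Only findings with a reproduced proof and a traced path from external input enter the queue, which is what separates this kind of workflow from a chatbot session.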

The Defense Problem Is Access, Not Just Capability

The reported classified-systems test lands in the middle of a live access fight. The Trump administration issued a directive earlier in June requiring Anthropic to block foreign nationals from using Fable 5 and Mythos 5, citing national-security concerns. Anthropic said it disabled the models for all customers to comply, while arguing that the government’s response was not warranted by the specific issue it had flagged.

Security leaders have pushed back against broad restrictions. The AP report notes that more than 100 cybersecurity experts and executives asked the administration to lift the directive, arguing that removing strong defensive tools could leave U.S. organizations at a disadvantage while other models continue to advance. Their argument is not that Mythos-class systems are harmless. It is that defenders need controlled access before attackers get similar capabilities through other frontier or open models.

Palo Alto Networks made a similar practical point in a May update from its own frontier-AI security testing. The company said it had used models including Mythos, Claude Opus 4.7, and OpenAI’s GPT-5.5-Cyber under trusted-access programs, and that most of its May Patch Wednesday findings came from frontier AI scanning. Its warning was blunt: organizations may have only a short window to fix exposure before AI-driven exploit discovery becomes a normal attacker capability.

That creates an uncomfortable policy tradeoff. Restrict access too broadly, and defensive teams lose time. Release access too loosely, and the same techniques that help Cloudflare, Palo Alto Networks, open-source maintainers, and government agencies could help less responsible actors find usable attack paths. The classified-system test makes that tradeoff harder to discuss as a generic AI-safety argument. It puts the question inside the systems that governments most want to protect.

What Security Teams Should Take From It

The immediate lesson is not that every company needs a Mythos-class model tomorrow. Most organizations cannot safely or effectively use one without the surrounding process: asset inventory, code ownership, build environments, test coverage, exploit validation rules, emergency patch paths, and humans who can decide whether a finding is real and worth acting on.

The more practical takeaway is that vulnerability management is becoming a speed problem and a reachability problem at the same time. If AI systems can produce better leads faster, then old queues of “medium” findings, unowned legacy services, exposed management interfaces, and slow release trains become easier for attackers to turn into working paths. The useful defensive response is not just faster patching. It is reducing what is reachable, segmenting systems so one flaw cannot become broad access, and making critical fixes deployable without bypassing regression testing.
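One way to picture the "speed and reachability" point is a backlog ranking that weighs exposure alongside severity. The Python sketch below is illustrative only: the `BacklogItem` fields, the severity scale, and the numeric weights are all assumptions chosen for the example, not a published scoring standard or any vendor's method.

```python
from dataclasses import dataclass

# Hypothetical severity scale; real programs often start from CVSS scores.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class BacklogItem:
    name: str
    severity: str             # one of the SEVERITY keys
    internet_reachable: bool  # is the affected service exposed externally?
    has_owner: bool           # does a team own the code and its fixes?


def priority(item):
    """Score an item so that reachable, unowned findings rise in the queue."""
    score = SEVERITY[item.severity]
    if item.internet_reachable:
        score += 3  # illustrative weight: exposure outranks one severity step
    if not item.has_owner:
        score += 1  # unowned services tend to stay unpatched longer
    return score


def rank(backlog):
    """Return backlog items ordered from most to least urgent."""
    return sorted(backlog, key=priority, reverse=True)


if __name__ == "__main__":
    backlog = [
        BacklogItem("internal-batch-job", "high", False, True),
        BacklogItem("legacy-admin-panel", "medium", True, False),
        BacklogItem("public-api", "critical", True, True),
    ]
    for item in rank(backlog):
        print(priority(item), item.name)
```

With these example weights, an exposed and unowned "medium" finding outranks an unexposed "high" one, which is the shift in priorities the paragraph above describes.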

For software teams, the Cloudflare-style harness model also points to a future where security testing becomes more continuous and more specialized. Instead of waiting for a quarterly penetration test or a scanner report, teams may run targeted AI-assisted reviews around command injection, memory safety, authentication boundaries, deserialization, dependency exposure, and code paths that touch sensitive data. Findings with proof code and reachability analysis will matter more than long lists of speculative issues.

The Mythos test does not prove that AI has made classified systems indefensible. It shows that defensive cyber work is moving into a phase where the best tools may be powerful enough to require national-security handling. The hard part is building an access model that lets trusted defenders use that speed before adversaries do, without turning vulnerability discovery into a broadly available weapon.
