Zach Anderson
Apr 10, 2026 23:18
Anthropic releases security guidance as Project Glasswing reveals frontier AI models can now find and exploit vulnerabilities faster than human defenders.
Anthropic dropped a sobering assessment this week: within two years, AI models will uncover large numbers of software vulnerabilities that have sat unnoticed in code for years, and chain them into working exploits. The company’s security teams released detailed defensive recommendations alongside Project Glasswing, their initiative to deploy Claude Mythos Preview’s capabilities for cyber defense.
The math here isn’t complicated. If attackers can use frontier models to automate vulnerability discovery and exploit generation, the window between a patch dropping and a working exploit appearing shrinks dramatically. Anthropic’s security engineers have watched this happen in their own testing.
What Their Research Actually Found
According to Anthropic’s technical findings, AI models excel at recognizing signatures of known vulnerabilities in unpatched systems. Reversing a patch into a working exploit, exactly the kind of mechanical analysis these models handle well, used to require specialized skills. Now it’s becoming automated.
The company noted that publicly available models below Mythos capability levels can already find serious vulnerabilities that traditional code reviews missed for extended periods. Mozilla Firefox vulnerabilities discovered through AI scanning serve as one documented example.
The Defensive Playbook
Anthropic’s recommendations prioritize controls that hold even against attackers with unlimited patience and AI assistance. Friction-based security measures (extra pivot hops, rate limits, non-standard ports) lose effectiveness when adversaries can grind through tedious steps automatically.
Their top priorities:
Patch speed matters more than ever. Internet-facing applications should receive patches within 24 hours of an exploit becoming available. The CISA Known Exploited Vulnerabilities catalog should be treated as an emergency queue. Anthropic recommends using EPSS (Exploit Prediction Scoring System) for prioritizing everything else.
Prepare for 10x vulnerability report volume. Over the next two years, intake and triage processes will face pressure they’ve never experienced. Organizations still running weekly spreadsheet meetings won’t keep pace.
Scan your own code with frontier models before attackers do. This was Anthropic’s single most emphasized recommendation. Legacy code that predates current review practices, especially code whose original authors have moved on, represents the highest-value target for proactive scanning.
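The patch-prioritization rule above (known-exploited entries first, EPSS for the rest) could be sketched roughly as follows. This is an illustration only, not Anthropic’s tooling: the Finding record, its field names, and the EXAMPLE identifiers are hypothetical, and real KEV membership and EPSS scores would come from the published feeds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one open vulnerability (illustrative fields)."""
    vuln_id: str   # placeholder identifier, not a real CVE
    in_kev: bool   # True if listed in the CISA Known Exploited Vulnerabilities catalog
    epss: float    # EPSS probability of exploitation, between 0 and 1


def triage_order(findings):
    """Return findings with KEV entries first, then by EPSS score, highest first."""
    # False sorts before True, so `not f.in_kev` puts KEV entries at the front.
    return sorted(findings, key=lambda f: (not f.in_kev, -f.epss))


if __name__ == "__main__":
    queue = triage_order([
        Finding("EXAMPLE-1", in_kev=False, epss=0.10),
        Finding("EXAMPLE-2", in_kev=True, epss=0.02),
        Finding("EXAMPLE-3", in_kev=False, epss=0.90),
    ])
    for f in queue:
        print(f.vuln_id, "KEV" if f.in_kev else f"EPSS={f.epss:.2f}")
```

Treating KEV membership as a strict override rather than one weighted factor mirrors the "emergency queue" framing: a low EPSS score does not demote something already being exploited.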
Zero Trust Gets Real
The guidance pushes hard toward hardware-bound credentials and identity-based service isolation. A compromised build server shouldn’t reach production databases. A compromised laptop shouldn’t touch build infrastructure.
Static API keys, embedded credentials, and shared service-account passwords are described as “among the first things an attacker with model-assisted code analysis will find.”
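To give a sense of how easy embedded credentials are to spot, here is a minimal sketch of a pattern-based check a team might run over its own source. The two patterns are illustrative assumptions, far from exhaustive (one matches the AKIA prefix format of AWS access key IDs, the other a quoted value assigned to a secret-like name), and the function name is hypothetical. Dedicated secret scanners cover much more.

```python
import re

# Illustrative heuristics only; real scanners use far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "assigned_secret": re.compile(
        r"(api_key|password|secret|token)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def find_hardcoded_secrets(text):
    """Return (line_number, pattern_label) pairs for lines that look like secrets."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, label))
    return hits


if __name__ == "__main__":
    sample = (
        'db_password = "hunter2-hunter2"\n'
        'region = "us-east-1"\n'
        'key = "AKIAABCDEFGHIJKLMNOP"\n'
    )
    for line_no, label in find_hardcoded_secrets(sample):
        print(f"line {line_no}: {label}")
```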
For Smaller Operations
Organizations without dedicated security teams got specific advice: enable automatic updates everywhere, prefer managed services over self-hosting, use passkeys or hardware security keys, and turn on free security tooling from code hosts like GitHub’s Dependabot and CodeQL.
Open-source maintainers should expect increased vulnerability report volume, some of it valuable and some of it automated noise. Publishing a SECURITY.md with clear intake processes helps separate signal from spam.
Anthropic committed to updating this guidance as Project Glasswing progresses. For enterprises tracking SOC 2 and ISO 27001 compliance, most recommendations map directly to existing controls. The difference now is urgency.
