Becoming Mythos-Ready: Your AI Security Program Needs a Readiness Test, Not a Patch Plan

TL;DR: In April 2026 the Cloud Security Alliance, with SANS, the OWASP GenAI Security Project, Bruce Schneier, and Jen Easterly, shipped a framework called "Building a Mythos-Ready Security Program." Its thesis is blunt: Anthropic restricted Claude Mythos to roughly 40 vetted organizations, but adversaries will have equivalent autonomous-exploit capability within 6 to 24 months, and the disclose-to-exploit window has already collapsed to machine speed. Mythos-ready means your program can survive an attacker who finds and weaponizes vulnerabilities on its own. Patching faster is necessary and insufficient. The only honest readiness test is to be attacked at machine speed first, in a controlled way. That is what autonomous AI red teaming is for, and below is the immediate, next-quarter, and 12-month cadence to get there.

The capability shift is real, and it is dated#

For the last decade, "an attacker found a bug" implied a human researcher with weeks to spend. That assumption is now wrong, and there is a public number on when it broke.

According to Anthropic's own red team assessment, Claude Mythos Preview went from a near-zero success rate at autonomous exploit development on the prior model to 181 successful autonomous exploit developments. It wrote working Firefox JavaScript shell exploits, built a FreeBSD NFS remote-code-execution attack that split a 20-gadget ROP chain across multiple packets, and bypassed KASLR and stack protection. Human validators agreed with its severity assessments in 89 percent of reviewed cases. Over 99 percent of the vulnerabilities it found had not been patched when the program was announced.

That is not an incremental improvement in fuzzing throughput. It is a qualitative change in who, or what, can turn a disclosed weakness into a working exploit. We covered the launch itself, and what the restricted-access decision signals, in our breakdown of what the Claude Mythos launch actually was. This post is the operational follow-up: not what Mythos is, but what you do now that it exists.

What "Mythos-ready" actually means#

The phrase is not ours. In April 2026 a coalition led by the Cloud Security Alliance, SANS, the [un]prompted community, and the OWASP GenAI Security Project published The AI Vulnerability Storm: Building a Mythos-Ready Security Program. The briefing was assembled over a single weekend by more than 60 named contributors, including Bruce Schneier, Jen Easterly, Chris Inglis, and Phil Venables, and reviewed by over 250 CISOs.

The framework's central claim is the one worth internalizing. Anthropic restricting Mythos to a defined set of vetted organizations buys a defensive window measured in months, not years. The CSA's working estimate is that adversaries reach equivalent autonomous-exploit capability within 6 to 24 months through other labs, open releases, or state-sponsored programs. Mythos-ready is the CSA's name for a program that can still function when that capability is in the hands of people targeting you.

A Mythos-ready program is not defined by a control you buy. It is defined by a question you can answer honestly: if an adversary could find and weaponize vulnerabilities in your stack autonomously, today, would your detection, response, and patching survive that timeline? For almost every organization the answer is no, and the gap is structural, not a tooling deficiency.

The window has already collapsed to machine speed#

The most important number in the CSA briefing is not about Mythos at all. It is about timing.

The briefing cites the Zero Day Clock, which puts the mean time from vulnerability disclosure to confirmed exploitation at less than one day in 2026, down from 2.3 years in 2019. Repello's own analysis of this collapse, in the zero-day collapse and the case for continuous AI red teaming, traces the same compression: a median that sat at 771 days in 2018 and fell to hours by 2024, with the majority of exploited vulnerabilities now weaponized before public disclosure.

Sit with what that does to a patch program. Vulnerability management, CVSS-based prioritization queues, quarterly red team engagements, and compliance-driven patch cadences were all designed for a world where defenders had a multi-week head start after disclosure. When the exploitation window is under a day, that head start is gone. The disclosure no longer marks the start of a race the defender can win. It marks the start of a race the defender has already lost if exploitation runs autonomously.

This is the trap in framing Mythos as a patching problem. A faster patch pipeline shortens your side of a race whose clock the attacker now controls. It helps, but it cannot close a window that opens and closes faster than human review.

Why patching faster is necessary and insufficient#

The CSA briefing includes an 11-item priority-actions table with deliberately aggressive timelines: the first action is immediate ("this week"), and the longest-horizon item is standing up a Vulnerability Operations function within 12 months. Most of those actions are sound, and most of them are about traditional vulnerability operations: deploy patches faster, automate the response pipeline, build infrastructure for rapid security updates.

For the software supply chain, that is correct. Memory-corruption bugs in the operating systems and libraries underneath you are exactly what Mythos demonstrated it can find, and faster patch deployment is the right answer there. But it is the wrong center of gravity for teams whose primary risk surface is the AI application itself.

If you ship an LLM-backed product, an agent with tool access, or a RAG pipeline reading a sensitive corpus, your highest-consequence vulnerabilities are not CVEs you patch. They are prompt injection, tool poisoning, indirect injection through retrieved content, and agentic chaining across connectors. There is no vendor patch for "the model followed a malicious instruction hidden in a document it retrieved." The fix is in your prompt architecture, your tool permissioning, your trust boundaries, and your guardrails, and the only way to know whether those hold is to attack them.

You cannot out-patch a machine-speed attacker, and for AI-native systems there is often nothing to patch in the traditional sense at all. The honest readiness test is different in kind: be attacked at machine speed first, on purpose, and fix what that surfaces.

The readiness test: be attacked at machine speed, on purpose#

Here is the logic the CSA framework points at without quite naming. If the threat is an adversary who finds and weaponizes vulnerabilities autonomously and continuously, then the only assessment that tells you whether you are ready is one that does the same thing to you first, under your control.

A point-in-time penetration test cannot do that. It produces a snapshot that is accurate the day it is delivered and stale the next time the model is updated, a tool is added, or a prompt is changed. An annual engagement run by a consulting firm answers "were we secure last quarter," which is not the question a machine-speed adversary forces. The question is "are we secure against this kind of attacker right now, on this version of the system," and that question has to be re-answered continuously because the system changes continuously.

This is what autonomous AI red teaming is for. Not a patch tool, and not a scanner that fires a generic probe list at an endpoint. A readiness test: continuous, context-specific adversarial testing that simulates a goal-directed autonomous attacker against your actual deployment, chains exploits across the model and its connectors and its tools, and surfaces the path-based vulnerabilities that single-prompt testing never reaches. The deeper distinction between bounded probing, goal-directed red teaming, and contractual pentests is worth understanding before you scope this work, and we lay it out in adversarial testing versus red teaming versus pentesting.

Repello's ARTEMIS is the operational embodiment of that test. It runs continuous, context-specific adversarial campaigns against AI applications and agents, drawing from an evolving attack library mapped to OWASP LLM Top 10, NIST AI RMF, and MITRE ATLAS, and it covers the agentic and MCP attack surfaces that traditional penetration testing does not reach. The point is not the tooling. The point is the posture: a readiness test that runs on the adversary's timeline, not the auditor's.

The agentic surface is where this matters most. MITRE ATLAS added MCP-compromise case studies in its January 2026 update, documenting attacks against Model Context Protocol infrastructure, indirect injection through MCP channels, and malicious agent deployment. An agent that books travel, writes to a repository, reads incoming mail, or calls production APIs has a blast radius that includes everything those tools touch. The reasoning chains and trust boundaries in agentic AI browser attacks are exactly the surface an autonomous attacker walks, and exactly the surface a readiness test has to walk first.

A Mythos-ready cadence for AI-native teams#

The CSA structure, immediate action through a 12-month build-out, is the right skeleton. What follows reframes it for teams shipping AI applications and agents rather than managing a traditional patch estate. Treat this as the checklist a security leader can act on, not a reading exercise.

Readiness cadence across three time tiers, comparing the traditional VulnOps action with the AI-native readiness action in each. This week (immediate): traditional VulnOps deploys patches faster and triages the CVE backlog; AI-native readiness enumerates every model, agent, RAG pipeline, and MCP connector and runs a baseline adversarial assessment against the top target. Next quarter (90 days): traditional VulnOps automates the patch pipeline and builds rapid-update infrastructure; AI-native readiness wires continuous adversarial testing into the deploy pipeline, maps findings to OWASP LLM Top 10, MITRE ATLAS, and NIST AI RMF, and re-tests the closed gaps. Within 12 months: traditional VulnOps stands up a Vulnerability Operations function for the patch estate; AI-native readiness runs continuous autonomous red teaming as a standing function across every AI asset, reported as a board-level readiness metric. Moving down the tiers, the AI-native column is what closes the machine-speed gap. — At every horizon the patch-centric action shortens your side of a race whose clock the attacker controls; the AI-native action runs the readiness test on the adversary's timeline first, and by 12 months that means continuous autonomous red teaming as a standing function.

Immediate (this week)#

Enumerate the AI attack surface. List every model, LLM-backed feature, agent, RAG pipeline, and MCP connector in production. You cannot run a readiness test against systems you have not named, and shadow AI is the rule, not the exception. This is the inventory step the rest of the cadence depends on.
Rank by blast radius. For each system, write down what it can touch: tools, data, downstream actions, external recipients. The system with tool access to production and regulated data in context is your highest-consequence target, and it is where the first readiness test runs.
Run a baseline adversarial assessment against the top target. Not a generic scan. A context-specific run that probes prompt injection, indirect injection, tool abuse, and chaining against that one system. The result is your readiness gap, measured rather than assumed.

Next quarter (the 90-day build)#

Move from point-in-time to continuous. Wire adversarial testing into the deployment pipeline so every model update, prompt change, and new tool re-triggers the relevant attack coverage. A readiness test that runs once is a snapshot; a readiness test that runs on every change is a posture.
Map findings to a framework your auditor reads. OWASP LLM Top 10 for what to test, MITRE ATLAS for how the techniques sequence, NIST AI RMF for what to document. Mapped findings are reviewable in a way a proprietary methodology is not, and the mapping is what turns a red team result into audit evidence.
Recalibrate response away from CVE timelines. Your incident-response runbooks assume human-speed discovery. Rewrite the AI-specific ones to assume an attacker that found the path autonomously, and rehearse the response against that timeline.
Close the highest-consequence gaps from the baseline. Tool permissioning, trust-boundary enforcement, guardrail coverage, and prompt-architecture fixes for the exploited paths. Re-run the test to prove the fix held, because remediation you have not re-tested is a hypothesis.

Within 12 months (the standing function)#

Stand up a continuous AI red team function. The CSA calls the traditional version a Vulnerability Operations function; the AI-native version is a standing adversarial-testing capability that runs without waiting for an audit window. Whether staffed, automated, or both, its job is to keep the readiness test running as the system evolves.
Bring every AI asset under continuous test. Extend coverage from the first high-consequence target to the full inventory, so no production model or agent sits untested against current attack patterns. The map you built in week one becomes the coverage checklist here.
Make readiness a board-level metric. Report the gap between systems under continuous adversarial test and systems in production. That ratio is the honest answer to "are we Mythos-ready," and it is the number leadership should watch. The pillar guidance in our complete guide to AI red teaming covers how to scope and resource this function in depth.

The cadence is not exotic. It is the same shape the CSA prescribes, pointed at the attack surface that actually defines risk for an AI-native company. The only genuinely new requirement is the one that ties it together: a readiness test that runs continuously, on the adversary's clock.

The gap is a process gap, not a tooling gap#

It is tempting to read all of this as "buy a better scanner." That misreads the problem. Most organizations that are not Mythos-ready are not short a tool. They are running a program whose entire cadence, quarterly assessments, annual pentests, patch cycles measured in weeks, was calibrated for an adversary that no longer exists.

The CSA briefing makes the same point from the governance side: it ships a 13-item risk register mapped to OWASP LLM Top 10, the OWASP Agentic Top 10, MITRE ATLAS, and NIST CSF 2.0, and a set of diagnostic questions for CISOs to triage their own program. The register and the diagnostics are valuable precisely because they force the honest question that tooling lets you avoid: not "do we have a red team," but "does our red team run on the timeline the threat actually operates on."

Closing that gap does not require getting inside Project Glasswing. It requires accepting that assumptions about attacker speed are now wrong by default, and rebuilding the testing cadence around that. The organizations that become Mythos-ready will be the ones that ran the machine-speed test on themselves before someone else ran it on them. Book a demo if you want to see what that readiness test surfaces against your own stack.

FAQ#

What does Mythos-ready mean?#

Mythos-ready is the term the Cloud Security Alliance introduced in its April 2026 strategy briefing "The AI Vulnerability Storm." A Mythos-ready security program is one that can withstand an adversary who finds and weaponizes vulnerabilities autonomously, at machine speed, rather than at the human-researcher speed that patch cycles and quarterly assessments were built for. It is a readiness posture, not a single tool or control.

Do I need access to Claude Mythos to become Mythos-ready?#

No. Anthropic restricted Mythos to roughly 40 vetted organizations under Project Glasswing, but the CSA framing is that equivalent autonomous-exploit capability will reach adversaries within 6 to 24 months regardless. Mythos-ready is about preparing your own program for that capability, not about getting access to one specific model. The work is the same whether the attacking model is Mythos, a future open release, or a state-sponsored equivalent.

Why is patching faster not enough?#

Faster patching is necessary but it does not change the structural problem. When the mean time from disclosure to exploitation falls below one day, no human-paced patch pipeline closes the window before an autonomous attacker walks through it. You cannot out-patch a machine-speed adversary. The only honest readiness test is to be attacked at machine speed first, in a controlled way, and fix what that surfaces before the real attacker runs the same test.

How does autonomous AI red teaming make a program Mythos-ready?#

Autonomous AI red teaming runs continuous, context-specific adversarial testing against your AI applications and agents on the same timeline an autonomous attacker would use. It finds the chained exploits, tool abuses, and injection paths in your actual deployment before the adversary does. That is the operational embodiment of Mythos-ready: the readiness test, run continuously, rather than a point-in-time audit that goes stale the day the system changes.

What is the difference between this and the CSA's patch-management guidance?#

The CSA's priority-actions table is framed largely around traditional vulnerability operations: faster patch deployment, a VulnOps function, rapid update infrastructure. That is correct for the software supply chain. It is incomplete for teams shipping AI-native applications, where the vulnerabilities are prompt injection, tool poisoning, and agentic chaining rather than memory-corruption CVEs. Mythos-readiness for AI-native teams means continuous adversarial testing of the model, its connectors, and its tools, mapped to OWASP LLM Top 10 and MITRE ATLAS.

Where do I start if my program is not Mythos-ready today?#

Start with inventory and a baseline adversarial run. You cannot defend AI applications you have not enumerated, so map every model, agent, RAG pipeline, and MCP connector in production first. Then run a continuous adversarial assessment against the highest-consequence system, the one with tool access or regulated data, and treat the findings as your readiness gap. The immediate, next-quarter, and 12-month cadence in this post is the structure to work through after that baseline.