
MITRE ATLAS Framework: A Practical Guide for AI Security Teams

Archisman Pal, Head of GTM | 8 min read

TL;DR

  • MITRE ATLAS (Adversarial Threat Landscape for AI Systems) extends ATT&CK to cover AI and ML-specific attack vectors that traditional frameworks do not address

  • ATLAS organizes attacks into tactics (Reconnaissance through Impact) and 80+ techniques mapped to real-world case studies

  • The most operationally relevant techniques for LLM deployments are training data poisoning, ML software supply chain compromise, inference API exfiltration, and prompt injection

  • ARTEMIS automatically maps every red team finding to the corresponding MITRE ATLAS technique, eliminating manual crosswalk work

What is MITRE ATLAS?

MITRE ATLAS (Adversarial Threat Landscape for AI Systems) is a knowledge base of adversarial tactics, techniques, and case studies targeting machine learning systems. Published and maintained by MITRE, it follows the same structure as ATT&CK but covers attack surfaces that are specific to AI: training pipelines, inference APIs, model files, vector stores, and embedding models.

ATLAS was first released in 2021 and has expanded continuously as the AI attack surface has grown. The current version covers more than 80 techniques organized across 14 tactic categories, supported by case studies drawn from published security research and documented real-world incidents.

Each ATLAS entry follows the same structure as an ATT&CK technique: a unique identifier (AML.T followed by a four-digit number), a description of what the technique does, the prerequisites an attacker needs, and links to documented case studies. This structure makes ATLAS directly usable in threat models, red team planning, and security control gap analysis.
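
Because every entry shares this shape, it is easy to represent in tooling. A minimal Python sketch (the field and class names are illustrative, not an official MITRE schema):

```python
from dataclasses import dataclass, field

@dataclass
class AtlasTechnique:
    """One ATLAS technique entry. Fields mirror the structure described
    above; names are illustrative, not an official MITRE schema."""
    technique_id: str                                 # e.g. "AML.T0051"
    name: str
    description: str
    prerequisites: list[str] = field(default_factory=list)
    case_studies: list[str] = field(default_factory=list)  # case-study IDs or URLs

    def __post_init__(self):
        # ATLAS technique IDs are "AML.T" followed by a four-digit number
        if not (self.technique_id.startswith("AML.T")
                and self.technique_id[5:].isdigit()
                and len(self.technique_id[5:]) == 4):
            raise ValueError(f"not an ATLAS technique ID: {self.technique_id}")

prompt_injection = AtlasTechnique(
    technique_id="AML.T0051",
    name="LLM Prompt Injection",
    description="Instructions embedded in inputs cause the model to "
                "deviate from intended behavior.",
)
```

A record type like this is enough to drive threat-model checklists or tag red team findings by technique ID.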

For AI security teams, ATLAS is the closest thing to an authoritative taxonomy of what adversaries actually do to ML systems. It does not cover every possible attack, but it is the baseline that detection coverage and red team scope should be measured against.

MITRE ATLAS vs MITRE ATT&CK: what is different for AI systems

ATT&CK covers adversarial behavior against traditional IT infrastructure: operating systems, networks, applications, and cloud environments. It was built from real-world intrusion data and incident reports going back decades. Its techniques are well-understood, tooling exists to map detections to them, and most enterprise security programs are structured around it.

ATLAS covers adversarial behavior against AI and ML systems. The attack surfaces it covers do not exist in ATT&CK: training datasets, model weights, embedding spaces, inference APIs, and the libraries that underpin model serving. An attacker who poisons a training dataset, extracts a model through repeated inference queries, or backdoors a fine-tuned model is operating entirely outside the ATT&CK taxonomy.

The two frameworks are not in competition. They cover different layers of the same stack. A supply chain attack that compromises a Python ML library (ATLAS: AML.T0048) may also involve persistence mechanisms and lateral movement (ATT&CK), depending on the attacker's objectives. In agentic AI systems that have access to filesystems, networks, and external services, both frameworks apply simultaneously.

The practical implication is that security teams running AI red teaming exercises need both frameworks. ATT&CK covers the infrastructure layer. ATLAS covers the AI layer. Mapping coverage to ATLAS alone misses infrastructure-level exposures; mapping to ATT&CK alone misses AI-specific attack vectors.

The MITRE ATLAS tactic categories

ATLAS organizes techniques into 14 tactic categories following the same kill-chain logic as ATT&CK. Six of them are the highest priority for LLM and agentic AI deployments.

Reconnaissance

Attackers gather information about the target AI system before mounting an attack. Reconnaissance techniques in ATLAS cover querying model APIs to infer architecture and training data, probing for system prompt leakage, and scanning for exposed ML model endpoints. For deployed LLM applications, reconnaissance often happens silently via the inference API itself, leaving no trace in traditional application logs.

Resource Development

Attackers acquire the infrastructure and data needed to mount ML-specific attacks. This includes obtaining shadow ML systems that mirror the target model, acquiring training data to analyze for poisoning opportunities, and developing adversarial examples. Unlike ATT&CK resource development, the resources here are compute-intensive and often take weeks to develop.

Initial Access

ATLAS covers multiple paths into an AI system: exploiting public-facing model APIs, compromising ML software dependencies in the supply chain, and obtaining valid credentials for model management interfaces. The LiteLLM supply chain attack in March 2026 is a documented example of ATLAS Initial Access via ML software dependency compromise (AML.T0048).

ML Attack Staging

This tactic has no ATT&CK equivalent. It covers the preparation phase specific to ML attacks: crafting adversarial inputs, creating backdoored datasets, and engineering prompts designed to bypass model safety controls. Most of the pre-attack work in an AI red team exercise maps here. This is where prompt injection payloads, adversarial image perturbations, and training data inserts are developed and tested before deployment.

Exfiltration

ATLAS exfiltration techniques cover extraction paths unique to ML systems. Model inversion attacks extract sensitive training data through repeated inference API queries. Membership inference attacks determine whether specific records were in the training set. Indirect exfiltration via retrieved content (relevant to RAG deployments) uses the model as an unwitting data relay rather than accessing the underlying storage directly.

Impact

ATLAS Impact techniques cover what the attacker does once they have execution: degrading model accuracy through evasion attacks, corrupting model outputs for targeted queries, generating excessive API costs (Cost Harvesting, AML.T0034), and extracting or destroying model artifacts. In agentic systems where model outputs trigger downstream actions, Impact techniques can extend beyond the AI layer into business process disruption.

Key MITRE ATLAS techniques AI security teams should know

The ATLAS taxonomy covers 80+ techniques. These eight have the highest relevance for current LLM and agentic AI deployments.

Poison Training Data (AML.T0020)

The attacker injects malicious examples into training data before a model is trained or fine-tuned. The poisoned model behaves normally on most inputs but produces attacker-controlled outputs for specific trigger conditions. Clean-label variants generate poisoned examples that are indistinguishable from legitimate training data on inspection.

Backdoor ML Model (AML.T0018)

A backdoor is embedded in a model such that it behaves correctly under normal conditions but produces attacker-specified outputs when a trigger input is present. Backdoors can be introduced during training or via fine-tuning on a small poisoned dataset. They survive standard evaluation because the trigger inputs are not in the test set.

Publish Poisoned Datasets (AML.T0019)

The attacker releases a public dataset containing poisoned examples on a platform like Hugging Face or Kaggle. Downstream teams that incorporate the dataset unknowingly introduce the backdoor into their training pipeline. Dataset provenance is rarely audited as rigorously as code provenance, which makes this a reliable supply chain entry point.

Compromise ML Software Dependencies (AML.T0048)

The attacker compromises a library in the ML software supply chain: a training framework, a data processing utility, an inference server, or a vector database client. The LiteLLM incident (March 2026) and the TeamPCP campaign illustrate how a single compromised dependency with broad adoption produces a large blast radius.

Craft Adversarial Data (AML.T0043)

The attacker generates inputs designed to cause the model to produce incorrect outputs with high confidence. Adversarial examples for image classifiers are the canonical case, but the technique applies to text classifiers, embedding models, and multimodal systems. For LLM guardrail evasion, this includes the encoding and Unicode manipulation attacks documented in Repello AI's emoji injection research.

Exfiltration via ML Inference API (AML.T0024)

The attacker extracts sensitive information from a model by querying it in ways that cause it to reveal training data, system prompts, or information from retrieved context. For RAG-based systems, this combines with indirect prompt injection: an adversarially placed document instructs the model to output retrieved content to a channel the attacker controls.

LLM Prompt Injection (AML.T0051)

Instructions embedded in user inputs, retrieved documents, or tool responses cause the model to deviate from its intended behavior. Direct injection targets the user-facing prompt; indirect injection (AML.T0054) targets content the model retrieves or processes autonomously. This is the highest-volume AI attack technique in current deployments. See Prompt Injection Attack Examples for documented cases.

Cost Harvesting (AML.T0034)

The attacker sends high-volume or computationally expensive queries to a model API to generate costs for the operator. For metered deployments, this is a form of denial of wallet. For on-premises deployments, it degrades availability without disabling the system. It is frequently combined with adversarial inputs designed to maximize token consumption per query.

How MITRE ATLAS maps to the OWASP LLM Top 10

The OWASP LLM Top 10 classifies the most critical risks in LLM applications. MITRE ATLAS provides the adversarial technique taxonomy behind each risk. Using both together — OWASP for risk prioritization and ATLAS for technique mapping — gives security teams a complete picture of what they need to detect and test.

| OWASP LLM Risk | MITRE ATLAS Technique(s) | Attack stage |
| --- | --- | --- |
| LLM01: Prompt Injection | Craft Adversarial Data (AML.T0043), LLM Prompt Injection (AML.T0051), Indirect Prompt Injection (AML.T0054) | ML Attack Staging, Initial Access |
| LLM02: Insecure Output Handling | LLM Prompt Injection (AML.T0051) as a prerequisite | Impact |
| LLM03: Training Data Poisoning | Poison Training Data (AML.T0020), Publish Poisoned Datasets (AML.T0019) | ML Attack Staging |
| LLM04: Model Denial of Service | Cost Harvesting (AML.T0034) | Impact |
| LLM05: Supply Chain Vulnerabilities | Compromise ML Software Dependencies (AML.T0048) | Initial Access |
| LLM06: Sensitive Information Disclosure | Exfiltration via ML Inference API (AML.T0024) | Exfiltration |
| LLM07: Insecure Plugin Design | LLM Prompt Injection via tool responses (AML.T0051) | ML Attack Staging |
| LLM08: Excessive Agency | Indirect Prompt Injection (AML.T0054), agentic execution | Impact |
| LLM09: Overreliance | Erode ML Model Integrity (AML.T0031), Craft Adversarial Data (AML.T0043) | Impact |
| LLM10: Model Theft | Exfiltration via ML Inference API (AML.T0024), model inversion | Exfiltration |

Two points in this mapping deserve attention. First, OWASP LLM01 (Prompt Injection) maps to three distinct ATLAS techniques at two different tactic stages. A detection or testing program that treats prompt injection as a single technique will miss the staging activity (adversarial content crafting) that precedes the actual injection. Red team exercises need to cover both.

Second, LLM05 (Supply Chain Vulnerabilities) maps directly to ATLAS Initial Access via AML.T0048. Supply chain attacks on ML dependencies are not a theoretical risk. The frequency of documented incidents in 2025 and 2026 makes this one of the highest-likelihood initial access vectors for AI systems, and it is outside the scope of most standard LLM security assessments.
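
The OWASP-to-ATLAS mapping above can be encoded as a lookup table so that findings tagged with an OWASP risk automatically carry their ATLAS technique IDs. A minimal sketch (the IDs come from the mapping above; the structure itself is hypothetical):

```python
# OWASP LLM Top 10 risk -> ATLAS technique IDs (from the mapping above)
OWASP_TO_ATLAS = {
    "LLM01": ["AML.T0043", "AML.T0051", "AML.T0054"],
    "LLM02": ["AML.T0051"],
    "LLM03": ["AML.T0020", "AML.T0019"],
    "LLM04": ["AML.T0034"],
    "LLM05": ["AML.T0048"],
    "LLM06": ["AML.T0024"],
    "LLM07": ["AML.T0051"],
    "LLM08": ["AML.T0054"],
    "LLM09": ["AML.T0031", "AML.T0043"],
    "LLM10": ["AML.T0024"],
}

def atlas_techniques(owasp_ids: list[str]) -> set[str]:
    """Union of ATLAS techniques implied by a set of OWASP findings."""
    return {t for rid in owasp_ids for t in OWASP_TO_ATLAS.get(rid, [])}

# Example: a report flagging prompt injection and data poisoning
print(sorted(atlas_techniques(["LLM01", "LLM03"])))
# -> ['AML.T0019', 'AML.T0020', 'AML.T0043', 'AML.T0051', 'AML.T0054']
```

Deduplicating through a set matters here: several OWASP risks share ATLAS techniques, and the technique set, not the risk list, is what detection coverage should be measured against.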

Using MITRE ATLAS for AI threat modeling

ATLAS provides the technique vocabulary for AI threat modeling. The process starts with mapping the attack surface: every component of the AI system (training pipeline, model serving infrastructure, inference API, vector database, agent tool integrations) maps to a subset of ATLAS techniques. The output is a threat model that specifies which techniques are in scope for the deployment and which controls would detect or prevent each one.
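
The component-to-technique mapping just described can be sketched as a table-driven threat model. The component names and technique assignments below are illustrative (drawn from the techniques discussed in this post), not a complete or authoritative mapping:

```python
# Illustrative mapping of system components to ATLAS techniques in scope
COMPONENT_TECHNIQUES = {
    "training_pipeline": {"AML.T0020", "AML.T0019", "AML.T0018"},
    "inference_api":     {"AML.T0024", "AML.T0051", "AML.T0034"},
    "vector_database":   {"AML.T0054", "AML.T0024"},
    "ml_dependencies":   {"AML.T0048"},
    "agent_tools":       {"AML.T0054", "AML.T0051"},
}

def threat_model(components: list[str]) -> set[str]:
    """ATLAS techniques in scope for a deployment built from these components."""
    scope: set[str] = set()
    for component in components:
        scope |= COMPONENT_TECHNIQUES.get(component, set())
    return scope

# A RAG chatbot: inference API + vector DB + third-party ML dependencies
scope = threat_model(["inference_api", "vector_database", "ml_dependencies"])
print(sorted(scope))
# -> ['AML.T0024', 'AML.T0034', 'AML.T0048', 'AML.T0051', 'AML.T0054']
```

The output is exactly the artifact the process above calls for: a technique list scoped to the deployment, against which controls can then be checked one by one.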

For agentic systems, threat modeling with ATLAS requires covering techniques across both the AI layer and the infrastructure layer. An agent that can execute code, query external APIs, or write to a database has an expanded ATLAS-relevant attack surface that includes exfiltration paths not present in a standard LLM deployment.

Repello AI's open-source Agent Wiz tool performs MAESTRO threat modeling across 12 known agentic failure modes, generating agent-to-tool-to-LLM runtime graphs annotated with threat paths. It integrates with LangGraph, AutoGen, CrewAI, LlamaIndex, and the OpenAI SDK. The output maps directly to ATLAS techniques, providing a foundation for prioritizing red team scope.

Beyond the technique list itself, the ATLAS case studies are the most underused resource in the framework. Each case study documents a real-world incident with technique mapping, providing the empirical grounding that makes threat models defensible to engineering and leadership stakeholders.

How ARTEMIS maps findings to MITRE ATLAS automatically

One of the practical friction points in AI red teaming is the crosswalk between tool findings and framework taxonomy. A red team exercise that produces a list of "model evasion" or "guardrail bypass" findings is not immediately actionable unless those findings are mapped to specific ATLAS techniques, which tells the security team which controls are missing and which detection gaps need to close.

ARTEMIS performs this mapping automatically. Every finding from an ARTEMIS red team engagement is tagged to the corresponding MITRE ATLAS technique and tactic. A finding of indirect prompt injection via a retrieved document is mapped to AML.T0054, placed in the ML Attack Staging tactic, and linked to the ATLAS case studies that document the same technique against other systems.

This has two practical implications. First, the remediation report is structured around ATLAS technique coverage, not just a list of test results. Security teams can directly compare ARTEMIS findings against their detection coverage for the same ATLAS techniques and identify the gaps. Second, regression testing after remediation is scoped by technique: closing an AML.T0020 finding requires demonstrating that the technique is no longer effective, not just that the specific payload that triggered it has been blocked.
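
The first implication, comparing findings against detection coverage, reduces to set arithmetic over ATLAS technique IDs. A minimal sketch (the finding and detection sets below are invented for illustration; they are not ARTEMIS output):

```python
def coverage_gaps(red_team_findings: set[str],
                  detection_coverage: set[str]) -> dict[str, set[str]]:
    """Compare techniques proven exploitable against techniques you can detect."""
    return {
        # exploitable but undetected: the highest-priority gaps
        "undetected": red_team_findings - detection_coverage,
        # exploitable and detected: validate the alerting path end to end
        "confirmed":  red_team_findings & detection_coverage,
    }

findings   = {"AML.T0051", "AML.T0054", "AML.T0024"}  # e.g. tagged red team results
detections = {"AML.T0051", "AML.T0034"}               # e.g. techniques with alert rules

gaps = coverage_gaps(findings, detections)
print(sorted(gaps["undetected"]))  # -> ['AML.T0024', 'AML.T0054']
```

This is why the framework crosswalk matters: once both sides speak in ATLAS IDs, the gap analysis is mechanical rather than a manual mapping exercise.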

The ARTEMIS approach treats ATLAS coverage as a continuous measurement, not a one-time audit. As new ATLAS techniques are added to reflect emerging attack patterns, the technique coverage map updates automatically for all deployments under continuous testing.

Frequently asked questions

What does MITRE ATLAS stand for?

MITRE ATLAS stands for Adversarial Threat Landscape for AI Systems. It is a knowledge base of adversarial tactics and techniques specific to machine learning systems, published and maintained by MITRE. It follows the same structural conventions as MITRE ATT&CK but covers attack surfaces that ATT&CK does not address: training pipelines, model weights, inference APIs, and ML-specific software dependencies.

How many techniques does MITRE ATLAS have?

As of early 2026, MITRE ATLAS covers more than 80 techniques organized across 14 tactic categories. The framework is updated continuously as new AI attack patterns are documented in research publications and real-world incidents. The full technique list is maintained at atlas.mitre.org.

Is MITRE ATLAS relevant to LLMs or only to traditional ML models?

ATLAS was originally built around attacks on traditional ML models (image classifiers, tabular models). It has expanded significantly to cover LLM-specific attacks, including prompt injection (AML.T0051), indirect prompt injection (AML.T0054), and training data poisoning for fine-tuned models (AML.T0020). For LLM deployments, the ML Attack Staging and Exfiltration tactic categories are the most operationally relevant.

How does MITRE ATLAS differ from OWASP LLM Top 10?

OWASP LLM Top 10 is a risk classification framework: it identifies the highest-risk vulnerabilities in LLM applications and provides guidance for mitigation. MITRE ATLAS is an adversarial technique taxonomy: it describes what attackers do and how, grounded in documented cases. They are complementary. OWASP tells you which risks to prioritize; ATLAS tells you which techniques to test for and detect.

Can I use MITRE ATLAS without also using ATT&CK?

For AI-only deployments with no network, OS, or cloud exposure, ATLAS alone covers the relevant attack surface. For most real-world deployments, AI systems run on infrastructure that ATT&CK covers. Agentic systems in particular often have access to file systems, external APIs, and internal services, making ATT&CK techniques directly applicable alongside ATLAS. Using the two frameworks together covers both layers.

ARTEMIS automatically maps every finding to MITRE ATLAS. No manual crosswalk required. Book a demo to see the full ATLAS technique coverage report for your AI deployment.

© Repello Inc. All rights reserved.
