Back to all blogs

Red teaming vs. penetration testing vs. vulnerability scanning: what AI security teams actually need

Red teaming vs. penetration testing vs. vulnerability scanning: what AI security teams actually need

Archisman Pal, Head of  GTM

Archisman Pal

Archisman Pal

|

Head of GTM

Head of GTM

|

10 min read

Red teaming vs. penetration testing vs. vulnerability scanning: what AI security teams actually need
Repello tech background with grid pattern symbolizing AI security

TL;DR

  • Vulnerability scanning finds known CVEs automatically. It produces no signal against prompt injection, jailbreaks, or model-specific attacks.

  • Penetration testing is manual and point-in-time. It was not designed for probabilistic AI systems.

  • Red teaming is objectives-based adversarial simulation. Applied to AI, it requires a completely different attack taxonomy than infrastructure red teaming.

  • Breach and attack simulation (BAS) automates continuous TTP simulation. It is built for infrastructure, not AI.

  • AI red teaming is the only methodology purpose-built for the LLM and agentic attack surface.

Security teams tasked with securing AI often reach for familiar tools: vulnerability scanners, pentest engagements, red team exercises. These are mature disciplines with decades of methodology behind them. They are also not designed for the AI attack surface.

Prompt injection does not generate a CVE. A jailbreak cannot be found by a port scan. And a probabilistic model that refuses a harmful request 80% of the time will pass a point-in-time pentest if the tester happens to run it during that 80%. This guide defines each methodology clearly, identifies where each breaks down for AI systems, and explains what AI red teaming actually requires.

What is vulnerability scanning?

Vulnerability scanning is automated, passive enumeration of known security weaknesses. A scanner checks a system against a database of known CVEs and misconfigurations, reports what it finds, and produces a prioritized list of issues to remediate.

Scanners are fast and scalable. They are useful for maintaining hygiene across infrastructure: unpatched software, open ports, misconfigured services, exposed credentials. They run continuously without human involvement.

Their limitation is definitional: they only find what is in the database. A scanner cannot identify a novel attack path. Prompt injection, RLHF exploitation, persona attacks, and RAG poisoning do not have CVE entries. Vulnerability scanning produces no useful signal against the AI-specific attack surface.

What is penetration testing?

Penetration testing is a manual, adversarial, point-in-time assessment. A pentester attempts to gain unauthorized access to a defined scope of systems, documents the exploitation path, and delivers a findings report with evidence. The engagement runs within a bounded scope: specific systems, a defined timeframe, and agreed rules of engagement.

Pentesting finds logic flaws, authentication bypasses, injection vulnerabilities, and access control failures that automated scanning misses. A skilled pentester brings creative adversarial judgment that no scanner can replicate.

The structural constraint is its point-in-time nature. A pentest is a snapshot. For AI penetration testing, this creates a specific problem: LLMs are probabilistic. A jailbreak that works 30% of the time across 1,000 attempts will frequently pass a manual single-pass test. Characterizing the attack surface of a probabilistic system requires statistical sampling, not single-pass assessment.

Traditional pentesting also assumes the testing team knows what "unauthorized access" means. For AI systems, defining a successful exploit requires a different threat model: a harmful response, a policy bypass, a data disclosure through tool calls. Most pentest teams do not have that framework.

What is red teaming?

Red teaming originated in military and intelligence contexts as objectives-based adversarial simulation. Where penetration testing asks "can we get in?", red teaming asks "can we achieve this specific objective?" The objective might be: exfiltrate a target file, cause the system to take a harmful action, or demonstrate that a specific control fails.

In cybersecurity, red teaming typically involves a persistent authorized adversarial simulation that attacks people, processes, and technology simultaneously. The red team is less constrained than a pentester: it defines its own approach to achieve the stated objective. Engagements run longer, with higher operational complexity.

Applied to AI systems, red teaming requires a different attack taxonomy than infrastructure red teaming. Attacks against LLMs do not involve lateral movement through network segments; they involve multi-turn prompt manipulation, context injection, and exploitation of the model's instruction-following behavior. A generic red team skilled in network exploitation does not have the attack vocabulary for prompt injection, RLHF bypass, or agentic system compromise.

What is breach and attack simulation?

Breach and attack simulation (BAS) automates continuous simulation of attacker TTPs from the MITRE ATT&CK framework. BAS platforms run simulated attacks against production security controls around the clock, validating whether defenses actually work rather than assuming they do. Where a pentest validates a snapshot, BAS validates continuously.

BAS is effective for infrastructure security: testing whether endpoint detection catches a specific technique, whether a SIEM rule fires correctly, whether egress filtering blocks a simulated exfiltration.

The limitation for AI security is structural: MITRE ATT&CK covers infrastructure attack techniques. MITRE ATLAS was created separately to cover adversarial ML and AI-specific attack techniques. BAS platforms have not incorporated ATLAS-based attack playbooks at scale. BAS cannot test whether a production LLM is susceptible to prompt injection or whether a deployed agent can be hijacked through its tool-call layer.

Why none of these were built for AI systems

Each methodology breaks down against the AI attack surface for a predictable reason.

Vulnerability scanning requires a CVE database. AI vulnerabilities — prompt injection susceptibility, RLHF exploitation, model-specific jailbreaks — do not generate CVEs. There is no signature for "this model will comply with a persona attack."

Penetration testing assumes a deterministic target. LLMs are probabilistic: the same prompt produces different outputs across runs. Research from the University of Illinois Urbana-Champaign showed AI agents successfully exploited 87% of real-world CVEs when given access to tool-call capabilities. A point-in-time test run before that capability was added would have missed the entire attack surface it opened.

"Traditional security testing was designed for deterministic systems," according to the Repello AI Research Team. "LLMs are probabilistic. A jailbreak might succeed on 3 out of 10 attempts. A single-pass pentest misses it 70% of the time. Statistical sampling across hundreds of attack variations is the only way to measure actual risk."

Red teaming works conceptually but requires AI-specific expertise. The OWASP LLM Top 10 defines a different attack taxonomy than traditional red team training covers: prompt injection, indirect injection through external data sources, jailbreaking, model denial of service, sensitive information disclosure. Generic red teams apply infrastructure techniques that do not transfer.

BAS is the closest to a continuous testing approach in principle, but the tooling has not caught up. Current platforms test against MITRE ATT&CK. Testing against MITRE ATLAS at scale requires purpose-built AI testing infrastructure that most BAS vendors do not offer.

The deeper issue is that all four methodologies were designed for systems that produce predictable outputs given deterministic inputs. AI systems do not. An AI application's attack surface changes with every model update, system prompt change, and new tool integration. Continuous AI red teaming, not point-in-time assessment, is the only methodology that matches the pace of change.

What AI red teaming actually involves

AI red teaming covers the attack surface that traditional methodologies leave untested:

  • Prompt injection: direct injection through user inputs; indirect injection through documents, emails, web pages, and tool responses

  • Jailbreaking and persona attacks: RLHF exploitation, DAN-style persona reframing, multi-turn safety training erosion

  • RAG poisoning: adversarial content injected into the retrieval corpus to steer model outputs

  • Agentic attacks: tool-call hijacking, cross-agent instruction injection, MCP tool poisoning, agent identity abuse

  • Denial of wallet: input crafting to maximize token consumption and inference cost

  • Data exfiltration via tool calls: agent manipulation to retrieve and transmit sensitive data through legitimate integrations

Effective AI red teaming is context-specific. A fraud detection model, a customer service chatbot, and a coding assistant face different adversarial conditions. Applying a uniform taxonomy without calibrating to the target application's function, risk profile, and deployment configuration produces incomplete coverage.

Output should include exploitability evidence (not just "vulnerable in theory"), coverage mapped to OWASP LLM Top 10 and MITRE ATLAS, and prioritized remediation steps a security team can act on.

Comparison at a glance


Vuln scanning

Pentesting

Red teaming

BAS

AI red teaming

Automated

❌ / hybrid

Continuous

AI-specific coverage

Requires AI expertise

Handles probabilistic systems

Novel attack discovery

Framework

NVD / CVE

Varies

Varies

MITRE ATT&CK

OWASP LLM Top 10 + MITRE ATLAS

Output

CVE list

Findings report

Objectives assessment

Control gaps

AI risk report + remediation

Which approach does your AI security program need?

Vulnerability scanning is infrastructure hygiene, not an AI security program. Run it as a baseline. It is a prerequisite for everything else, not a substitute for it.

Penetration testing is useful for a one-time adversarial assessment before a new AI application goes live. Verify that the team has LLM-specific attack knowledge before engaging. A generic pentest against an LLM application will miss the majority of the actual attack surface.

Ongoing red teaming is necessary for any AI application that handles sensitive data, makes consequential decisions, or is exposed to adversarial users. The quantified case for this is in Repello's analysis of continuous red teaming: mean time to exploit has collapsed from 771 days to under 4 hours for newly published vulnerabilities. Point-in-time testing cannot keep pace with that.

BAS belongs in the infrastructure security stack. It validates that your controls work as configured. It does not replace AI-specific testing.

AI red teaming is necessary for any deployment of LLMs, agents, or AI systems that process untrusted inputs. This is the methodology built for the threat model your AI systems actually face.

Frequently asked questions

Is red teaming the same as penetration testing?

No. Penetration testing asks "can we gain unauthorized access?" within a bounded scope. Red teaming asks "can we achieve this objective?" with fewer constraints on method. Red teaming typically runs longer, involves broader scope, and tests people and processes alongside technology. For AI systems, the distinction matters further: AI red teaming requires an entirely different attack taxonomy than both traditional red teaming and traditional pentesting.

What is breach and attack simulation (BAS)?

BAS runs automated, continuous simulations of attacker TTPs from MITRE ATT&CK against production security controls. It validates that infrastructure defenses work continuously. Current BAS platforms are built for infrastructure, not AI application security. The AI-equivalent would be continuous automated red teaming against MITRE ATLAS and OWASP LLM Top 10 attack classes.

Do AI systems need a different type of penetration testing?

Yes. Traditional pentesters test deterministic application logic. LLMs are probabilistic and require statistical sampling across many attack variations, not single-pass testing. They also face an attack taxonomy — prompt injection, jailbreaking, RAG poisoning, agentic attacks — that is not covered by traditional pentest methodology. Teams running an AI pentest should verify the testing team has specific LLM adversarial expertise before engaging.

How often should AI systems be red teamed?

After every significant model update, system prompt change, or new tool integration — not on a fixed schedule. The attack surface of an AI application changes every time the underlying model or configuration changes. Continuous automated red teaming provides assurance between point-in-time assessments and catches regressions that quarterly engagements miss.

What is the difference between AI red teaming and AI safety testing?

AI safety testing evaluates whether a model produces harmful outputs under normal use. AI red teaming actively attempts to manipulate the model into harmful outputs through adversarial inputs. Safety testing finds natural failure modes; red teaming finds exploitable ones. Both are necessary, and the findings from red teaming should feed back into safety testing coverage.

Test your AI application with ARTEMIS

ARTEMIS is AI red teaming built for the AI attack surface: 15 million+ evolving attack patterns, context-specific to your application, covering OWASP LLM Top 10, NIST AI RMF, and MITRE ATLAS. Automated and continuous so coverage keeps pace with model updates. Get a demo.

Share this blog

Share on LinkedIn
Share on LinkedIn

Subscribe to our newsletter

Repello tech background with grid pattern symbolizing AI security
Repello tech background with grid pattern symbolizing AI security
Repello AI logo - Footer

Sign up for Repello updates
Subscribe to our newsletter to receive the latest insights on AI security, red teaming research, and product updates in your inbox.

Subscribe to our newsletter

8 The Green, Ste A
Dover, DE 19901, United States of America

AICPA SOC 2 certified badge
ISO 27001 Information Security Management certified badge

Follow us on:

LinkedIn icon
X icon, Twitter icon
Github icon
Youtube icon

© Repello Inc. All rights reserved.

Repello tech background with grid pattern symbolizing AI security
Repello AI logo - Footer

Sign up for Repello updates
Subscribe to our newsletter to receive the latest insights on AI security, red teaming research, and product updates in your inbox.

Subscribe to our newsletter

8 The Green, Ste A
Dover, DE 19901, United States of America

AICPA SOC 2 certified badge
ISO 27001 Information Security Management certified badge

Follow us on:

LinkedIn icon
X icon, Twitter icon
Github icon
Youtube icon

© Repello Inc. All rights reserved.