TL;DR: AI risk assessment is the structured process of identifying, analysing, and prioritising the risks introduced by AI systems across their full lifecycle. Most organisations doing this today are either running no formal process at all, or running a standard IT risk assessment methodology that wasn't built for ML systems and misses the attack surfaces that matter: training pipeline exposure, probabilistic failure modes, and agentic blast radius. This post is a seven-step operational framework for security and compliance teams, paired with a practical comparison of the regulatory frameworks (NIST AI RMF, EU AI Act, OWASP LLM Top 10, Google SAIF) that determine what your documentation obligations are.
Most organisations deploying AI right now are doing one of two things with risk. Either they run no formal AI risk assessment process at all, treating each AI deployment as a software project managed through standard change control. Or they take an IT risk framework built for deterministic systems and apply it unchanged to a probabilistic one, producing a risk register that captures the infrastructure risks (unpatched OS, open firewall port, misconfigured IAM role) and misses entirely the risks that make AI deployments uniquely dangerous.
The gaps are not subtle. A vulnerability scanner can tell you whether the container running your LLM inference service has a known CVE. It cannot tell you whether a prompt injection attack against that service can exfiltrate the contents of every document in your connected SharePoint tenant. A standard IT risk register models assets as having vulnerabilities that either exist or do not. It has no column for "succeeds 12% of the time and produces a credential exfiltration path when it does." And it has no concept of a training pipeline as an attack surface, or of blast radius as a risk dimension that changes based on what tools an agent has been granted access to.
This post provides a precise definition, a seven-step operational framework, and a comparison of the regulatory frameworks that determine what risk assessment documentation you are required to produce.
What Is AI Risk Assessment?
AI risk assessment is the process of identifying, analysing, and prioritising the risks introduced by AI systems across their full lifecycle: from training data provenance and model selection through deployment, ongoing use, and eventual decommissioning.
Two distinctions matter for teams setting up their process:
AI risk assessment vs. AI governance. AI governance is the policy and accountability structure: who owns each AI system, what usage policies apply, who is responsible for compliance, what oversight mechanisms are in place. Risk assessment is the analytical input that makes governance meaningful. Governance without a credible risk assessment is policy without data.
AI risk assessment vs. AI risk management. Risk assessment is the foundational analytical step: identify and prioritise the risks. Risk management is the ongoing operational process: track, treat, monitor, and re-assess as systems evolve. You cannot run an effective risk management program without first completing a risk assessment, and a one-time assessment without a continuous management process decays as the deployment changes.
AI security solutions that address specific risk categories (guardrails, red teaming, runtime monitoring) are controls within a risk management program. They are most useful when targeted at risks that have been explicitly identified and prioritised through a prior assessment process.
Why Standard IT Risk Frameworks Don't Cover AI
Three structural gaps explain why applying a standard IT risk methodology to an AI deployment produces a risk register with significant blind spots.
Probabilistic failure modes. IT risk frameworks assume deterministic systems. A firewall misconfiguration either permits unauthorised traffic or it does not. A SQL injection vulnerability either exists in the codebase or it has been patched. AI risk is statistical. A jailbreak technique that succeeds 8% of the time is a real, exploitable risk requiring treatment. Standard CVSS scoring has no mechanism for expressing this: a vulnerability is scored as if it reliably exists or reliably does not. An AI risk register that only records binary vulnerabilities will systematically underrate the attack classes that define AI security.
Training pipeline exposure. Standard IT asset inventories do not include training datasets as attack surfaces, because standard software does not have training datasets. Data poisoning attacks inject risk through the training pipeline: an attacker who can influence what goes into a training dataset can corrupt model behaviour in targeted ways that are invisible on standard benchmarks and undetectable by infrastructure scanning. Most IT risk registers have no row for "training data provenance" and no methodology for assessing the risk introduced by a fine-tuned model sourced from a third-party repository.
Emergent and contextual behaviour. A model that behaves safely as a standalone chatbot may behave unsafely when connected to a file system, email client, or external API. The risk is not a property of the model in isolation; it is a property of the model in its deployment context. Research from the University of Illinois Urbana-Champaign found that LLM agents can autonomously exploit real-world vulnerabilities with an 87% success rate when given access to tools without sufficient controls. A static vulnerability score assigned to the model at procurement tells you nothing about what the model can do when it is granted write access to your production database at runtime.
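To make the probabilistic point concrete, here is a minimal sketch of how a low per-attempt success rate compounds into material expected loss. All figures (success rate, attempt volume, per-incident loss) are hypothetical, chosen purely for illustration:

```python
# Hypothetical figures: a jailbreak that succeeds only 8% of the time is not a
# negligible risk once attempt volume and per-incident loss are factored in.
def expected_annual_loss(success_rate: float,
                         attempts_per_year: int,
                         loss_per_incident: float) -> float:
    """Expected loss = P(success per attempt) x attempts x loss per incident."""
    return success_rate * attempts_per_year * loss_per_incident

# An 8%-success jailbreak probed 500 times a year, at $10k per incident:
loss = expected_annual_loss(0.08, 500, 10_000)
print(f"${loss:,.0f}")  # $400,000 of expected annual loss
```

A binary exists/does-not-exist vulnerability column has nowhere to record this arithmetic, which is exactly why statistical attack classes disappear from standard registers.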
The AI Risk Assessment Framework: Step by Step
Step 1: AI Asset Inventory
You cannot assess risks you have not mapped. The first step is a complete inventory of every AI system in your environment.
For each AI asset, document: the model type (foundation model API, self-hosted open-weights model, custom fine-tuned model, embedded AI feature in a SaaS product); the deployment context (standalone chatbot, RAG-backed assistant, agentic system with tool access, embedded in a customer-facing product); all data sources the model has read access to; all systems and tools the model can take actions against (write, execute, send, delete); and the business owner accountable for each deployment.
Shadow AI deployments require specific attention: employees running local models on corporate devices, connecting personal Claude or ChatGPT accounts to corporate data via browser extensions, or building unauthorised agent integrations with corporate SaaS APIs. These deployments do not appear in standard IT asset inventories and often carry the highest data exposure risk precisely because they exist outside the control environment. Building a complete AI Bill of Materials is the prerequisite for everything that follows.
Repello's AI Inventory automates this step: it discovers AI models, agents, and agentic workflows across an organisation's environment, including shadow deployments, and builds a continuously updated AI BOM with threat graph visualisation showing attack paths per asset.
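The inventory fields above can be sketched as a simple record type. This schema is illustrative only; the field names and category values are the author's shorthand, not a standard:

```python
from dataclasses import dataclass, field

# Hypothetical schema sketching the Step 1 inventory fields;
# category strings are illustrative, not a standard taxonomy.
@dataclass
class AIAsset:
    name: str
    model_type: str           # e.g. "foundation-api", "self-hosted", "fine-tuned"
    deployment_context: str   # e.g. "chatbot", "rag-assistant", "agent"
    read_access: list[str] = field(default_factory=list)    # data sources readable
    action_access: list[str] = field(default_factory=list)  # systems it can act on
    business_owner: str = "UNASSIGNED"
    shadow_deployment: bool = False  # discovered outside sanctioned channels

support_bot = AIAsset(
    name="support-assistant",
    model_type="foundation-api",
    deployment_context="rag-assistant",
    read_access=["sharepoint:support-docs", "zendesk:tickets"],
    action_access=["zendesk:reply"],
    business_owner="head-of-support",
)
```

An asset with `business_owner` still at its default, or with `shadow_deployment` set, is an immediate follow-up item before any scoring happens.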
Step 2: Deployment Context Analysis
For each AI asset identified in Step 1, document the deployment context in enough detail to establish maximum potential impact. Four questions:
What data can the model access by reading? This includes the direct context window (system prompt contents, retrieved documents, conversation history) and any data stores the model can query through RAG or tool calls. The answer establishes information disclosure risk.
What actions can the model take? File system operations, email sending, code execution, API calls to downstream systems, database writes. The answer establishes action-execution risk and blast radius.
What downstream systems consume the model's outputs? If the model generates content that is displayed to other users, feeds into automated workflows, or is used to make business decisions without human review, downstream impact multiplies the risk of any individual attack that corrupts the output.
Who are the adversarial actors? External users with no authentication, authenticated customers, internal employees with elevated access, third-party integrations that feed data into the model's context. Each actor class has different access to attack vectors: an unauthenticated external user can attempt direct prompt injection; an authenticated internal user may be able to poison shared knowledge bases; a compromised third-party integration can inject through tool responses.
The blast radius of a fully compromised AI agent in your deployment is your maximum potential impact. This step establishes it before any threat modelling begins.
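The four questions can be collapsed into a worst-case summary per asset. The scope labels and the verb-based irreversibility heuristic below are hypothetical simplifications for illustration:

```python
# Illustrative sketch: blast radius as the union of everything a fully
# compromised agent could read or act against. Scope labels are hypothetical.
def blast_radius(read_access: list[str], action_access: list[str]) -> dict:
    """Summarise the Step 2 questions as a single worst-case record."""
    return {
        "information_disclosure": sorted(set(read_access)),
        "action_execution": sorted(set(action_access)),
        # Any write/execute/send/delete capability makes consequences
        # potentially irreversible.
        "irreversible": any(a.split(":")[-1] in {"write", "execute", "send", "delete"}
                            for a in action_access),
    }

radius = blast_radius(
    read_access=["sharepoint:read", "crm:read"],
    action_access=["email:send", "db:write"],
)
print(radius["irreversible"])  # True
```

The point of computing this before threat modelling is that the same attack technique scores very differently against an asset whose radius is read-only versus one that can send email and write to a database.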
Step 3: Threat Modelling Against MITRE ATLAS
MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is the adversarial ML equivalent of MITRE ATT&CK: a structured taxonomy of over 80 adversarial techniques across 14 tactic categories, with documented real-world case studies for each technique.
For each AI asset, map relevant ATLAS techniques to your deployment context. The relevant technique set varies significantly by deployment type: a standalone chatbot API has a different ATLAS threat profile than a multi-agent orchestration system with file system access. For each relevant technique, assess two dimensions:
Likelihood: How accessible is this attack given your deployment? Does it require physical access, API access, or access to the training pipeline? Is it documented in publicly available research, or does it require novel capability development? ATLAS case studies provide real-world calibration for likelihood estimates; use them rather than gut estimates.
Impact: What is the blast radius if this technique is successfully executed against your deployment? Reference the deployment context analysis from Step 2. A successful model extraction attack against a standalone inference endpoint has a different impact than the same attack against a fine-tuned model trained on proprietary clinical data.
The output of this step is a threat matrix: ATLAS techniques mapped to your specific deployment, with likelihood and impact assessments for each. This replaces the generic CVE-based threat models that standard IT risk frameworks produce.
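A threat matrix row might look like the sketch below. The two technique IDs are real ATLAS entries (AML.T0051, LLM Prompt Injection; AML.T0024, Exfiltration via ML Inference API), but the asset, scores, and rationales are hypothetical:

```python
# Sketch of Step 3 output: ATLAS techniques mapped to one deployment, with
# likelihood and impact on a simple 1-5 scale. Scores are hypothetical.
threat_matrix = [
    {"technique": "AML.T0051", "name": "LLM Prompt Injection",
     "asset": "support-assistant", "likelihood": 4, "impact": 5,
     "rationale": "unauthenticated users reach the prompt; RAG store is sensitive"},
    {"technique": "AML.T0024", "name": "Exfiltration via ML Inference API",
     "asset": "support-assistant", "likelihood": 2, "impact": 3,
     "rationale": "rate limits in place; base model is publicly documented"},
]

# Rank by likelihood x impact to feed Step 5 prioritisation.
ranked = sorted(threat_matrix,
                key=lambda r: r["likelihood"] * r["impact"], reverse=True)
print(ranked[0]["technique"])  # AML.T0051
```

The rationale field matters: it is what lets a reviewer two quarters later see whether the likelihood estimate still holds after a deployment change.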
Step 4: OWASP LLM Top 10 Coverage Assessment
For LLM deployments specifically, the OWASP LLM Top 10 provides the most operationally specific risk taxonomy available. It functions as both a threat classification and a test checklist.
For each of the ten categories, your assessment team should be able to answer two questions: Have you tested this deployment for this vulnerability class? If yes, what did the test find? Untested categories are unknown risks and should be treated as high until evaluated.
Categories that most commonly produce findings in enterprise deployments: LLM01 (prompt injection, including indirect injection through RAG sources); LLM03 (supply chain, including model file provenance and fine-tuning data sourcing); LLM06 (excessive agency, particularly for agentic deployments with broad tool access); LLM08 (vector and embedding weaknesses, relevant for any RAG deployment).
LLM pentesting methodology covers how to test each category systematically. The OWASP assessment step identifies what needs to be tested; the pentesting methodology provides the test plan.
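The coverage assessment lends itself to a simple tracker where untested categories default to high risk, per the rule above. The test statuses shown are hypothetical:

```python
# Illustrative Step 4 coverage tracker: untested categories default to
# "treat as high risk" until evaluated. Status strings are hypothetical.
OWASP_LLM_TOP10 = ["LLM01", "LLM02", "LLM03", "LLM04", "LLM05",
                   "LLM06", "LLM07", "LLM08", "LLM09", "LLM10"]

tested = {
    "LLM01": "finding: indirect injection via RAG source",
    "LLM06": "no finding: tool allow-list enforced",
}

coverage = {cat: tested.get(cat, "UNTESTED - treat as high risk")
            for cat in OWASP_LLM_TOP10}
untested = [c for c, status in coverage.items() if status.startswith("UNTESTED")]
print(len(untested))  # 8
```

Eight of ten untested is a common real-world starting position, and the tracker makes that gap explicit rather than leaving it implicit in a pile of pentest reports.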
Step 5: Risk Scoring and Prioritisation
With the threat model and OWASP coverage assessment complete, each identified risk needs a score that reflects its actual business impact, not just its technical severity.
CVSS is not well-adapted to AI vulnerabilities. Its scoring dimensions assume deterministic exploitability and fixed attack complexity. A practical AI-specific scoring rubric uses two dimensions:
Exploitability: Does the attack require a single-turn prompt or a multi-turn manipulation sequence? Is the technique publicly documented and reproducible, or does it require novel capability development? Does it require authenticated access or work against unauthenticated endpoints? What is the attack success rate against your specific deployment configuration?
Impact: Does successful exploitation produce read-only information disclosure or action execution? Does the exposed or affected data include personal records, proprietary business data, or system credentials? Are the consequences of a successful attack reversible (an inappropriate chatbot response) or irreversible (deleted production data, exfiltrated credentials, poisoned training data that propagates forward)?
A prompt injection attack with 12% success rate that, when successful, can exfiltrate the system prompt and connected document store should score higher than a theoretical model extraction attack against a publicly documented base model. Blast radius is the right severity multiplier.
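A minimal version of that rubric can be expressed as a function. The weights and the two worked inputs are hypothetical; the structural point is that blast radius multiplies severity rather than adding to it:

```python
# Hypothetical two-dimension scoring rubric for Step 5. Weights are
# illustrative; the point is that blast radius multiplies severity.
def risk_score(success_rate: float, exploit_public: bool,
               action_execution: bool, irreversible: bool) -> float:
    exploitability = success_rate * (2.0 if exploit_public else 1.0)
    impact = 1.0
    if action_execution:
        impact *= 3.0   # acting on systems, not just reading from them
    if irreversible:
        impact *= 2.0   # deleted data, exfiltrated credentials
    return exploitability * impact

# A 12%-success prompt injection that exfiltrates a connected document store...
injection = risk_score(0.12, exploit_public=True,
                       action_execution=True, irreversible=True)
# ...vs. a theoretical extraction attack on a publicly documented base model.
extraction = risk_score(0.02, exploit_public=False,
                        action_execution=False, irreversible=False)
print(injection > extraction)  # True
```

Under a binary CVSS-style model both findings would collapse to "exists", and the ordering that matters for remediation would be lost.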
Step 6: Control Mapping and Gap Analysis
For each high-priority risk, map existing controls and identify what is missing. Three control layers that most AI risk frameworks require:
Pre-deployment validation. Is the model evaluated against the relevant attack classes before it goes into production? For a fine-tuned model, this means adversarial testing covering the ATLAS techniques and OWASP categories most relevant to its deployment context. For a model sourced from a third-party or community repository, this includes behavioural evaluation for backdoor triggers and supply chain integrity checks on model files.
Continuous red teaming. Is the deployed system tested on an ongoing basis as the model, system prompt, RAG sources, and tool integrations evolve? AI red teaming that ran before launch does not cover the risk introduced by a knowledge base update two months later. Continuous coverage is the operational requirement; periodic assessments are a floor, not a ceiling. ARTEMIS provides automated continuous red teaming across the full deployed stack, including RAG pipeline attacks, tool-call hijacking, and multi-turn manipulation sequences.
Runtime monitoring. Is active exploitation being detected and blocked? This layer catches the attack that passes pre-deployment validation and slips through the adversarial test library: novel techniques, context-specific injection paths, and behavioural drift from expected operating patterns. Most organisations will find this is the layer with the largest gap.
The gap analysis output is a control coverage map: high-priority risks on one axis, the three control layers on the other, with a documented assessment of current coverage for each intersection.
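That coverage map is naturally a small matrix. The risk names and coverage states below are hypothetical placeholders:

```python
# Sketch of the Step 6 gap-analysis output: high-priority risks on one axis,
# the three control layers on the other. Coverage states are hypothetical.
LAYERS = ["pre-deployment", "continuous-red-team", "runtime-monitoring"]

coverage_map = {
    "prompt-injection-rag": {"pre-deployment": "covered",
                             "continuous-red-team": "partial",
                             "runtime-monitoring": "gap"},
    "excessive-agency":     {"pre-deployment": "covered",
                             "continuous-red-team": "gap",
                             "runtime-monitoring": "gap"},
}

# Flatten to a worklist of (risk, layer) gaps for Step 7 treatment.
gaps = [(risk, layer)
        for risk, layers in coverage_map.items()
        for layer, state in layers.items() if state == "gap"]
print(len(gaps))  # 3
```

Note that both hypothetical risks show their largest gap at the runtime layer, matching the pattern described above.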
Step 7: Risk Treatment and Remediation Roadmap
For each identified gap, the team determines a treatment approach. Four options:
Accept: Document the risk with a clear rationale, assign a business owner who accepts accountability, and set a review timeline. Acceptance is a legitimate choice for low-impact risks where the control cost exceeds the expected loss. It is not a legitimate default for high-impact risks that have not been formally reviewed.
Mitigate: Implement a specific control that reduces likelihood or impact to an acceptable level. Specify the control, the expected risk reduction, the implementation owner, and the delivery date.
Transfer: Assign contractual responsibility to a vendor, obtain cyber insurance coverage for AI-related incidents, or implement compensating controls that shift the residual risk. Transfer does not eliminate the risk; it changes who bears the financial consequence if it materialises.
Avoid: Do not deploy the capability until the risk can be managed to an acceptable level. For agentic deployments with broad tool access and no runtime monitoring layer, avoidance may be the correct treatment while controls are being built.
The remediation roadmap produced at this step should be tied to business risk prioritisation, not just technical severity. It should have assigned owners, delivery dates, and re-assessment triggers: at minimum, re-assess after any significant model update, system prompt change, new tool integration, or change in the data sources the model can access.
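A treatment record that enforces the four options and the re-assessment triggers can be sketched as follows. The trigger names, owner, and dates are hypothetical:

```python
# Illustrative Step 7 treatment record with re-assessment triggers.
# Field values and trigger names are hypothetical.
TREATMENTS = {"accept", "mitigate", "transfer", "avoid"}
REASSESS_TRIGGERS = {"model-update", "system-prompt-change",
                     "new-tool-integration", "data-source-change"}

def treatment_record(risk: str, treatment: str, owner: str,
                     due: str, triggers: set[str]) -> dict:
    # Reject anything outside the four legitimate treatment options.
    assert treatment in TREATMENTS, f"unknown treatment: {treatment}"
    assert triggers <= REASSESS_TRIGGERS, "unknown re-assessment trigger"
    return {"risk": risk, "treatment": treatment, "owner": owner,
            "due": due, "reassess_on": sorted(triggers)}

record = treatment_record(
    "prompt-injection-rag", "mitigate", "appsec-lead", "2025-Q3",
    {"model-update", "new-tool-integration"},
)
print(record["treatment"])  # mitigate
```

Requiring an owner and a due date in the record itself is what stops "accept" from becoming an unreviewed default for high-impact risks.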
Regulatory Frameworks for AI Risk Assessment
| Framework | Primary function | Operational specificity | Compliance obligation |
|---|---|---|---|
| NIST AI RMF | Governance and risk management structure | High (four functions, detailed profiles) | Voluntary in the US; referenced in federal procurement |
| EU AI Act | Risk-tiered regulation | Medium (defines categories, requires documentation) | Mandatory for high-risk deployments in the EU |
| OWASP LLM Top 10 | LLM-specific threat taxonomy and test checklist | Very high (ten specific risk categories, mapped techniques) | Voluntary; widely adopted as baseline |
| Google SAIF | Secure AI development principles | Medium (six elements, framework-level guidance) | Voluntary |
NIST AI Risk Management Framework (AI RMF)
The NIST AI RMF is the most operationally detailed governance framework for enterprise AI risk management. It organises AI risk activities across four core functions: Govern (establishing accountability structures, roles, and policies), Map (contextualising the AI system and identifying relevant risk categories), Measure (analysing and prioritising identified risks), and Manage (implementing treatments and maintaining ongoing monitoring).
The AI RMF is a governance structure, not a technical test plan. It tells you what categories of risk to assess and how to organise accountability for that assessment. It does not specify how to test for prompt injection, what techniques to use for adversarial evaluation, or what constitutes adequate coverage of the OWASP LLM Top 10. An organisation that has completed the AI RMF governance structure but has not run adversarial testing has documented its risk management process without actually measuring its risk.
Both are necessary. The AI RMF provides the accountability structure that makes adversarial testing findings actionable. Adversarial testing provides the empirical data that makes the AI RMF's risk measurement function meaningful.
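One way to keep the two aligned is to record which steps of the framework above feed each AI RMF function. This pairing is the post's own interpretation, not a mapping published by NIST:

```python
# Rough mapping from NIST AI RMF functions to the seven-step framework above.
# The pairing is this post's interpretation, not an official NIST crosswalk.
RMF_TO_STEPS = {
    "Govern":  [1, 2],     # ownership and accountability per inventoried asset
    "Map":     [1, 2],     # asset inventory and deployment context analysis
    "Measure": [3, 4, 5],  # threat modelling, OWASP coverage, risk scoring
    "Manage":  [6, 7],     # control gap analysis, treatment, re-assessment
}

covered = sorted({step for steps in RMF_TO_STEPS.values() for step in steps})
print(covered)  # [1, 2, 3, 4, 5, 6, 7]
```

A quick completeness check like the last line confirms every step has a home in the governance structure, so adversarial findings always have an accountable function to land in.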
EU AI Act
The EU AI Act introduces risk-tiered regulation across four categories: unacceptable risk (banned systems, including real-time biometric surveillance and social scoring), high risk (extensive compliance obligations), limited risk (transparency requirements), and minimal risk (no specific obligations beyond general law).
High-risk categories include AI deployed in critical infrastructure, employment and workforce management, education, law enforcement, essential private and public services (including credit scoring), and AI as a safety component in regulated products. For organisations with high-risk deployments, risk assessment documentation is a legal compliance requirement. The specific documentation includes a risk management system that runs throughout the system lifecycle, conformity assessments, and post-market monitoring processes.
Practical implication: if your AI deployment falls in a high-risk category, the seven-step framework above satisfies the technical assessment requirements that feed into your EU AI Act compliance documentation. The NIST AI RMF governance structure maps well to the organisational requirements.
OWASP LLM Top 10
The OWASP LLM Top 10 is the operationally specific risk taxonomy for LLM application security. Unlike the governance-level frameworks above, it provides a direct test checklist: each of the ten categories has documented attack scenarios, technique descriptions, and mitigation guidance. The 2025/2026 edition updates prior versions with new risk categories reflecting the current deployment landscape, including vector and embedding weaknesses (LLM08) and system prompt leakage (LLM07).
Every LLM deployment risk assessment should produce a documented coverage assessment against all ten categories. The output is a clear picture of which risk classes have been tested, which have not, and what the findings were. Regulators and enterprise security reviews are increasingly asking for this documentation specifically.
Google SAIF (Secure AI Framework)
Google's Secure AI Framework organises AI security requirements across six core elements: expanding strong security foundations to the AI ecosystem; extending detection and response to bring AI into scope of existing security operations; automating defences to keep pace with threats that evolve faster than human review cycles; harmonising platform-level controls across the infrastructure running AI systems; adapting controls to address AI-specific risks (model behaviour, training pipeline, supply chain); and contextualising AI risk within surrounding business processes.
SAIF is principles-based rather than prescriptive. It is most useful as a cross-check after completing the seven-step assessment framework: do your controls satisfy each of the six elements? The most commonly missing element for organisations new to AI security is the fifth: adapting controls to address AI-specific risks as distinct from the infrastructure-level controls that most security programs already have.
Common AI Risk Assessment Mistakes
Treating AI risk as a subset of existing software risk. The methodology for assessing a deterministic web application does not transfer to a probabilistic ML system. Using CVSS to score jailbreak techniques, running dependency scanning on a model checkpoint, and classifying prompt injection as an injection vulnerability in the traditional AppSec sense all produce plausible-looking but misleading risk documentation.
Assessing the base model without testing the deployed configuration. A foundation model that passes a standard safety benchmark in isolation may behave unsafely when connected to a specific RAG knowledge base, given a particular system prompt, or granted access to a specific set of tools. The risk is in the deployment, not the model in isolation. Assessment that does not cover the full deployed stack produces findings that do not reflect actual production risk.
Running a one-time assessment without a re-assessment trigger. AI systems change continuously: model versions update, system prompts are adjusted, knowledge bases are refreshed, new tool integrations are added. Each change potentially alters the attack surface. A risk assessment completed at initial deployment expires the first time any of these change.
Scoring risk without establishing blast radius first. A low-probability prompt injection attack that, when it succeeds, can exfiltrate an entire connected document store is not a low-severity finding. Severity is probability times impact; an assessment that does not establish blast radius in Step 2 cannot produce calibrated severity scores in Step 5.
Treating compliance framework coverage as security coverage. Completing the NIST AI RMF governance documentation, passing an EU AI Act conformity assessment, and mapping controls to OWASP LLM Top 10 categories are compliance activities. They confirm that a process exists. They do not confirm that adversarial tests were run, that novel attack chains were discovered, or that the deployed system is actually resistant to the attack classes it will face in production.
Repello's AI Inventory automates Step 1, discovering every AI asset including shadow deployments, and continuously maintains the asset map as the environment evolves. ARTEMIS handles Steps 3, 4, and 6: automated adversarial evaluation against ATLAS techniques and OWASP LLM Top 10 categories, continuous red teaming against the live deployed stack, and gap identification with blast-radius-weighted prioritisation.
See how Repello's platform supports the full AI risk assessment framework.
Frequently Asked Questions
What is AI risk assessment and why is it different from standard IT risk assessment? AI risk assessment is the structured process of identifying, analysing, and prioritising risks introduced by AI systems across their full lifecycle. It differs from standard IT risk assessment on three dimensions: AI systems have probabilistic failure modes rather than binary vulnerabilities; training pipelines are attack surfaces with no equivalent in conventional IT; and AI risk is contextual, varying with what data and tools a deployment can access. A standard IT risk assessment methodology misses all three dimensions and produces risk documentation that looks credible but understates the actual attack surface.
What is the NIST AI Risk Management Framework and how do I use it? The NIST AI RMF organises AI risk activities across four functions: Govern (accountability structures), Map (context and risk identification), Measure (analysis and prioritisation), and Manage (treatment and monitoring). It is a governance framework, not a technical test plan: it tells you what categories of risk to address and how to organise accountability for managing them. To use it effectively, pair it with technical test methodologies for each risk category. The AI RMF Govern and Map functions map to Steps 1 and 2 of the framework above; Measure maps to Steps 3 through 5; Manage maps to Steps 6 and 7.
Which regulatory frameworks require AI risk assessment documentation? The EU AI Act mandates risk assessment documentation for high-risk AI deployments, including a risk management system that runs throughout the system lifecycle and conformity assessment records. High-risk categories include AI in critical infrastructure, employment, education, law enforcement, and essential services. In the US, the NIST AI RMF is voluntary but is referenced in federal procurement and is increasingly expected by enterprise customers in procurement questionnaires. Sector-specific requirements add to the baseline: HIPAA for healthcare AI, financial services regulations for AI in credit and fraud detection, and FedRAMP for AI systems processing government data.
How often should you conduct an AI risk assessment? A full AI risk assessment should run at initial deployment and then be re-triggered by any significant change: model version update, system prompt modification, new tool integration, RAG knowledge base update, change in the user population, or change in the data sources the model can access. In practice, for AI systems that change frequently, this means continuous adversarial testing rather than periodic formal assessments. A re-assessment cycle based on a calendar (quarterly, annually) rather than a change trigger will routinely miss risk introduced between assessment dates.
What is the difference between AI risk assessment and AI red teaming? AI risk assessment is the analytical process of identifying and prioritising what risks exist. AI red teaming is the adversarial testing activity that produces empirical evidence about which risks are exploitable and what their actual impact is. Risk assessment without red teaming produces a risk register based on theoretical exposure. Red teaming without a prior risk assessment may test the wrong things, miss high-priority attack surfaces, and produce findings with no risk prioritisation framework to interpret them. The seven-step framework above treats red teaming as the primary measurement mechanism in Steps 4 and 6, rather than a standalone activity.
What tools are available for AI risk assessment? The tools landscape spans three categories: asset discovery (automated identification of AI deployments including shadow AI, building the asset inventory for Step 1); adversarial testing platforms (automated red teaming against the OWASP LLM Top 10 and MITRE ATLAS technique taxonomy, providing empirical risk measurement for Steps 3 through 5); and runtime monitoring (detection and blocking of active exploitation, closing the gap identified in Step 6). Repello's AI Inventory addresses asset discovery. ARTEMIS addresses adversarial testing. Repello's ARGUS addresses runtime monitoring. Governance documentation tools (GRC platforms that support NIST AI RMF profile management) form a fourth category relevant to organisations with formal compliance documentation requirements.