
Account Compromise, Phantom Dependency, No Forensic Trail: The Attack Pattern Behind LiteLLM, axios, and Mercor

Aaryan Bhujang, AI security researcher


6 min read


TL;DR: Three high-profile supply chain incidents in Q1 2026 — LiteLLM, axios, and Mercor — followed the same four-step methodology: compromise a trusted maintainer account, publish a malicious package version with a phantom dependency, let automated pipelines do the distribution, then self-delete after credential harvest. Each attack targeted AI development teams specifically. The methodology is now documented enough to treat as a template. If you haven't mapped your AI pipeline's exposure to this pattern, that work is overdue.

Three incidents, one methodology

These three incidents are typically covered as separate news stories. They are the same attack.

LiteLLM (January 2026): Attackers obtained the PyPI credentials of a TeamPCP contributor and published versions 1.82.7 and 1.82.8 with a three-stage backdoor. Stage one collected environment variables. Stage two harvested API credentials and cloud provider tokens. Stage three established persistent access. LiteLLM is a dependency in thousands of AI application stacks; a single compromised version reached a significant portion of teams building on top of OpenAI, Anthropic, and other model providers. Full breakdown: our LiteLLM supply chain attack analysis.

axios (March 31, 2026): Attackers attributed to UNC1069, a North Korean state-sponsored threat cluster, compromised the npm account of jasonsaayman, a core axios maintainer. They published axios@1.14.1 and axios@0.30.4 carrying a phantom dependency, plain-crypto-js@4.2.1, which delivered a cross-platform RAT to macOS, Windows, and Linux. Huntress telemetry recorded the first infected endpoint 89 seconds after publication. The payload self-deleted after execution and replaced the package manifest with a clean version. Within the 74-minute window the packages were live, 135 Huntress-monitored endpoints were confirmed compromised. Full breakdown: our axios supply chain attack analysis.

Mercor (Q1 2026): Lapsus$ claimed a 4TB breach of Mercor, an AI-powered hiring platform: 939GB of source code, 211GB of database content, and approximately 3TB of files including video interviews and KYC passport documents. The breach path ran through the LiteLLM backdoor, a direct demonstration of how a compromised AI dependency translates into a downstream organizational breach. Full breakdown: our Mercor breach analysis.

The four-step template

Strip away the vendor names and platform specifics and the methodology is identical across all three:

Step 1: Compromise a trusted account. The entry point in every case was a legitimate maintainer or contributor account, not a vulnerability in the package itself. PyPI credentials via TeamPCP, an npm account via credential theft, a developer account with production access. The trust signal that makes open source packages usable (a known author publishing a new version) is the attack surface.

Step 2: Publish a malicious version with a phantom dependency. The malicious payload is not placed directly in the package code. It is introduced as a new dependency that did not previously exist: plain-crypto-js@4.2.1 in the axios case. This separates the malicious code from the trusted package name, reduces the signal that automated scanners key on, and allows the dropper to be hosted on attacker-controlled infrastructure.
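A phantom dependency is detectable in principle by diffing the dependency set of consecutive versions of a package. A minimal sketch of that diff (the manifest contents below are illustrative, not the actual axios manifests):

```python
def new_dependencies(old_deps: dict, new_deps: dict) -> dict:
    """Return dependencies present in the new manifest but absent from the old."""
    return {name: ver for name, ver in new_deps.items() if name not in old_deps}

# Illustrative manifests: a prior axios release vs. the malicious axios@1.14.1
old = {"follow-redirects": "^1.15.6", "form-data": "^4.0.0"}
new = {**old, "plain-crypto-js": "4.2.1"}  # the phantom dependency

print(new_dependencies(old, new))  # {'plain-crypto-js': '4.2.1'}
```

Any dependency appearing in this diff that has no prior publication history deserves a hold before it reaches a build environment.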

Step 3: Let automated pipelines distribute it. CI/CD systems pull the latest matching version on every build. There is no human review step between a new package publication and deployment to a build environment. The 89-second infection window in the axios case is not exceptional; it is the expected behavior of any automated pipeline set to pull fresh dependencies.
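The reason pipelines pick up a malicious release automatically is ordinary semver range resolution: a caret range such as ^1.14.0 matches any later 1.x release, so the next fresh build installs 1.14.1 without anyone opting in. A simplified sketch of that matching logic (real resolvers also handle pre-releases and 0.x caret semantics differently):

```python
def caret_match(range_base: str, candidate: str) -> bool:
    """Simplified npm caret-range check: same major version, candidate >= base."""
    base = tuple(int(p) for p in range_base.split("."))
    cand = tuple(int(p) for p in candidate.split("."))
    return cand[0] == base[0] and cand >= base

print(caret_match("1.14.0", "1.14.1"))  # True: a fresh CI build pulls the new release
print(caret_match("1.14.0", "2.0.0"))   # False: major version bump is excluded
```

Pinning exact versions (with hashes, where the ecosystem supports them) converts this automatic pull into an explicit, reviewable upgrade step.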

Step 4: Harvest credentials, then clean up. After execution, the dropper deletes itself and restores a clean package manifest. The installed package looks legitimate on post-mortem inspection. The persistence mechanism (where one exists) is designed to blend in: Apple-style LaunchDaemon naming on macOS, a MicrosoftUpdate registry key on Windows. On Linux CI/CD runners, no persistence is needed. The value is in the environment variables available during the build, and those are harvested in real time before the runner terminates.
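Blend-in persistence of this kind can sometimes be caught by a location heuristic rather than a signature: on macOS, Apple's own daemons live under /System, so an Apple-style label in the admin-writable /Library/LaunchDaemons is a red flag. A sketch of that heuristic (the file names below are illustrative, not the actual IOCs from these incidents; consult the individual incident analyses for real indicator lists):

```python
def flag_masquerading_daemons(plist_paths: list) -> list:
    """Flag Apple-style LaunchDaemon labels installed in the admin-writable
    location, where Apple itself does not install them."""
    return [p for p in plist_paths
            if p.startswith("/Library/LaunchDaemons/com.apple.")]

paths = [
    "/Library/LaunchDaemons/com.apple.softwareupdated.plist",        # masquerading
    "/System/Library/LaunchDaemons/com.apple.mDNSResponder.plist",   # legitimate
]
print(flag_masquerading_daemons(paths))
```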

Why AI teams are the specific target

This pattern is not incidentally hitting AI teams. It is targeting them.

The economics are straightforward. A compromised build environment at a standard software company yields cloud credentials and internal system access. A compromised build environment at an AI company yields those same credentials plus API keys for model providers: keys that can run inference at scale, be sold, and cost the victim money with every request made under the harvested credential. OpenAI API keys stolen from a build pipeline are immediately monetizable. That increases attacker return per compromise.

AI development teams also tend to operate with higher velocity and lower security friction than core infrastructure teams. The same cultural characteristics that make AI teams ship fast (rapid iteration, broad permissions for developers, new tools adopted quickly) make them a softer target in the supply chain context.

Finally, the dependency graph for AI applications is dense. LiteLLM alone is a transitive dependency in a significant portion of Python-based AI stacks. A single compromised package at that layer of the graph has enormous reach without requiring the attacker to compromise each downstream organization individually.

What your scanners are missing

Standard software composition analysis tools operate on known-malicious signatures. They check installed packages against CVE databases and known-bad package lists. That approach has a structural gap: it cannot catch a phantom dependency that was created the same day it was deployed.

plain-crypto-js@4.2.1 did not exist before March 31, 2026. It had no CVE. It had no entry in any threat intelligence feed. A scanner checking it against known-malicious packages would return clean.

The same gap applies to any novel package introduced via maintainer account compromise. The package is signed by a legitimate maintainer key. It installs without error. It has no prior malicious history. The only detection approach that works during the attack window is behavioral: does this dependency make unexpected network connections, modify files it has no reason to touch, or write persistence mechanisms?
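One cheap approximation of behavioral review, run before install rather than after, is to statically flag code in a new dependency that touches the network, spawns processes, or loads native code. A sketch using Python's ast module (the watchlist here is an illustrative starting point, not a complete one, and static flags are a triage signal rather than a verdict):

```python
import ast

# Illustrative watchlist of modules a crypto-utility package has no reason to need
WATCHLIST = {"socket", "urllib", "requests", "subprocess", "ctypes"}

def flagged_imports(source: str) -> set:
    """Return watchlisted modules imported by the given source code."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found |= {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & WATCHLIST

sample = "import os\nimport socket\nfrom subprocess import run\n"
print(sorted(flagged_imports(sample)))  # ['socket', 'subprocess']
```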

That requires pre-production behavioral testing, not post-install signature scanning.

The detection gap and how to close it

Three controls map directly to the four-step attack template.

Version drift monitoring

This closes the gap at Step 2. An inventory system that tracks the exact version state of every dependency across your AI applications will flag unexpected version changes before installation. This covers both the new malicious version and the introduction of the phantom dependency. Repello AI's AI Inventory continuously monitors dependency version state across your AI application stack and alerts on drift, giving your team a detection window before CI/CD pulls and executes the payload.
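The core of version drift monitoring can be sketched in a few lines: snapshot the exact installed versions, then diff the current state against a stored baseline. A minimal sketch (the baseline and version numbers below are illustrative):

```python
from importlib.metadata import distributions

def snapshot() -> dict:
    """Capture the exact version of every installed distribution."""
    return {d.metadata["Name"]: d.version for d in distributions()}

def drift(baseline: dict, current: dict) -> dict:
    """Report packages that changed version or newly appeared since the baseline."""
    return {name: (baseline.get(name), ver)
            for name, ver in current.items()
            if baseline.get(name) != ver}

# Illustrative: a baseline taken yesterday vs. today's state
baseline = {"litellm": "1.82.6", "requests": "2.32.3"}
current  = {"litellm": "1.82.7", "requests": "2.32.3", "plain-crypto-js": "4.2.1"}
print(drift(baseline, current))
# {'litellm': ('1.82.6', '1.82.7'), 'plain-crypto-js': (None, '4.2.1')}
```

The value is in alerting on the diff before installation, not after; a production system would run this against lockfiles or registry metadata rather than an already-infected environment.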

Pre-production behavioral testing

This closes the gap at Step 3. Running new or updated dependencies in an instrumented sandbox environment before they reach your build pipeline exposes C2 beacons, credential access, file modifications, and persistence attempts. ARTEMIS includes supply chain testing within its attack simulation coverage, executing dependencies in controlled environments and surfacing behavioral anomalies that signature-based scanners will not catch.

Runtime credential monitoring

This closes the gap at Step 4. If a malicious dependency executes in a production or staging environment, the C2 connection and credential harvest are runtime behavioral events. ARGUS monitors for anomalous outbound connections, unexpected credential access patterns, and behavioral deviations in your AI application runtime. It catches threats even after the dependency boundary has been crossed.
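The simplest form of this runtime check is an egress allowlist: any outbound destination not on the expected list is a candidate C2 beacon. A sketch of the comparison (the allowlist is illustrative; real telemetry would come from network monitoring, not a hardcoded list):

```python
# Illustrative allowlist of hosts an AI build or runtime is expected to reach
ALLOWED_HOSTS = {"api.openai.com", "api.anthropic.com", "pypi.org"}

def unexpected_connections(observed_hosts: list) -> list:
    """Flag outbound destinations not on the allowlist."""
    return sorted(set(observed_hosts) - ALLOWED_HOSTS)

observed = ["api.openai.com", "203.0.113.7", "api.openai.com"]
print(unexpected_connections(observed))  # ['203.0.113.7']
```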

The three controls map to the three phases where the attack can still be stopped: before installation, before execution, and after execution.

What to expect next

The LiteLLM, axios, and Mercor incidents are not isolated. They are early data points in a pattern that will accelerate as AI application deployment scales.

The attack template is documented, the target selection criteria are clear, and the economics favor continued investment by threat actors. Expect the same methodology applied to model serving libraries, vector database clients, LangChain integrations, and any other package with high installation counts in AI engineering environments.

Organizations that treat supply chain security as a dependency scanner checkbox will encounter this pattern again. Organizations that instrument their AI pipelines with inventory monitoring, behavioral pre-production testing, and runtime anomaly detection will catch the next iteration before it executes.

For a full framework covering the complete AI supply chain attack surface, including model weights, training data pipelines, and the MCP plugin ecosystem, see AI Supply Chain Attacks: The Complete Guide for Security Teams.

FAQ

Are these three incidents definitively linked to the same threat actor? LiteLLM and axios share overlapping attribution signals pointing to North Korean state-sponsored clusters (UNC1069/NICKEL GLADSTONE/BlueNoroff). The Mercor breach ran through the LiteLLM backdoor, making it a downstream consequence rather than a separate intrusion. Whether a single actor orchestrated all three or the LiteLLM backdoor was exploited opportunistically by Lapsus$ remains an open attribution question. The attack methodology is consistent regardless of attribution.

How do phantom dependencies differ from dependency confusion attacks? Dependency confusion attacks exploit the package manager's resolution logic to substitute a private internal package with a public malicious one of the same name. Phantom dependencies are novel packages that did not previously exist, introduced into a legitimate package's manifest via maintainer account compromise. Both are supply chain vectors but they require different mitigations: namespace controls address dependency confusion; version drift monitoring and behavioral testing address phantom dependencies.

What should we do immediately if we ran LiteLLM 1.82.7/1.82.8 or axios 1.14.1/0.30.4? Treat the environments that ran those versions as compromised. Rotate all credentials, API keys, cloud provider tokens, and secrets that were accessible to the build process. Check for persistence indicators specific to each variant (see the individual incident analyses). Audit outbound network connections from affected machines for C2 beacons. Do not rely on a clean post-mortem scan of the package as evidence of non-compromise; the dropper self-deletes.

Do these attacks affect organizations that don't use Python or JavaScript? The LiteLLM, axios, and Mercor incidents involved PyPI and npm packages. However, the methodology (maintainer account compromise leading to a malicious version publication) applies to any package ecosystem: Go modules, Ruby gems, Maven artifacts, Docker Hub images, Hugging Face model repositories. The attack surface is any dependency you consume from a public registry without behavioral verification.

How do we get visibility into which AI dependencies are in our stack? Most organizations have poor visibility into their transitive AI dependency graph. They know what they explicitly installed, not what those packages pulled in. Repello AI's AI Inventory maps the full dependency graph for your AI applications, including transitive dependencies, and maintains a live version state that alerts on unexpected changes. Book a demo to see how it covers your specific stack.


8 The Green, Ste A
Dover, DE 19901, United States of America


© Repello Inc. All rights reserved.
