Back to all blogs

MCP vs CLI: What Perplexity's Move Actually Means for AI Security Teams

MCP vs CLI: What Perplexity's Move Actually Means for AI Security Teams

Archisman Pal

Archisman Pal

|

Head of GTM

Head of GTM

|

6 min read

MCP vs CLI: What Perplexity's Move Actually Means for AI Security Teams
Repello tech background with grid pattern symbolizing AI security


TL;DR: Perplexity co-founder and CTO Denis Yarats announced at the Ask 2026 developer conference that the company is moving away from MCP internally, citing a 72% context window consumption problem and authentication friction. Cloudflare reached the same conclusion and built a code-generation alternative. The engineering case against MCP as a tool-calling protocol is real. But the security debate that followed largely misses the point: switching from MCP to CLI does not eliminate prompt injection, does not fix privilege escalation risks, and does not reduce your attack surface. It changes the protocol; it does not change the threat model.

What Perplexity's CTO actually said

At Perplexity's Ask 2026 developer conference in March 2026, Denis Yarats announced that Perplexity is moving away from MCP for its internal and enterprise-facing systems, replacing it with direct REST API calls and command-line interfaces. His stated reasons were specific and technical: MCP's tool schema definitions consume context window space before the agent processes a single user message, authentication across multiple MCP servers introduces friction that degrades reliability at scale, and most MCP features are simply never used in production deployments.

The 72% figure that circulated following the announcement refers to context window overhead: in a typical MCP-heavy deployment, tool schemas and protocol overhead consume up to 72% of available context before any user intent is processed. For a production system handling thousands of sessions, that is both a cost problem and a capability problem: the model has less context available for reasoning about the actual task.

Cloudflare independently arrived at a similar conclusion. Their Code Mode blog post describes replacing MCP's tool-calling mechanism with code generation: rather than exposing 2,500 API endpoints as individual MCP tools (which would require roughly 244,000 tokens to describe), they surface the same functionality using approximately 1,000 tokens by having the model write code against an already-authorized API client. The token reduction is 244x. That is not a marginal engineering optimization; it is a fundamental limitation of MCP's tool schema design at scale.

Y Combinator CEO Garry Tan made a similar move independently, building a CLI-based agent integration rather than using MCP, citing reliability and speed as the primary drivers.

The engineering case against MCP as a universal tool protocol

The criticism from Perplexity and Cloudflare is not that MCP is a bad idea. It is that MCP is solving a specific problem (structured tool discovery and invocation) with a design that creates significant overhead at the scale and complexity of production agent deployments.

MCP's tool schema design was built for discoverability: the protocol lets an agent enumerate available tools, understand their parameters, and call them dynamically. That design is genuinely useful for local development tooling, IDE integrations, and cases where an agent needs to discover capabilities it did not know about at design time. Claude Desktop's MCP integration is a good example of the right use case: a user's local environment has tools the agent should be able to discover and use.

For production API integrations where the available tools are known, fixed, and controlled by the operator, MCP's discoverability overhead is pure cost. A direct API call or a CLI command is faster, cheaper, uses less context, and is easier to authenticate and audit. Yarats' point about most MCP features going unused in production is consistent with this: discoverability is a development-time feature, not a runtime requirement.

So the engineering conclusion is nuanced: MCP is the right tool for some use cases (dynamic tool discovery, local tooling, developer environments) and the wrong tool for others (production API integration at scale, systems with fixed tool sets, latency-sensitive deployments).

What the CLI migration does not fix: the security argument

The security framing of the MCP vs CLI debate has largely focused on MCP's attack surface. That framing is partially correct but incomplete in a way that matters for enterprise security teams.

MCP does introduce specific security risks. Repello's research on MCP tool poisoning to RCE demonstrated a complete exploitation chain from a malicious tool definition to remote code execution. The MCP prompt injection analysis documented how adversarial instructions embedded in tool responses hijack agent behavior. Repello's MCP security checklist covers the 12 controls required before any MCP deployment goes to production. These risks are real.

But none of them are protocol-specific in the way the CLI migration narrative implies.

Prompt injection through tool responses exists in CLI-based systems. The injection surface is not the JSON-RPC transport layer. It is the content returned by whatever tool the agent calls. A CLI command that fetches a web page, queries a database, or reads a file returns attacker-influenced content to the model's context window regardless of whether that content arrived via MCP or a subprocess call. The attack: plant adversarial instructions in any content the tool retrieves. The delivery mechanism is irrelevant.

Privilege escalation follows tool permissions, not protocol design. An agent with a CLI tool that can write files, send network requests, or execute shell commands has exactly the same blast-radius problem as an MCP-connected agent with equivalent permissions. The threat model that Cloudflare described as "giving a junior hire a master key to every office on their first day" applies equally to a CLI agent with broad subprocess permissions. Least privilege, sandbox isolation, and tool access controls are architectural requirements that no protocol choice eliminates.

Authentication complexity shifts, it does not disappear. Yarats identified authentication friction as a key reason to move away from MCP. That friction is real: managing credentials across multiple MCP servers is harder than a single API key. But CLI-based agents that call multiple external services have their own credential management surface. The credential is stored differently and accessed differently; the risk of credential exposure through prompt injection or misconfiguration is structurally the same.

"The question is not which protocol the tool call travels over," says the Repello AI Research Team. "The question is whether the content that comes back from the tool gets inspected before it reaches the model. That question has the same answer regardless of whether you're running MCP, CLI, or direct REST."

Is this the end of MCP?

No. But it is a meaningful correction of the narrative that MCP is the universal standard for AI agent tool integration.

MCP will likely consolidate around the use cases it was designed for: local tool discovery, IDE integrations, developer environments, and cases where dynamic capability enumeration adds genuine value. For these use cases, the context overhead is acceptable and the discoverability benefit is real. The continued support for MCP in Claude Desktop, Cursor, and VS Code reflects this.

For production-scale enterprise deployments with fixed tool sets, known APIs, and throughput requirements, direct API integration or CLI-based approaches will continue to be the practical choice. Perplexity's Agent API, which replaced their MCP-heavy internal architecture, is the production pattern: a single authenticated endpoint, controlled tool execution, model selection on the caller's side.

Cloudflare's code-generation approach represents a third path: keep MCP for discovery and connection management, but replace its tool-calling mechanism with generated code that executes against a pre-authorized client. This gets the token efficiency of direct API calls while retaining MCP's tool discovery model. Their conclusion that this also improves security by preventing the model from handling raw API credentials directly is worth noting: the pre-authorized client pattern means injection into the model's context cannot yield credential extraction.

What this means for enterprise security teams

For security engineers evaluating agentic AI deployments, the MCP vs CLI debate should change one thing: the scope of your attack surface assessment.

If your team built its threat model around MCP-specific risks (tool poisoning via malicious server definitions, JSON-RPC manipulation, schema injection), that model is incomplete if the organization also runs CLI-based agents. The injection, privilege escalation, and audit logging requirements apply to all agentic tool integrations regardless of transport. The NIST AI Risk Management Framework (AI RMF 1.0) treats agentic tool access as a continuous monitoring requirement, not a protocol-level configuration.

The practical implication: your red team test plan needs to cover tool response injection across every tool integration path, not just MCP servers. A CLI subprocess that returns web-scraped content is an injection surface. A REST API response that includes attacker-controlled strings is an injection surface. Protocol diversity increases the surface you need to cover, not the surface you need to worry about per protocol.

ARGUS enforces content inspection at the model context layer rather than the protocol layer, which means its coverage does not change when the underlying tool transport changes from MCP to CLI or REST. ARTEMIS runs injection probes across all tool integration paths in the test plan, flagging which response channels have been tested and which remain uncovered regardless of their transport mechanism.

The protocol debate will continue. The attack surface does not care which side wins.

Frequently asked questions

Why is Perplexity moving away from MCP?

Perplexity co-founder and CTO Denis Yarats announced at the Ask 2026 developer conference that MCP's tool schema overhead consumes up to 72% of available context window space before the agent processes any user input. Combined with authentication complexity across multiple MCP servers and the fact that most MCP features go unused in production, the protocol adds cost and latency without proportionate value for Perplexity's use case. Their replacement is a direct Agent API with a single authenticated endpoint and internal tool execution.

Is MCP dead?

No. MCP remains well-suited for its original design purpose: dynamic tool discovery in local environments, IDE integrations, and developer tooling where capability enumeration adds genuine value. The context overhead that makes MCP impractical for production-scale deployments with fixed tool sets is acceptable in these use cases. What is ending is the assumption that MCP is the universal standard for all AI agent tool integration.

Is CLI more secure than MCP?

Not inherently. The primary security risks in agentic tool use (prompt injection through tool responses, privilege escalation through overpermissioned tools, and insufficient audit logging) apply equally to CLI-based and MCP-based integrations. Both deliver attacker-influenced content to the model's context window; both require least-privilege tool design, input/output inspection, and sandbox isolation. The transport protocol does not determine the security posture.

What is Cloudflare's Code Mode and how does it relate to MCP?

Cloudflare's Code Mode replaces MCP's tool-calling mechanism with code generation: rather than exposing API endpoints as individually described MCP tools (requiring up to 244,000 tokens for 2,500 endpoints), the model writes code against a pre-authorized API client, covering the same functionality in approximately 1,000 tokens. The pre-authorized client pattern also prevents the model from handling raw API credentials directly, reducing credential exposure risk from prompt injection.

Does switching from MCP to CLI reduce prompt injection risk?

No. Prompt injection through tool responses is determined by whether attacker-influenced content can enter the model's context window through any tool output, not by which protocol delivers that content. A CLI tool that fetches a web page, reads a file, or queries an external API returns content that can contain adversarial instructions regardless of transport. Effective defense requires inspecting all tool response content before it reaches the model, applied equally across MCP, CLI, and REST integrations.

What should enterprise security teams do in response to this debate?

Extend your threat model to cover all tool integration paths, not just MCP servers. If your organization runs both MCP-based and CLI-based agents, your red team test plan and runtime monitoring should cover tool response injection across all channels. Protocol diversity does not reduce your attack surface; it increases the surface you need to explicitly test and monitor.


TL;DR: Perplexity co-founder and CTO Denis Yarats announced at the Ask 2026 developer conference that the company is moving away from MCP internally, citing a 72% context window consumption problem and authentication friction. Cloudflare reached the same conclusion and built a code-generation alternative. The engineering case against MCP as a tool-calling protocol is real. But the security debate that followed largely misses the point: switching from MCP to CLI does not eliminate prompt injection, does not fix privilege escalation risks, and does not reduce your attack surface. It changes the protocol; it does not change the threat model.

What Perplexity's CTO actually said

At Perplexity's Ask 2026 developer conference in March 2026, Denis Yarats announced that Perplexity is moving away from MCP for its internal and enterprise-facing systems, replacing it with direct REST API calls and command-line interfaces. His stated reasons were specific and technical: MCP's tool schema definitions consume context window space before the agent processes a single user message, authentication across multiple MCP servers introduces friction that degrades reliability at scale, and most MCP features are simply never used in production deployments.

The 72% figure that circulated following the announcement refers to context window overhead: in a typical MCP-heavy deployment, tool schemas and protocol overhead consume up to 72% of available context before any user intent is processed. For a production system handling thousands of sessions, that is both a cost problem and a capability problem: the model has less context available for reasoning about the actual task.

Cloudflare independently arrived at a similar conclusion. Their Code Mode blog post describes replacing MCP's tool-calling mechanism with code generation: rather than exposing 2,500 API endpoints as individual MCP tools (which would require roughly 244,000 tokens to describe), they surface the same functionality using approximately 1,000 tokens by having the model write code against an already-authorized API client. The token reduction is 244x. That is not a marginal engineering optimization; it is a fundamental limitation of MCP's tool schema design at scale.

Y Combinator CEO Garry Tan made a similar move independently, building a CLI-based agent integration rather than using MCP, citing reliability and speed as the primary drivers.

The engineering case against MCP as a universal tool protocol

The criticism from Perplexity and Cloudflare is not that MCP is a bad idea. It is that MCP is solving a specific problem (structured tool discovery and invocation) with a design that creates significant overhead at the scale and complexity of production agent deployments.

MCP's tool schema design was built for discoverability: the protocol lets an agent enumerate available tools, understand their parameters, and call them dynamically. That design is genuinely useful for local development tooling, IDE integrations, and cases where an agent needs to discover capabilities it did not know about at design time. Claude Desktop's MCP integration is a good example of the right use case: a user's local environment has tools the agent should be able to discover and use.

For production API integrations where the available tools are known, fixed, and controlled by the operator, MCP's discoverability overhead is pure cost. A direct API call or a CLI command is faster, cheaper, uses less context, and is easier to authenticate and audit. Yarats' point about most MCP features going unused in production is consistent with this: discoverability is a development-time feature, not a runtime requirement.

So the engineering conclusion is nuanced: MCP is the right tool for some use cases (dynamic tool discovery, local tooling, developer environments) and the wrong tool for others (production API integration at scale, systems with fixed tool sets, latency-sensitive deployments).

What the CLI migration does not fix: the security argument

The security framing of the MCP vs CLI debate has largely focused on MCP's attack surface. That framing is partially correct but incomplete in a way that matters for enterprise security teams.

MCP does introduce specific security risks. Repello's research on MCP tool poisoning to RCE demonstrated a complete exploitation chain from a malicious tool definition to remote code execution. The MCP prompt injection analysis documented how adversarial instructions embedded in tool responses hijack agent behavior. Repello's MCP security checklist covers the 12 controls required before any MCP deployment goes to production. These risks are real.

But none of them are protocol-specific in the way the CLI migration narrative implies.

Prompt injection through tool responses exists in CLI-based systems. The injection surface is not the JSON-RPC transport layer. It is the content returned by whatever tool the agent calls. A CLI command that fetches a web page, queries a database, or reads a file returns attacker-influenced content to the model's context window regardless of whether that content arrived via MCP or a subprocess call. The attack: plant adversarial instructions in any content the tool retrieves. The delivery mechanism is irrelevant.

Privilege escalation follows tool permissions, not protocol design. An agent with a CLI tool that can write files, send network requests, or execute shell commands has exactly the same blast-radius problem as an MCP-connected agent with equivalent permissions. The threat model that Cloudflare described as "giving a junior hire a master key to every office on their first day" applies equally to a CLI agent with broad subprocess permissions. Least privilege, sandbox isolation, and tool access controls are architectural requirements that no protocol choice eliminates.

Authentication complexity shifts, it does not disappear. Yarats identified authentication friction as a key reason to move away from MCP. That friction is real: managing credentials across multiple MCP servers is harder than a single API key. But CLI-based agents that call multiple external services have their own credential management surface. The credential is stored differently and accessed differently; the risk of credential exposure through prompt injection or misconfiguration is structurally the same.

"The question is not which protocol the tool call travels over," says the Repello AI Research Team. "The question is whether the content that comes back from the tool gets inspected before it reaches the model. That question has the same answer regardless of whether you're running MCP, CLI, or direct REST."

Is this the end of MCP?

No. But it is a meaningful correction of the narrative that MCP is the universal standard for AI agent tool integration.

MCP will likely consolidate around the use cases it was designed for: local tool discovery, IDE integrations, developer environments, and cases where dynamic capability enumeration adds genuine value. For these use cases, the context overhead is acceptable and the discoverability benefit is real. The continued support for MCP in Claude Desktop, Cursor, and VS Code reflects this.

For production-scale enterprise deployments with fixed tool sets, known APIs, and throughput requirements, direct API integration or CLI-based approaches will continue to be the practical choice. Perplexity's Agent API, which replaced their MCP-heavy internal architecture, is the production pattern: a single authenticated endpoint, controlled tool execution, model selection on the caller's side.

Cloudflare's code-generation approach represents a third path: keep MCP for discovery and connection management, but replace its tool-calling mechanism with generated code that executes against a pre-authorized client. This gets the token efficiency of direct API calls while retaining MCP's tool discovery model. Their conclusion that this also improves security by preventing the model from handling raw API credentials directly is worth noting: the pre-authorized client pattern means injection into the model's context cannot yield credential extraction.

What this means for enterprise security teams

For security engineers evaluating agentic AI deployments, the MCP vs CLI debate should change one thing: the scope of your attack surface assessment.

If your team built its threat model around MCP-specific risks (tool poisoning via malicious server definitions, JSON-RPC manipulation, schema injection), that model is incomplete if the organization also runs CLI-based agents. The injection, privilege escalation, and audit logging requirements apply to all agentic tool integrations regardless of transport. The NIST AI Risk Management Framework (AI RMF 1.0) treats agentic tool access as a continuous monitoring requirement, not a protocol-level configuration.

The practical implication: your red team test plan needs to cover tool response injection across every tool integration path, not just MCP servers. A CLI subprocess that returns web-scraped content is an injection surface. A REST API response that includes attacker-controlled strings is an injection surface. Protocol diversity increases the surface you need to cover, not the surface you need to worry about per protocol.

ARGUS enforces content inspection at the model context layer rather than the protocol layer, which means its coverage does not change when the underlying tool transport changes from MCP to CLI or REST. ARTEMIS runs injection probes across all tool integration paths in the test plan, flagging which response channels have been tested and which remain uncovered regardless of their transport mechanism.

The protocol debate will continue. The attack surface does not care which side wins.

Frequently asked questions

Why is Perplexity moving away from MCP?

Perplexity co-founder and CTO Denis Yarats announced at the Ask 2026 developer conference that MCP's tool schema overhead consumes up to 72% of available context window space before the agent processes any user input. Combined with authentication complexity across multiple MCP servers and the fact that most MCP features go unused in production, the protocol adds cost and latency without proportionate value for Perplexity's use case. Their replacement is a direct Agent API with a single authenticated endpoint and internal tool execution.

Is MCP dead?

No. MCP remains well-suited for its original design purpose: dynamic tool discovery in local environments, IDE integrations, and developer tooling where capability enumeration adds genuine value. The context overhead that makes MCP impractical for production-scale deployments with fixed tool sets is acceptable in these use cases. What is ending is the assumption that MCP is the universal standard for all AI agent tool integration.

Is CLI more secure than MCP?

Not inherently. The primary security risks in agentic tool use (prompt injection through tool responses, privilege escalation through overpermissioned tools, and insufficient audit logging) apply equally to CLI-based and MCP-based integrations. Both deliver attacker-influenced content to the model's context window; both require least-privilege tool design, input/output inspection, and sandbox isolation. The transport protocol does not determine the security posture.

What is Cloudflare's Code Mode and how does it relate to MCP?

Cloudflare's Code Mode replaces MCP's tool-calling mechanism with code generation: rather than exposing API endpoints as individually described MCP tools (requiring up to 244,000 tokens for 2,500 endpoints), the model writes code against a pre-authorized API client, covering the same functionality in approximately 1,000 tokens. The pre-authorized client pattern also prevents the model from handling raw API credentials directly, reducing credential exposure risk from prompt injection.

Does switching from MCP to CLI reduce prompt injection risk?

No. Prompt injection through tool responses is determined by whether attacker-influenced content can enter the model's context window through any tool output, not by which protocol delivers that content. A CLI tool that fetches a web page, reads a file, or queries an external API returns content that can contain adversarial instructions regardless of transport. Effective defense requires inspecting all tool response content before it reaches the model, applied equally across MCP, CLI, and REST integrations.

What should enterprise security teams do in response to this debate?

Extend your threat model to cover all tool integration paths, not just MCP servers. If your organization runs both MCP-based and CLI-based agents, your red team test plan and runtime monitoring should cover tool response injection across all channels. Protocol diversity does not reduce your attack surface; it increases the surface you need to explicitly test and monitor.

Share this blog

Subscribe to our newsletter

Repello tech background with grid pattern symbolizing AI security
Repello tech background with grid pattern symbolizing AI security
Repello AI logo - Footer

Sign up for Repello updates
Subscribe to our newsletter to receive the latest insights on AI security, red teaming research, and product updates in your inbox.

Subscribe to our newsletter

8 The Green, Ste A
Dover, DE 19901, United States of America

Follow us on:

LinkedIn icon
X icon, Twitter icon
Github icon
Youtube icon

© Repello Inc. All rights reserved.

Repello tech background with grid pattern symbolizing AI security
Repello AI logo - Footer

Sign up for Repello updates
Subscribe to our newsletter to receive the latest insights on AI security, red teaming research, and product updates in your inbox.

Subscribe to our newsletter

8 The Green, Ste A
Dover, DE 19901, United States of America

Follow us on:

LinkedIn icon
X icon, Twitter icon
Github icon
Youtube icon

© Repello Inc. All rights reserved.