The MCP attack surface gets measured, the agent deployment gap splits in two, and buyers move to model portfolios · The Lab

Three signals from this week. New scans put hard numbers on the MCP exposure problem for the first time. Fresh deployment data shows coding agents shipping while horizontal agents stall. And the model conversation shifts from one best model to a portfolio matched to the job.

> ../signals/2026-06-06.md

── Signal one · MCP exposure stops being a theory ──

For a year the MCP security conversation ran on anecdotes and proof-of-concept exploits. This week it ran on measurements. Internet scans now count more than twelve thousand publicly reachable MCP services, and a separate study of remote servers found that roughly four in ten expose their tools with no authentication at all. On the vulnerability side, an automated sweep of around forty thousand server repositories produced dozens of fresh CVEs, and a security vendor disclosed multiple flaws in database-backed MCP servers, at least one of which shipped without a patch.

The number that should move you is the authentication one. An unauthenticated remote MCP server is not a future risk. It is an open tool endpoint that any agent, or any attacker pointing an agent at it, can call today. This pairs with the NSA design guidance from late May, which treats MCP as flexible and underspecified, with security left to implementers, much like the early web protocols.

The practical implication: you can no longer treat the MCP servers your agent boots with as trusted infrastructure. Their tool definitions enter your agent's context with instruction-level authority before any human reviews them. Audit which servers your agents load, require authentication on every remote one, and pin each server to a specific version that you check against a CVE feed.
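
A minimal sketch of the first two checks plus the pinning half of the third, in Python. The `McpServer` record, its fields, and the allowlist are assumptions made for illustration. They are not part of any MCP SDK or configuration format, so map them onto however your agents actually declare their servers. The CVE feed lookup is left out because it depends on the feed you subscribe to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class McpServer:
    """Hypothetical inventory record for one MCP server an agent loads."""
    name: str
    version: Optional[str]       # exact pinned version; None or "latest" means floating
    url: Optional[str] = None    # set for remote servers, None for local ones
    auth: Optional[str] = None   # e.g. "oauth" or "api_key"; None means unauthenticated


def audit(servers, allowlist):
    """Return (server name, problem) pairs for every check a server fails."""
    findings = []
    for server in servers:
        if server.name not in allowlist:
            findings.append((server.name, "not on the approved list"))
        if server.url is not None and server.auth is None:
            findings.append((server.name, "remote server without authentication"))
        if server.version in (None, "latest"):
            findings.append((server.name, "version not pinned"))
    return findings


# Example inventory; names and URL are placeholders.
servers = [
    McpServer("docs", "1.4.2"),
    McpServer("db", "latest", url="https://mcp.example.com"),
    McpServer("mystery", "0.3.0"),
]

for name, problem in audit(servers, allowlist={"docs", "db"}):
    print(f"{name}: {problem}")
```

Run something like this in CI or at agent startup, and fail closed: a server with any finding does not get loaded.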

── Signal two · The deployment gap splits in two ──

Two data releases landed close together and, read side by side, they explain a contradiction teams keep running into. One industry hype cycle puts agentic AI deployment at a mid-teens percentage of organizations, with a strong majority intending to deploy within two years. A separate state-of-agents report puts coding agents at the opposite end: the large majority of organizations have moved coding agents past experimentation and into production code, with enterprises leading.

So which is it, mid-teens or large majority? Both, and the gap between them is the story. Vertical, bounded agents with a clear input and a measurable output are shipping. Coding is the clearest case: the task is well defined, the output is testable, the failure is visible and cheap to catch. Horizontal "do business operations" agents are the ones stuck in the mid-teens, because the workflow was never mapped, the success metric was never defined, and the failure mode is silent.

The practical implication: if your agent program is stalled, the bottleneck is almost never the model. It is that you pointed an agent at an undefined workflow. The teams shipping picked a bounded task with a testable output first.

── Signal three · From one best model to a portfolio ──

The model conversation shifted this week from "which model is best" to "which model for which job." Frontier general models, agentic-tuned models, dedicated coding models, and low-cost high-volume models are now distinct categories, and serious teams are running a portfolio rather than a single default. Buyers increasingly judge models on repository-level coding, tool use, long-context retrieval, and speed per dollar, not on a single headline benchmark.

The practical implication for production: stop standardizing on one model for everything. Use a strong reasoning model where a wrong action is expensive, a coding-tuned model in the build loop, and a cheap fast model for high-volume low-stakes output such as routine notifications. Route by task class. The cost difference between the right and wrong model for a job is now large enough to show up in your monthly bill.
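
Routing by task class can be as small as a lookup table. The sketch below is in Python; the task classes mirror the three cases above, and the model names are placeholders rather than real products. Sending unmapped task classes to the strongest model is one possible design choice, made here on the assumption that an unclassified task may be an expensive one to get wrong.

```python
from enum import Enum


class TaskClass(Enum):
    """Hypothetical task classes; define your own from your workload."""
    HIGH_STAKES_ACTION = "high_stakes_action"   # a wrong action is expensive
    CODING = "coding"                           # work inside the build loop
    BULK_LOW_STAKES = "bulk_low_stakes"         # e.g. routine notifications
    UNCLASSIFIED = "unclassified"               # nothing mapped yet


# Placeholder model identifiers, not real model names.
DEFAULT_MODEL = "reasoning-model"
ROUTES = {
    TaskClass.HIGH_STAKES_ACTION: "reasoning-model",
    TaskClass.CODING: "coding-model",
    TaskClass.BULK_LOW_STAKES: "fast-cheap-model",
}


def route(task_class: TaskClass) -> str:
    """Pick a model for a task class, falling back to the strongest model."""
    return ROUTES.get(task_class, DEFAULT_MODEL)


for task_class in TaskClass:
    print(f"{task_class.value} -> {route(task_class)}")
```

Keeping the table in one place also gives you a single spot to attach per-class cost tracking later.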

── What to do with this ──

◆ Signal I: List every MCP server your agents load. Remove the ones you cannot account for, require auth on every remote server, and subscribe to a CVE feed for each dependency. Treat tool definitions as untrusted input.

◆ Signal II: If a program is stalled, audit the workflow before the model. Pick one bounded task with a testable output and ship that first.

◆ Signal III: Build a model routing table by task class. The right model per job, not one default model everywhere.

── End of signal ──

◆ MCP exposure is now measured, not theoretical. Roughly four in ten remote servers have no authentication.

◆ The deployment gap is a workflow gap. Bounded agents ship. Undefined ones stall.

◆ The era of one best model is over. Route by task class.

ORBIRESEARCH