Two 'Managed Agents,' same name, opposite trust models — and coding cost just dropped by 10x, The Lab

Google and Anthropic shipped identical product category names on the same day with opposite infrastructure postures. Cursor Composer 2.5 landed at one-tenth the frontier price. The Big Four accounting firms completed their AI stack alignment. Three signals from the most consequential week in agent engineering so far.

> ../signals/2026-05-23.md

—— Signal one — Same name, opposite trust posture ——

On May 19, both Google and Anthropic shipped what each called "Managed Agents." The product category name is identical. The architecture is not.

Google's Managed Agents provision a remote Linux execution environment in a single API call. The agent loop and all tool execution run on Google infrastructure. You configure via AGENTS.md and SKILL.md. Default agent ID: antigravity-preview-05-2026. One call, fully hosted, zero footprint on your side.

Anthropic's self-hosted sandboxes run the agent loop on Anthropic infrastructure but route tool execution through the customer's own perimeter — Cloudflare, Daytona, Modal, or Vercel microVMs. The tool execution stays inside your network. The model is outside it.

This is the most important and least-covered story of the week. Both approaches work. But the choice between them is an architectural decision about where your trust boundary sits: at the model layer, or at the tool layer.

Production teams that haven't made this decision explicitly have made it implicitly. Make it explicit.

—— Signal two — Coding cost dropped by 10x. The budget math changes ——

Cursor shipped Composer 2.5 on May 18 at $0.50 input / $2.50 output per million tokens. Opus 4.7 is $5.00 input. GPT-5.5 is $5.00 input. Composer 2.5 scores 79.8% on SWE-Bench Multilingual — 0.7 points behind Opus 4.7's 80.5% on the same benchmark.

A model that scores within 1% of the frontier on coding benchmarks and costs one-tenth as much is not a curiosity. It is the new baseline for agent coding budgets.

The practical implication: if you are sizing an agentic coding system today and benchmarking cost at frontier model prices, your projections are an order of magnitude too conservative. The cost compression ladder from $0.50 to $5.00 across six current models is the canonical reference for the rest of Q2 2026.

—— Signal three — The Big Four alignment is complete ——

EY and Microsoft announced a $1 billion global AI initiative on May 21. That completes the picture: PwC, KPMG, and Deloitte are aligned with Claude (via direct Anthropic deals and Microsoft/Anthropic commercial agreements). EY aligned with Microsoft Copilot directly.

Every one of the four largest professional services firms is now committed to a specific AI stack.

This is not an opinion signal. It is an infrastructure signal. When the firms that audit, consult, and advise the Fortune 500 pick a stack, their enterprise clients follow — not immediately, but inevitably. The procurement question for enterprise AI shifted this week from "which stack should we evaluate?" to "which stack did our advisors already standardize on?"

—— What to do with this ——

Three things, in order:

◆ One — Decide your trust boundary. Hosted agent loop (Google posture) or hosted model, customer-perimeter tools (Anthropic posture)? Both are valid. Neither is the default answer. Write the decision down.

◆ Two — Reprice your agentic coding budget. Composer 2.5 is not the last compression event. If your system prompt, tool contracts, and agent context are portable, you can switch model backends as pricing drops. Make portability a first-class requirement now.

◆ Three — Watch the Big Four stack for signal, not advice. When PwC builds a workflow on Claude and EY builds the same workflow on Copilot, the resulting comparison data will be the most rigorous real-world agent benchmark the industry has produced. It will surface in audit reports, case studies, and procurement templates over the next 18 months. Pay attention.

—— End of signal ——

◆ The trust boundary decision is architecture, not procurement.

◆ Cost compression is structural, not promotional. Plan for it.

The Big Four alignment is an infrastructure signal. Treat it as one.

— ORBIRESEARCH