What Anthropic Actually Shipped at Code with Claude 2026
The May 6 SF developer event delivered Claude Managed Agents, Remote Agents for Claude Code, and the clearest signal yet on where Anthropic is heading.
If you read our preview guide before the event, you were watching for a new Sonnet model, some Claude Code updates, and maybe a product direction signal. Yesterday’s event delivered on all three — plus one launch that’s more significant than its name suggests.
Here’s what you need to know, without the developer jargon.
Claude Managed Agents: the headline launch
The biggest announcement wasn’t a new model. It was Claude Managed Agents, a framework that lets teams deploy coordinated fleets of AI agents that can work together on complex, long-running tasks. Anthropic described it as “10x faster shipping” with best practices baked in.
The three capabilities that matter:
Multi-agent orchestration lets you define a task and have multiple Claude agents split the work, coordinate with each other, and pool results. Think of it as the difference between asking one person to build a feature and assigning it to a small team that divides and conquers.
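For the technically curious, the fan-out-and-pool idea can be sketched in a few lines of plain Python. To be clear: Anthropic didn't show the Managed Agents API on stage, so everything below — `run_agent`, `orchestrate`, the subtask names — is a hypothetical stand-in for the pattern, not the real interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a real Claude agent call; the actual Managed Agents
# API was not shown, so each "agent" here is a stubbed function.
def run_agent(subtask: str) -> str:
    return f"result for {subtask}"

def orchestrate(task: str, subtasks: list[str]) -> dict:
    # Fan out: each agent takes one slice of the work in parallel.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        results = list(pool.map(run_agent, subtasks))
    # Pool results: a coordinator step would normally merge or
    # reconcile these before reporting back.
    return dict(zip(subtasks, results))

merged = orchestrate(
    "build the billing feature",
    ["write the API endpoint", "write the tests", "update the docs"],
)
```

The point of the sketch is the shape, not the code: one task, several workers, one merged result.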
Outcomes is a new primitive that lets you specify success criteria rather than just instructions. Instead of telling Claude how to do something, you tell it what done looks like — and the agent iterates until it gets there. This is a meaningful shift for anyone who’s spent time in loop-prompt-review-loop cycles. You set the target once and check back when it’s done.
Dreaming is the most experimental of the three. It lets Claude agents inspect logs from their previous work sessions, identify patterns in where they got stuck or made mistakes, and adjust their approach accordingly. In practice it means an agent that ran badly on a task last Tuesday will run somewhat differently on a similar task next Tuesday. Whether “somewhat” becomes “meaningfully” over time is the open question.
For non-technical founders: Managed Agents aren’t directly available inside Lovable or Bolt yet. But the tools you use are built on Claude, and Anthropic’s frameworks tend to get absorbed into the ecosystem within a few months. The Outcomes capability in particular is likely to show up as a feature in consumer-facing tools before the end of the year.
Claude Code updates: Code Review and Remote Agents
Claude Code got two additions worth knowing about.
Code Review is now available as a built-in command inside Claude Code. Anthropic says it’s used by every team internally. In practice it means you can hand Claude Code a pull request diff and get a structured review — not just “looks fine” but issues flagged by type (logic errors, security concerns, test coverage gaps). Third-party developers who’ve tested it say the quality is similar to asking a competent senior developer to do a quick pass, not a thorough audit. Useful for fast-moving teams; not a substitute for real security review on anything customer-facing.
Remote Agents does something genuinely useful: it lets you control an active Claude Code session on your laptop from your phone. If you’re the kind of founder who starts a build at your desk and wants to check in without interrupting the agent’s flow, this solves a real annoyance. The use case sounds minor, but in practice it’s the difference between “I have to be at my desk to manage this” and “I can kick off a build, step away, and steer it from anywhere.”
The Sonnet 4.8 situation
Anthropic did not formally announce Sonnet 4.8 at the event — but they also didn’t need to. Details from Anthropic’s internal source code had already circulated online in the days before the event, and they’re credible enough to be worth knowing. The key improvements in the next Sonnet model:
- Vision accuracy approaching 98% — a meaningful jump from Sonnet 4.6, which means better performance on tasks involving screenshots, design files, and UI interpretation
- +12 points on coding benchmarks — consistent with what Anthropic typically ships between Sonnet versions
- X-high effort level — a new compute tier that lets the model spend more “thinking time” on harder tasks, at higher cost
For the tools you use: if Sonnet 4.8 lands this week or next, Lovable, Bolt, and Replit Agent will pick it up automatically (they route to the latest Sonnet without requiring user action). Cursor users will see a new option in the model picker. Claude Code users can set the model via the --model flag or in settings.
Don’t obsess over the benchmarks. What matters is whether the tool produces better output on your tasks. Give it a week after it launches, then try the same prompt you used last week and see if the result is different.
The advisor strategy: Opus coaching Sonnet
One technical detail worth filing away: Anthropic showed benchmarks demonstrating that routing a Sonnet agent to consult Opus on difficult decisions — rather than running Opus for the whole task — produces better results than Sonnet alone at lower cost than Opus alone. They’re calling this the “advisor strategy.”
In plain terms: the smart, expensive model acts as a consultant. The fast, cheaper model does the work. The consultant weighs in when the worker gets stuck. That pattern is now officially supported in Claude’s API, which means tool developers can implement it natively. Expect to see it show up in Lovable, Cursor, and others as a “quality mode” or similar over the coming months.
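If you want to see the shape of the pattern, here’s a minimal sketch. The routing logic and the confidence threshold are my own illustration — both model calls are stubbed functions, not the Anthropic API, and real implementations would decide when to escalate differently.

```python
# Sketch of the "advisor" pattern: a cheap worker model handles each
# step, escalating to an expensive advisor only when it is unsure.
# Both model calls are stubbed; the confidence scores are illustrative.

def worker_model(step: str) -> tuple[str, float]:
    # Returns (answer, confidence). A real call would hit Sonnet.
    if "tricky" in step:
        return ("not sure", 0.3)
    return (f"done: {step}", 0.9)

def advisor_model(step: str) -> str:
    # A real call would hit Opus, only for escalated steps.
    return f"advisor resolved: {step}"

def run_step(step: str, escalate_below: float = 0.5) -> str:
    answer, confidence = worker_model(step)
    if confidence < escalate_below:
        return advisor_model(step)   # consultant weighs in
    return answer                    # worker handles it alone
```

The economics follow from the structure: you pay Opus prices only on the small fraction of steps where the worker is stuck, which is why the combination beats Sonnet alone on quality and Opus alone on cost.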
What this means for how you build
The overall direction from yesterday’s event is clear: Anthropic is building toward orchestrated, long-running, autonomous agents that can handle real engineering work while a human sets objectives and checks results. That’s a different bet from what Cursor is making (code editor with powerful AI assistance) and from what Lovable is making (accessible MVP builder with managed infrastructure).
For non-technical founders, none of this changes what you should do today. Lovable and Bolt are still the fastest way to a working product. The Claude layer they sit on is getting better with each model release, and you benefit from that automatically. What does change is the medium-term trajectory: within 12-18 months, the gap between “vibe coding” and “autonomous software engineering” is going to shrink considerably. The tools will do more with less supervision. That’s worth knowing when you think about how much to invest in learning the current generation of workflows.
The one thing worth following up on
Claude Managed Agents is available in the API today, which means any developer building with Claude can access it now. If you have a technical co-founder or developer on your team, point them at the Anthropic documentation and ask whether Managed Agents could replace any of the repetitive agentic workflows you’ve been building manually. The Outcomes feature in particular is designed for exactly the kind of “keep trying until this acceptance criterion passes” loop that gets tedious to manage by hand.
For the rest of you: watch what Lovable and Bolt ship over the next few months. The features from yesterday’s event will show up there, with consumer-friendly names and no API required.