Scale · founder · 8 min read
GPT-5.5 is here: what founders and PMs actually need to know
OpenAI's April 23 GPT-5.5 launch changes the model stack behind every major vibe coding tool. Here's the practical read for non-technical builders.
OpenAI shipped GPT-5.5 today — April 23, 2026 — and within three hours the timeline was full of benchmark screenshots, API price complaints, and “Claude is cooked” / “GPT is cooked” takes in roughly equal measure. If you’re a founder or PM who builds with vibe coding tools but doesn’t spend your day reading model cards, you don’t need all of that. You need four things: what actually changed, how it shows up in the tools you use, what it changes about your tool stack, and what to do this week.
This is a fast read. No benchmark theater. Just what’s useful.
What actually changed
GPT-5.5 is the first fully retrained base model from OpenAI since GPT-4.5. That phrase does a lot of work. The intermediate releases — 5.1, 5.2, 5.3, 5.4 — were tuned versions of the same underlying model. 5.5 is a new one, trained from scratch on new infrastructure, and it scores materially higher on the benchmarks OpenAI cares about.
Two numbers are worth pulling out. On Terminal-Bench 2.0 — which tests whether a model can drive a command line, plan multi-step tasks, and recover from errors — GPT-5.5 hits 82.7%. That’s state-of-the-art, and it beats Claude Opus 4.7 (Anthropic’s current flagship) at 69.4%. On SWE-Bench Pro — which tests whether a model can resolve real GitHub issues in real repos — GPT-5.5 scores 58.6%. That’s behind Claude Opus 4.7 at 64.3%.
Read those two numbers together and the story is: OpenAI now has the lead on agentic workflows, Anthropic still has the lead on writing real code inside real codebases. That’s the split that matters, and it’s likely to hold until the next Anthropic release.
The other practical change is Codex. GPT-5.5 inside Codex can now drive a browser — click through pages, run web apps, capture screenshots, test flows — in addition to the local computer use that shipped on April 16. OpenAI is packaging Codex as a general agent that happens to know how to code, not a coding agent that happens to use a computer. If you were paying attention to the trajectory of these tools, this was the inevitable next step.
How it shows up in the tools you’re using
If you use ChatGPT on Plus, Pro, Business, or Enterprise, GPT-5.5 is in your model picker starting today. If you’re on Pro ($200/mo), you also get GPT-5.5 Pro. Free and Go tiers are not getting 5.5 yet; they’re still on 5.4, and may flip over once rollout completes.
If you use Codex (the desktop or CLI product), GPT-5.5 is the new default for most tasks. There’s a “Fast” mode that runs 1.5x faster for 2.5x the cost. Paid Codex plans through May 31 get 2x usage as a rollout bonus — 10x usage on Pro instead of the standard 5x.
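The Fast-mode math is worth doing once before you turn it on. Using only the figures above (1.5x faster, 2.5x the cost), here’s a back-of-envelope sketch; the hourly-rate framing and the `fast_mode_worth_it` helper are our own illustration, not anything OpenAI publishes:

```python
# Back-of-envelope: when is Codex "Fast" mode worth paying for?
# The only inputs from the release notes: 1.5x faster, 2.5x the cost.
SPEED_MULTIPLIER = 1.5
COST_MULTIPLIER = 2.5

# You pay roughly 1.67x per unit of speed gained.
cost_per_speed = COST_MULTIPLIER / SPEED_MULTIPLIER
print(f"Fast mode cost per unit of speed: {cost_per_speed:.2f}x")

def fast_mode_worth_it(task_minutes: float, hourly_rate: float,
                       standard_cost: float) -> bool:
    """True if the dollar value of the minutes saved beats the extra spend.

    task_minutes:  how long the task takes in standard mode
    hourly_rate:   what your time is worth, $/hour (your own assumption)
    standard_cost: what the task costs in standard mode, $
    """
    minutes_saved = task_minutes * (1 - 1 / SPEED_MULTIPLIER)
    time_value = minutes_saved / 60 * hourly_rate
    extra_cost = standard_cost * (COST_MULTIPLIER - 1)
    return time_value > extra_cost

# Example: a 30-minute task costing $2 in standard mode,
# with your time valued at $100/hour.
print(fast_mode_worth_it(30, 100, 2.00))  # True: 10 min saved > $3 extra
```

The rough takeaway: Fast mode pays off on cheap tasks where you’re actively waiting, and rarely on expensive background jobs you’d walk away from anyway.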
If you use Cursor, Windsurf, Claude Code, or Copilot, GPT-5.5 will be selectable as a model within days, if it isn’t already. Apart from Cursor’s Composer 2, none of these tools has its own foundation model; they all resell access to frontier models from OpenAI, Anthropic, and increasingly Google. When OpenAI ships a new flagship, every tool picks it up, sometimes same-day, sometimes within a week.
If you use Lovable, Bolt, Replit, or Base44, you won’t directly see GPT-5.5 anywhere. These tools make their own model routing decisions and don’t expose them as a user-facing setting. Bolt went to Sonnet 4.6 earlier this month. Lovable is on Opus 4.7. Expect at least one of these to add GPT-5.5 to their routing mix within two weeks, likely as part of a quietly shipped upgrade that gets announced after the fact.
The API price for GPT-5.5 is 2x GPT-5.4. That will eventually show up somewhere — in your Cursor bill, in a tool’s “credit” consumption rate, or in a pricing page update. Don’t expect tools to absorb the increase.
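If you want to sanity-check what a 2x price bump means for your own usage, the arithmetic is simple. The per-million-token prices below are hypothetical placeholders (GPT-5.4’s actual rates aren’t quoted here); only the 2x multiplier comes from the release:

```python
# What a 2x API price increase does to a monthly bill.
# The per-token prices are HYPOTHETICAL placeholders; only the
# 2x multiplier is from the GPT-5.5 release.
GPT_55_MULTIPLIER = 2.0

# Assumed GPT-5.4 rates, $ per 1M tokens (illustrative only).
old_input_price = 1.25
old_output_price = 10.00

new_input_price = old_input_price * GPT_55_MULTIPLIER
new_output_price = old_output_price * GPT_55_MULTIPLIER

def monthly_cost(input_mtok: float, output_mtok: float,
                 input_price: float, output_price: float) -> float:
    """Dollar cost for a month of usage; token counts in millions."""
    return input_mtok * input_price + output_mtok * output_price

# Example: a tool pushing 40M input / 8M output tokens a month.
before = monthly_cost(40, 8, old_input_price, old_output_price)
after = monthly_cost(40, 8, new_input_price, new_output_price)
print(f"${before:.2f} -> ${after:.2f}")  # $130.00 -> $260.00
```

Since both input and output rates scale by the same factor, the whole bill doubles regardless of your input/output mix; the only open question is who eats it, you or the tool.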
What it changes about your tool stack
For most non-technical founders and PMs, GPT-5.5 doesn’t change your tool choices. It changes which model you pick inside the tool you already use. The cases where it might actually move you:
If you’re using Codex occasionally and considering a deeper commitment. The browser use and the base of 3M weekly users make Codex a real alternative to Cursor for founders who don’t code but want to run agents. The path is: start with a Plus subscription ($20/mo), use Codex on your desktop to run multi-step workflows across apps you already use, and don’t worry about writing any actual code. See our OpenAI Codex review for the honest take on whether that holds up.
If you’re using Cursor and you’ve been annoyed at model limits. Cursor now has Composer 2 in Auto mode, Opus 4.7 for serious codebase work, and GPT-5.5 for multi-step agentic tasks. That’s a genuinely strong lineup. If your complaint was “I don’t like what Cursor picks for me” — the manual model selection got better this week.
If you’re evaluating Claude Code vs. Codex vs. Cursor. Two weeks ago the honest answer was “Claude Code for careful codebase work, Cursor for IDE productivity, Codex for one-off agent tasks.” Today’s answer is similar but the gap between Codex and Claude Code has narrowed considerably. For founders evaluating these tools from scratch, it’s now a real three-way decision rather than a Claude Code default with Codex as a fallback. See our Claude Code vs. Cursor and Devin vs. Claude Code breakdowns.
If you’re in the “which model is smartest” rabbit hole. Stop. The practical answer for the next three months is: Claude Opus 4.7 for anything that touches a real codebase, GPT-5.5 for anything that runs a workflow across tools. Everything else is noise until Anthropic ships what they’re testing internally as “Mythos.”
The browser-use detail nobody’s talking about loudly enough
Codex with GPT-5.5 can now operate a browser inside the Codex desktop app — not as a separate tool, not as an MCP plugin, but as a first-class capability. That means Codex can: test a landing page you just deployed by actually clicking through it, log into your Stripe dashboard and pull a report, run through your own signup funnel to verify it works, capture screenshots of bugs on your live site. For a non-technical founder, this is a genuinely useful capability — more useful for day-to-day operations than a lot of the “this code is slightly better now” benchmark wins.
The caveat: this is a computer-use feature. Read our AI code security guide before you give any agent login credentials to anything customer-facing. Session tokens leak. Agents misclick. Browser automation is risky precisely because it works, including when it shouldn’t.
What to do this week
Three specific moves.
One: try GPT-5.5 inside the tool you already use. If you’re on ChatGPT Plus, switch to GPT-5.5 for your next three serious tasks and notice where it feels different. The gains are most obvious on multi-step work — anything where the old model would have bailed halfway through.
Two: if Codex has been a “I should try this” item on your list, try it now. The browser use plus the GPT-5.5 upgrade is the first time Codex is clearly worth an afternoon of exploration for non-coders. The Plus plan ($20/mo) already includes it.
Three: don’t change your build stack yet. If you’re shipping with Lovable, Bolt, Replit, or Base44, don’t switch tools because of a model release. Wait to see which of them adds GPT-5.5 to their routing in the next two weeks. The tool that gets good use of the new model will make its own case.
The short version of all of this: frontier models are now shipping every six to eight weeks, and the winner rotates. Pick tools that are good at absorbing whichever model is best this month. That’s the durable bet. The individual model release isn’t.