Claude Sonnet 5 Launch: Cheaper, Smarter, and Finally Agentic

Anthropic launched Claude Sonnet 5, its most agentic mid-tier model. Here's what changed, what it costs, and why it matters for non-technical builders.

Written by

Bhavyadeep

Reviewed by

Sakthy

Last updated:

July 1, 2026

min read

Table of Contents

Heading

Anthropic launched Claude Sonnet 5 on June 30, bringing near-flagship AI performance to the model tier most people actually use every day. Here's what changed, what it costs, and why non-technical builders should care.

Why the Sonnet upgrade matters

The AI headlines this year have mostly been about the biggest, most expensive models. Fable 5 and its restricted sibling Mythos 5 dominated the news cycle for weeks, first for their capabilities, then for the U.S. government export controls that pulled them offline. For most people building real products, though, those models were either too expensive, too restricted, or both.

Sonnet is the tier that most Claude users interact with daily. It is the default model on free and paid plans, the one that powers most coding sessions, and the one that app-building platforms route the majority of their work through. A better Sonnet means the apps, automations, and workflows built on top of it get more capable overnight, without users changing anything on their end.

Claude Sonnet 5 is now the default model across free and paid plans, succeeding Sonnet 4.6. The company calls it the most agentic Sonnet it has ever shipped, a model that can plan multi-step tasks, use tools like browsers and terminals, and run autonomously at a level that recently required larger, more expensive models. The pitch is straightforward: near-Opus-4.8 performance at a fraction of the cost.

What it can actually do

Sonnet 5 improves across three areas that matter most for daily use: raw benchmark performance, agentic reliability on multi-step tasks, and safety. Here is what the numbers and early testing show in each.

Benchmarks: closing the gap to Opus

Anthropic's core claim is that Sonnet 5 closes most of the gap to Opus 4.8, its flagship model, while staying at Sonnet-tier pricing. The benchmarks tell that story clearly.

Source

On SWE-bench Pro, a coding evaluation that tests models against real-world software engineering tasks, Sonnet 5 scores 63.2%. That is up from Sonnet 4.6's 58.1%, and within striking distance of Opus 4.8 at 69.2%.

On Terminal-Bench 2.1, which measures a model's ability to operate inside a command-line environment, Sonnet 5 scores 80.4%, up from Sonnet 4.6's 67.0%, and actually surpasses Opus 4.8's 74.6%. On the GDPval-AA v2 knowledge work benchmark, Sonnet 5 edges past Opus 4.8 with a score of 1,618 to Opus's 1,615.

Agentic capability: finishing what it starts

The most important shift is not in any single score, though. It is in what Anthropic calls "agentic" capability, the model's ability to take a multi-step task and actually finish it without stalling partway through.

Previous Sonnet models had a known habit of stopping short. They could start a complex job, plan the first few steps, and then either ask for permission to continue or quietly drop part of the task. Sonnet 5, according to Anthropic and its early testers, pushes through. It also checks its own output without being explicitly asked to, catching errors and verifying results before handing work back.

Source

Anthropic's BrowseComp evaluation, which measures agentic search performance, puts numbers to this. At medium effort, Sonnet 5 hits roughly 72% pass rate at around $4 per task. Sonnet 4.6 needs to run at high effort and spend over $20 per task just to reach a similar 75%. Even at low effort, Sonnet 5 (60% pass rate, under $2 per task) already closes in on what Sonnet 4.6 achieves at low effort for roughly $12. At the top end, Sonnet 5 at max effort actually edges past Opus 4.8: ~85% vs ~84%, and at a lower cost per task (~$22 vs ~$27).

One caveat: the chart reflects standard pricing ($3/$15 per million tokens). With introductory pricing ($2/$10) running through August 31, Sonnet 5's real cost per task is even lower than shown.

Safety and guardrails

On safety, Anthropic reports that Sonnet 5 shows lower rates of hallucination and sycophantic behavior than its predecessor. It is better at refusing malicious requests and resisting prompt injection attacks.

Source

It ships with real-time cybersecurity safeguards enabled by default, though these are lighter than the guardrails applied to the higher-tier Fable 5. Anthropic says Sonnet 5 was not deliberately trained on cybersecurity tasks and poses substantially less cyber risk than its Opus models.

What early testers are reporting

The benchmark numbers are one thing. What early-access partners describe in practice is more useful.

A senior engineer at Zapier described handing Sonnet 5 a two-part automation job: updating Salesforce account tiers, then sending a launch announcement to enterprise contacts. Previous models stalled halfway. Sonnet 5 finished the full task end to end. For day-to-day automation, the engineer called it "a no-brainer."

At CodeRabbit, the AI code review platform, testers ran Sonnet 5 against dozens of challenging pull requests. Their conclusion: the model carried each one through to a tested, verified result on its own, freeing engineers to focus on judgment and sign-off rather than babysitting the process. CodeRabbit's review noted that Sonnet 5 tends to write tests first, build the feature on top of them, and then run everything once it thinks it's done. When a bug was reported, the model wrote a reproducing test, implemented the fix, then stashed the fix to confirm the bug returned without it, all in a single pass.

TechCrunch framed the launch as confirmation that agentic capability is now the baseline expectation at every model tier. Google's Gemini 3.5 Flash made a similar mid-tier agentic pitch at I/O in May, and even OpenAI's flagship GPT-5.6 Sol, previewed days earlier, leads with agentic capability as its headline feature. The differentiator going forward is not whether a model can do agentic work, but how cheaply and reliably it can do it without human oversight.

What it costs, and the pricing window to know about

Sonnet 5 is available across every Claude plan: it is now the default model for Free and Pro users, and available to Max, Team, and Enterprise subscribers. It is also live in Claude Code and on the Claude Platform API.

Through August 31, 2026, Anthropic is offering introductory pricing at $2 per million input tokens and $10 per million output tokens. After that, it moves to standard pricing at $3 per million input tokens and $15 per million output tokens.

For context, Opus 4.8 costs $5 per million input tokens and $25 per million output tokens. That makes Sonnet 5 roughly 60% cheaper at introductory rates and still significantly cheaper at standard pricing. TechCrunch noted that Sonnet 5 is also cheaper than OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro, though still more expensive than Google's Gemini 3.5 Flash.

One detail worth knowing: Sonnet 5 uses an updated tokenizer that can map the same text to roughly 1.0 to 1.35 times more tokens than Sonnet 4.6, depending on the content. Anthropic says the introductory pricing is set to make the transition roughly cost-neutral, but if you are comparing costs at scale, it is worth measuring real prompts rather than estimating from old token counts.

Anthropic has also increased rate limits across Chat, Cowork, Claude Code, and the Claude Platform to accommodate the higher token usage that comes with more capable agentic tasks.

Meanwhile, Fable 5 returns

Anthropic announced that Claude Fable 5 will be restored globally starting July 1, after the U.S. Commerce Department lifted the export controls that had kept it offline since June 12. Fable 5 will be included on Pro, Max, Team, and select Enterprise plans for up to 50% of weekly usage limits through July 7, after which it moves to usage credits. The return comes with updated cybersecurity safeguards. For the full backstory, read our coverage of the original Fable 5 launch and the export controls that took it offline.

What this means for builders

The pattern across AI labs right now is clear: the models that most people use daily are getting dramatically more capable, and the price to access that capability is falling. A few months ago, the kind of multi-step, self-correcting work Sonnet 5 now handles required Opus-tier pricing or higher. Now it is the default for anyone who opens Claude.

For non-technical builders, the practical implication is that the tools built on top of these models, the app-building platforms, the automation services, the no-code workflows, just got meaningfully better without anyone needing to change plans or pay more. When the base model can plan a task, execute multiple steps, and check its own work, the gap between describing what you want and getting working software narrows again.

Sonnet 5 is already live on Emergent. If you have been working on an app idea or refining an existing project, the model powering your builds just improved. Describe what you want to build, and see how the latest model handles it.

Stay tuned to Emergent News for more updates from the world of AI and app building.