Anthropic released two new Claude models today with a focus on coding and software development.
Claude Opus 4 and Claude Sonnet 4 aim to set “new standards for coding, advanced reasoning, and AI agents,” Anthropic says. The new models can “deliver superior coding” and respond more precisely to user instructions. They can “think” through complex problems more deeply, and search the web along the way.
Opus 4, in particular, is “the world’s best coding model,” Anthropic says, and can operate independently without human intervention. When shopping app Rakuten tested Opus 4, it ran independently for seven hours. Many companies are rapidly adopting AI models this purpose. Microsoft says 30% of its code is already written by AI, and Meta aims for 50% by 2026.
“These models are a large step toward the virtual collaborator—maintaining full context, sustaining focus on longer projects, and driving transformational impact,” says Anthropic.
Anthropic did not increase the price for developers who access the models through its API. Opus 4 is $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15. OpenAI’s o3 model, which also promises “leading performance on coding,” sits between the two at $10/$40.
Claude Code is also now available to everyone with this release. It integrates the AI model into developers’ existing tools, and helps them get their work done. Claude’s proposed edits appear in-line once installed.
It seems like every AI company these days is offering their “biggest and smartest model yet.” Anthropic backs up its claims by noting Claude 4 is the best at two benchmarks, the SWE-bench (72.5%) and Terminal-bench (43.2%). In the chart below, OpenAI models and Google Gemini 2.5 Pro trail in performance.
Recommended by Our Editors
Claude 4 model scores on key software engineering benchmarks. (Credit: Anthropic)
Since AI benchmarks are notoriously difficult for the layperson to understand, Anthropic has resorted to portraying its progress through video games. It built a way for its models to play Pokémon Red autonomously, livestreamed via Twitch. The Sonnet 3.7 model progressed further in the game than Sonnet 3.5, and now Anthropic says the Claude 4 Models are playing the best yet, thanks to a new ability to store “memory files” of key information.
“This unlocks better long-term task awareness, coherence, and performance on agent tasks—like Opus 4 creating a ‘Navigation Guide’ while playing Pokémon,” Anthropic says.
Claude Opus 4 records key information to help improve its game play. (Credit: Anthropic)
Get Our Best Stories!
Your Daily Dose of Our Top Tech News
By clicking Sign Me Up, you confirm you are 16+ and agree to our Terms of Use and Privacy Policy.
Thanks for signing up!
Your subscription has been confirmed. Keep an eye on your inbox!