Anthropic PBC upgraded its Claude Sonnet model to version 4.6 today, adding stronger computer-use skills, long-context reasoning, agent planning and improved performance for knowledge work and design.
The company said the release brings the coveted 1 million token context window, recently introduced in its flagship Opus 4.6 model, to Sonnet in beta.
For users on Free and Pro plans, Sonnet 4.6 is now the default, and pricing remains the same at $3/$15 per million input and output tokens, respectively.
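As a rough illustration of what those rates mean per request, the arithmetic can be sketched as follows. Only the $3 and $15 per-million figures come from the announcement. The helper name and the example token counts are hypothetical, and the sketch assumes the flat rates apply to every token, which may not hold for every request type.

```python
# Rates quoted in the article, in U.S. dollars per one million tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at flat per-token rates.

    Hypothetical helper for illustration only; it is not part of any SDK.
    """
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 tokens in, 4,000 tokens out.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, a workload that generates long responses is priced very differently from one that mostly reads large documents.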
According to Anthropic, the performance of the new model now approaches previous Opus-level capabilities, showing major improvement in computer-use skills compared with prior Sonnet models. Sonnet is the mid-tier model in the company’s lineup, balancing performance against cost.
Computer and browser use advancements
Anthropic is particularly focused on the model’s ability to operate computer user interfaces. In October 2024, it introduced a general-purpose computer-use model. Since then, the company has developed that into a built-in capability that can take control of Chrome and work with LibreOffice, VS Code and more.
The company said that since the first computer-use demonstration, Sonnet models have made steady gains. Users have seen human-level capabilities in tasks such as navigating complex spreadsheets or filling out multistep web forms, and then pulling information together across multiple browser tabs.
However, Anthropic said the model still lags behind the most skilled humans at using computers.
Still, given the rate of progress, it is notable that computer use can now complete a range of work, putting a significant number of tasks that humans perform within reach.
“Box evaluated how Claude Sonnet 4.6 performs when tested on deep reasoning and complex agentic tasks across real enterprise documents,” Ben Kus, chief technology officer at Box Inc., commented. “It demonstrated significant improvements, outperforming Claude Sonnet 4.5 in heavy reasoning Q&A by 15 percentage points.”
Anthropic isn’t the only company reaching for this particular constellation of capabilities. Google LLC is also building computer and browser use into its Gemini model, having introduced computer use in Gemini 2.5. OpenAI Group PBC developed a similar paradigm with advanced multistep browser agents, although not quite general computer-level use.
In the meantime, Anthropic released Claude Cowork: a macOS desktop app (with a Windows version coming soon) that allows its AI to read and interact with files on users’ computers. It can act as a proactive teammate, capable of controlling the mouse, keyboard and browser to execute multistep activities such as organizing files, editing documents and browsing the web.
Anthropic noted that when a model has full control of a computer, safety can become a major concern. Risks of hijacking, prompt injection and other attacks become paramount. The company said that it has been working to improve resistance to hallucination and external manipulation.
According to internal safety evaluations, Sonnet 4.6 saw major improvements compared to its predecessor and performed similarly to Opus 4.6.
Product additions for developers and end users
Developers are getting a huge boost from the larger 1 million token context window. Early testers of Claude Code reported that Sonnet 4.6 reads context before modifying code, consolidates logic instead of duplicating it, and avoids the overengineering and “laziness” that earlier models suffered from.
With 1 million tokens, Sonnet 4.6 can ingest entire codebases, even extremely large ones, and see them all at once in order to understand the full scope of their dependencies. That allows it to follow flow paths to greater depths.
For business use, this has equally useful implications because the model can hold lengthy contracts or dozens of research papers in memory at once and reference them as it works and reasons through them.
Within the application programming interface, Sonnet 4.6 now supports both adaptive and extended thinking, as well as context compaction in beta. That lets users tune the trade-off between cost and performance and keep long-running tasks going even when the context fills up. Compaction kicks in when the context window gets too full: The model summarizes the earlier conversation to save space so it can continue without simply dropping, and therefore “forgetting,” the oldest information.
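The general idea behind compaction can be sketched in a few lines. This is a toy illustration under stated assumptions, not Anthropic’s implementation or API: the names `Message`, `count_tokens`, `naive_summarize` and `compact` are all hypothetical, tokens are approximated as whitespace-separated words, and the summarizer is a placeholder for what would in practice be a model-written summary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def total_tokens(messages: List[Message]) -> int:
    return sum(count_tokens(m.text) for m in messages)


def naive_summarize(messages: List[Message]) -> str:
    """Placeholder summarizer: keeps the first five words of each message.

    A real system would ask the model itself to write the summary.
    """
    return " | ".join(" ".join(m.text.split()[:5]) for m in messages)


def compact(
    messages: List[Message],
    budget: int,
    keep_recent: int = 2,
    summarize: Callable[[List[Message]], str] = naive_summarize,
) -> List[Message]:
    """Replace older messages with one summary when the history exceeds the budget.

    This is a single pass: it shrinks the history but does not guarantee
    that the result fits within the budget.
    """
    split = len(messages) - keep_recent
    if total_tokens(messages) <= budget or split <= 0:
        return messages
    old, recent = messages[:split], messages[split:]
    summary = Message("system", "Summary of earlier conversation: " + summarize(old))
    return [summary] + recent


if __name__ == "__main__":
    # Hypothetical conversation: four messages of ten words each.
    history = [Message("user", " ".join(f"w{i}" for i in range(10))) for _ in range(4)]
    compacted = compact(history, budget=20)
    print(len(history), total_tokens(history))
    print(len(compacted), total_tokens(compacted))
```

The point of summarizing rather than truncating is that the oldest turns stay represented in condensed form instead of disappearing from the context entirely.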
Also in the API, Claude’s web search and fetch tools now automatically write and execute code to filter search results. Code execution, web fetch, memory and programmatic tool calling are also now generally available, making them more suitable for production applications.
Model Context Protocol support for Claude in Excel is available for all users with Pro subscriptions and above.
Image: Anthropic
