Anthropic has announced Claude 3.7 Sonnet, a new AI model the company describes as a “hybrid AI reasoning model.” Claude 3.7 Sonnet can give real-time answers as well as slower, more considered responses, and users can choose how long the AI “thinks” about a given question. The approach is meant to simplify the user experience by removing the need to pick between multiple models of varying cost and capability.
Claude 3.7 Sonnet will roll out to all users and developers on Monday, but its reasoning abilities will be available only to subscribers of Anthropic’s premium plans. Free users will get a standard version of the model, which Anthropic says surpasses its predecessor, Claude 3.5 Sonnet. The model costs $3 per million input tokens and $15 per million output tokens, making it more expensive than competitors such as OpenAI’s o3-mini and DeepSeek’s R1. Unlike those models, however, Claude 3.7 Sonnet is a hybrid that offers both quick answers and extended reasoning in a single model.
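To make the pricing concrete, the per-token rates quoted above can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the function name `estimate_cost` and the constant names are hypothetical, and it assumes a request is billed simply as input tokens plus output tokens at the quoted rates. How tokens spent on “thinking” are counted is not covered in this article.

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = Decimal("3")
OUTPUT_PRICE_PER_MILLION = Decimal("15")
TOKENS_PER_MILLION = Decimal("1000000")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the dollar cost of one request (hypothetical helper).

    Assumes billing is linear in token counts at the quoted rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 2,000-token prompt that produces a 500-token answer.
    print(estimate_cost(2_000, 500))
```

Under these assumptions, output tokens cost five times as much as input tokens, so long responses dominate the bill.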
Claude 3.7 Sonnet introduces “reasoning,” a technique that other AI labs are also turning to as traditional methods of improving performance show diminishing returns. Reasoning models break problems down into smaller steps to improve the accuracy of the final answer, though this does not equate to human-like thinking. Anthropic eventually wants the model to decide on its own how long to “think” about a question, without users having to control it.
Anthropic says Claude 3.7 Sonnet exposes its internal planning through a “visible scratch pad” that reveals its thought process, although some parts may be redacted for safety reasons. The model is optimized for complex tasks such as coding problems and agentic work, and developers can adjust parameters to favor faster or higher-quality outputs depending on their needs.
In testing, Claude 3.7 Sonnet outperformed rival models on several benchmarks. It scored 62.3% accuracy on the SWE-Bench coding test, compared with 49.3% for OpenAI’s o3-mini, and 81.2% on the TAU-Bench retail interaction test, against 73.5% for OpenAI’s o1 model. The model also makes 45% fewer unnecessary refusals than earlier iterations, a change that comes as some AI labs reassess their content restriction strategies.
In addition to Claude 3.7 Sonnet, Anthropic is unveiling Claude Code, an agentic coding tool available as a research preview. The tool lets developers run tasks directly from their terminal, including modifying a codebase and testing a project, while Claude explains its changes in real time.
Claude Code will initially be available on a limited “first come, first serve” basis. Anthropic’s release of Claude 3.7 Sonnet comes amidst rapid advancements in AI model deployment and signals the company’s intent to secure a leading position within the industry. However, future competition looms as OpenAI prepares to release its own hybrid AI model in the coming months.