On Thursday, Anthropic unveiled two new AI models in the Claude 4 lineup: Claude Opus 4 and Claude Sonnet 4. According to Anthropic, Claude Opus 4 is the “world’s best coding model,” delivering consistent performance on extended, complex workflows. Meanwhile, Claude Sonnet 4 offers improved coding and reasoning capabilities compared to its predecessor, Claude 3.7 Sonnet.
Let’s start with the Claude Opus 4 AI model. On the SWE-bench Verified benchmark, which evaluates real-world software engineering tasks, Claude Opus 4 scored 72.5%, edging out OpenAI’s top coding model, Codex-1, which scored 72.1%. Even more impressive, when using parallel test-time compute (similar to Deep Think mode in Gemini 2.5 Pro), Opus 4 reached 79.4%.
Interestingly, the Claude Sonnet 4 model scores 72.7% on SWE-bench Verified, and with parallel test-time compute it reaches 80.2%, surpassing the larger Opus 4 model on this benchmark.
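For readers wondering what “parallel test-time compute” means in practice, the general idea is to sample several candidate solutions at once, discard the ones that fail available tests, and keep the highest-scoring survivor. The snippet below is a minimal, generic best-of-n sketch of that idea, not Anthropic's implementation. The names best_of_n, generate, passes_tests, and score are hypothetical placeholders for a model call, a test runner, and a ranking function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def best_of_n(
    generate: Callable[[int], str],      # hypothetical: produces one candidate patch per seed
    passes_tests: Callable[[str], bool],  # hypothetical: runs visible tests on a candidate
    score: Callable[[str], float],        # hypothetical: ranks the surviving candidates
    n: int = 8,
) -> Optional[str]:
    """Sample n candidates in parallel, drop failures, return the top scorer."""
    # Generate all candidates concurrently; each seed stands for one attempt.
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(generate, range(n)))

    # Rejection step: keep only candidates that pass the tests.
    survivors = [c for c in candidates if passes_tests(c)]

    # Selection step: return the best survivor, or None if every attempt failed.
    return max(survivors, key=score) if survivors else None
```

The trade-off is cost: n attempts use roughly n times the compute of a single attempt, which is why the single-attempt and parallel scores are reported separately.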
Anthropic explains that the Claude Sonnet 4 model “balances performance and efficiency for internal and external use cases, with enhanced steerability for greater control over implementations. While not matching Opus 4 in most domains, it delivers an optimal mix of capability and practicality.”
Claude Opus 4 shines in handling complex, long-duration tasks and agentic workflows, whereas Claude Sonnet 4 delivers a solid blend of coding performance and efficiency. Both are hybrid reasoning models, capable of providing quick responses as well as taking extra time for more in-depth reasoning.
Anthropic also highlights that when Claude Opus 4 has access to local files, it can store important information in a memory file. For instance, while playing Pokémon, the model generated a navigation guide file to enhance its gameplay.
Lastly, regarding safety, Anthropic has activated AI Safety Level 3 (ASL-3) protections for the Claude Opus 4 model, the first time it has done so under its Responsible Scaling Policy (RSP). The company has put in place Constitutional Classifiers and other safeguards to block jailbreaking attempts.
The Claude 4 models are being rolled out to all paid users across Pro, Max, Team, and Enterprise plans. Fortunately, Claude Sonnet 4 is also available to free users, though without the extended thinking feature.