Anthropic's Claude Opus 4.1 Achieves 74.5% on SWE-bench Verified, Advancing AI Coding Capabilities

Anthropic officially released Claude Opus 4.1 on August 5, 2025, marking a significant upgrade to its flagship artificial intelligence model. The new iteration boasts enhanced performance across critical areas including coding, instruction following, and handling complex queries. According to a recent tweet by Rohan Paul, Claude Opus 4.1 is "really really good" and stands out as "the best non-thinking model and practically matching GPT-5 across on LMArena" for various demanding tasks.

The latest model, Claude Opus 4.1, significantly advances state-of-the-art coding performance, achieving an impressive 74.5% on the SWE-bench Verified benchmark, which assesses real-world coding problems. This represents a notable improvement over its predecessor, Claude Opus 4, particularly in multi-file code refactoring and precision debugging. Anthropic also highlights its enhanced capabilities in in-depth research, data analysis, and agentic task handling, supported by a robust 200,000-token context window.

While the tweet suggests Claude Opus 4.1 is "practically matching GPT-5," industry analysis indicates distinct strengths for each model. Claude Opus 4.1 specializes in precision-focused coding and enterprise development, whereas OpenAI's GPT-5, released around the same time, excels in broader multimodal applications (text, image, audio, video) and offers a larger context window of up to 1,000,000 tokens. Both models represent significant leaps in AI capabilities, catering to different specialized and general-purpose use cases.

Claude Opus 4.1 is currently available to paid Claude users and within Claude Code, as well as through Anthropic's API, Amazon Bedrock, and Google Cloud's Vertex AI. Its pricing remains consistent with the previous Opus 4 version, set at $15 per million input tokens and $75 per million output tokens. The model's integration into platforms like GitHub Copilot further solidifies its position as a powerful tool for developers.

The release of Claude Opus 4.1 underscores Anthropic's commitment to rapid innovation in the competitive AI landscape. With its specialized focus on complex coding and agentic tasks, the model is poised to become an indispensable asset for developers and enterprises seeking highly accurate and reliable AI assistance in software engineering and data-intensive workflows. This advancement signals a continued push towards more capable and specialized AI systems.