OpenAI's GPT-5 in Codex CLI Praised for Sustained Performance Over Anthropic's Opus


Yam Peleg, a notable figure in the tech community, recently endorsed OpenAI's GPT-5 model as used in the Codex command-line interface (CLI). In a tweet, Peleg wrote, "The codex cli hype is real, I just tried it. GPT-5 (high) in codex is great," and listed several advantages over a competing model he referred to as "Opus." The assessment reflects how quickly AI-powered coding tools are evolving and how capable large language models have become in developer workflows.

Codex CLI, an open-source command-line tool developed by OpenAI, brings advanced reasoning models directly to the terminal, allowing developers to read, modify, and run code locally. It functions as a lightweight coding agent, designed to help developers build features faster, debug problems, and understand complex codebases. The tool supports various models, with GPT-5 recommended for its superior coding capabilities and advanced reasoning.

OpenAI's GPT-5, released recently, has demonstrated state-of-the-art performance across key coding benchmarks, scoring 74.9% on SWE-bench Verified and 88% on Aider Polyglot. The model is engineered to be a collaborative coding assistant, excelling at generating high-quality code, fixing bugs, and answering questions about complex codebases. According to OpenAI, GPT-5 achieves these results with greater efficiency, using fewer output tokens and tool calls compared to its predecessor, o3.

Peleg's observations align with GPT-5's touted improvements. He noted that the model "stays on track much longer than opus" and "never 'gives up' on your task even if takes a while," which speaks to GPT-5's ability to maintain coherence and persistence through multi-step coding challenges. Peleg also praised its "much longer context window" and its direct, non-argumentative responses, saying it is "not arguing & 'you're absolutely right'ing."

The comparison to "Opus" likely refers to Anthropic's Claude Opus 4.1, a direct competitor in the advanced AI coding model space. Claude Opus 4.1, released in early August, also boasts impressive coding prowess, achieving 74.5% on SWE-bench Verified and excelling in agentic tasks and multi-file code refactoring. Anthropic positions Opus as its most intelligent model, capable of handling complex, long-running tasks and operating within its own Claude Code environment.

The ongoing advancements in models like GPT-5 and Claude Opus signal a rapid evolution in AI-assisted software development. Developers increasingly use these tools for tasks ranging from code generation and debugging to complex architectural planning. Peleg's positive feedback on GPT-5 within the Codex CLI suggests the pairing could streamline development workflows and improve productivity for engineers.