Grok 4 Achieves 50.7% Accuracy on Humanity's Last Exam, Leading AI Field

Recent reports indicate that xAI's Grok 4 model is outperforming rivals on competitive benchmarks, particularly in advanced reasoning tasks. Adam Lowisz highlighted the development in a tweet addressed to Elon Musk: "@elonmusk @aarnogau Grok is dominating the competition 🔥." The sentiment echoes a view increasingly shared among AI observers about Grok's latest capabilities.

According to data from mid-2025, Grok 4 scored 50.7% on Humanity's Last Exam (HLE) when using tools, roughly double the scores that competitors such as GPT-4 and Claude posted without tools. The comparison is not like-for-like, since it sets a tool-assisted result against no-tool results, but it still points to strong reasoning ability. Earlier versions also posted strong results: Grok 3 scored 95.8% on the AIME 2024 benchmark and surpassed a score of 1400 on the Chatbot Arena (LMSYS) leaderboard.

Grok's competitive edge is reinforced by features such as real-time internet access and deep integration with the X platform, which let it provide up-to-the-minute information and social media insights. Its "sassy" personality and unfiltered responses also set it apart from more formal AI counterparts. This combination of reasoning performance and access to current data positions Grok as a formidable contender in the rapidly evolving AI landscape.

The artificial intelligence market in 2025 is intensely competitive, with major players like OpenAI's ChatGPT (GPT-4o, GPT-5), Anthropic's Claude (Claude 4, Opus 4), and Google's Gemini (2.5 Pro) constantly releasing updated models. While Grok 4 excels in reasoning and real-time data, other models offer specialized strengths, such as Claude 4 for coding and Gemini 2.5 Pro for multimodal tasks and extensive context windows. The choice of AI often depends on specific user needs and priorities.

The steady stream of releases from xAI and its rivals signals that AI development is still accelerating. As capabilities expand, competition is expected to intensify, driving further innovation and producing increasingly sophisticated tools for applications across industries.