Anticipated ARC-AGI Evaluation Score Updates Nearing, Giotto.ai Leads Current Leaderboard at 24.58%

The artificial intelligence research community is anticipating significant updates to the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) evaluation scores, potentially within the next month. This expectation stems from a recent tweet by Chris, who stated, "One month from now we might see some new scores on the Arc AGI eval." These upcoming scores are crucial indicators of progress in the ongoing ARC Prize 2025 competition, a global effort to advance Artificial General Intelligence.

The ARC-AGI benchmark, created by François Chollet in 2019, is designed to measure fluid intelligence in AI systems, focusing on their ability to reason and solve novel problems without extensive prior knowledge. Unlike many benchmarks that test domain-specific knowledge, ARC-AGI presents tasks that are relatively easy for humans yet remain exceptionally challenging for even the most advanced AI models, making it a critical barometer of progress toward genuine AGI capabilities. The benchmark has evolved, with ARC-AGI-2 introduced to further stress-test state-of-the-art AI reasoning systems.
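
To make the task structure concrete, here is a minimal sketch in Python, assuming the publicly documented ARC task JSON format (a few demonstration "train" pairs and one or more "test" pairs of small integer grids, with color values 0-9). The tiny grids and the commented file path below are illustrative placeholders, not real benchmark tasks.

```python
import json

# Illustrative task in the documented ARC format: "train" and "test" are
# lists of {"input": grid, "output": grid} pairs, where a grid is a list
# of rows of integer color codes 0-9. These grids are toy examples.
example_task = {
    "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]},
    ],
    "test": [
        {"input": [[3, 0], [0, 3]], "output": [[0, 3], [3, 0]]},
    ],
}

def describe_task(task: dict) -> None:
    """Print how many demonstration and test pairs a task has, with grid sizes."""
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            rows, cols = len(pair["input"]), len(pair["input"][0])
            print(f"{split}[{i}]: input grid {rows}x{cols}")

describe_task(example_task)

# A real task would typically be loaded from disk, e.g.:
# with open("path/to/task.json") as f:  # hypothetical path
#     task = json.load(f)
```

A solver must infer the transformation from the few demonstration pairs alone and apply it to the test input, which is what makes the benchmark a test of fluid reasoning rather than memorized knowledge.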

The ARC Prize 2025 competition, which commenced on March 26, 2025, and is set to conclude on November 3, 2025, utilizes the ARC-AGI-2 dataset. Participants aim to achieve an ambitious 85% accuracy on the private evaluation dataset to unlock the grand prize. The competition emphasizes open-sourcing solutions to accelerate collective progress towards AGI.

As of the latest updates, Giotto.ai leads the Kaggle leaderboard for ARC Prize 2025 with a score of 24.58%, up from the 22.36% cited in a recent press release in which the company highlighted its proprietary deep reasoning architecture. While this marks a significant achievement, it underscores the substantial gap between current AI performance and both the 85% grand-prize target and, more broadly, human-level fluid intelligence.

Past breakthroughs include OpenAI's o3 model, which achieved 75.7% on the earlier ARC-AGI-1 benchmark in late 2024. However, the ARC-AGI-2 iteration presents new challenges, requiring novel approaches beyond mere scaling. The anticipated score releases will provide further insights into the current state of AI reasoning and the ongoing quest for Artificial General Intelligence.