An experimental artificial intelligence model developed by OpenAI has achieved a gold medal-level performance in the International Mathematical Olympiad (IMO), a globally renowned competition for high school students. The breakthrough was announced by OpenAI researcher Alexander Wei, who stated the model successfully solved five out of six problems from the 2025 IMO, earning 35 out of a possible 42 points. This score is sufficient to qualify for a gold medal, marking a significant milestone in AI's advanced reasoning capabilities.
The evaluation of the model's performance was rigorous, with three former IMO medalists independently grading its submitted proofs to ensure accuracy and consensus. Wei clarified that this advanced reasoning LLM is currently an experimental research model and is not planned for public release for several months. Its success underscores OpenAI's approach of leveraging general-purpose reinforcement learning and test-time compute scaling rather than narrow, task-specific methodologies.
This achievement represents a rapid advancement in AI's mathematical prowess, particularly when viewed against previous benchmarks. As noted in a recent social media post by "Chubby♨️," OpenAI has progressed from an approximate 12% success rate on the American Invitational Mathematics Examination (AIME) just 15 months prior to this IMO gold-level performance. An OpenAI executive had previously indicated that earlier models like GPT-4 scored minimally on AIME, while "o1-preview" had reached 50%, signaling a swift improvement trajectory.
The implications of such rapid progress are profound, with speculation arising about AI's future capacity to generate novel mathematical theorems. The original tweet highlighted this potential, stating, > "In just 15 months, they have gone from AIME 12% to gold in the IMO. The conclusion: Next year, the models could be so good that they develop new theorems. Holy moly." This suggests a future where AI could contribute to fundamental mathematical discoveries, pushing the boundaries of human knowledge.
The success of this experimental model positions OpenAI at the forefront of AI research in complex problem-solving and abstract reasoning. While the model remains in research, its performance signals a transformative period for AI's role in fields traditionally dominated by human intellect. The ongoing development of these models is expected to continue impacting various scientific and engineering disciplines.