
A recent social media post has drawn attention to Harmonic, an AI startup whose mathematical reasoning model, Aristotle, solved a problem that Google's Gemini Deep Think reportedly got wrong. Some have described the problem as "competition-level" or even "trivial," and the episode highlights how differently the two systems approach, and perform on, mathematical problem-solving.
"I have seen some takes that the problem that @HarmonicMath’s Aristotle recently solved was competition-level, maybe even trivial. Maybe so! I’d be interested to see it administered on an exam. Another data point though: Gemini Deep Think messed it up," stated the tweet from user "L."
Harmonic, co-founded by Robinhood CEO Vlad Tenev and Tudor Achim, is dedicated to building "Mathematical Superintelligence" (MSI). Its flagship model, Aristotle, is designed for formal mathematical reasoning: it writes its proofs in Lean 4, a proof assistant that checks every step mechanically, which guards against AI hallucinations and yields provably correct solutions. With this approach, Aristotle achieved gold medal-level performance at the 2025 International Mathematical Olympiad (IMO) using formally verified solutions.
In contrast, Google's Gemini Deep Think, an enhanced reasoning mode for its Gemini models, employs "parallel thinking techniques" to explore multiple hypotheses simultaneously when working through complex problems. Gemini 2.5 Deep Think also attained a gold-medal standard at the 2025 IMO, but some reports indicate that it did so with informal solutions written in natural language rather than with formally verified proofs like Aristotle's. Deep Think's reported miss on a problem that Aristotle solved suggests that even advanced generalist AI models still struggle to deliver consistent, verifiable accuracy in specialized domains.
The incident points to a central debate in AI development: how to balance broad, generalist capabilities against specialized, formally verified reasoning. Harmonic's focus on mathematical rigor and hallucination-free outputs has attracted significant investment, with the company recently raising $120 million in Series C funding at a $1.45 billion valuation. This investment reflects growing industry demand for AI systems that provide not just answers, but provably correct solutions in high-stakes scientific and engineering applications.