Five teenage contestants achieved perfect scores at the International Mathematical Olympiad in Queensland, edging past gold-level 35-point showings by Google DeepMind’s Gemini and an experimental OpenAI reasoning model.

Google revealed that its latest Gemini version cracked five of the contest’s six proof problems within the strict four-and-a-half-hour window, matching the 35/42 points that define a gold medal. 

OpenAI said its own prototype model reached the same score after former Olympiad medalists reviewed the AI’s step-by-step proofs. Even so, five human competitors, all under 20, produced flawless solutions to every one of the six problems, reinforcing the event’s reputation as a playground where youthful insight still outpaces silicon logic.

The Olympiad challenges entrants with deep algebra, combinatorics, geometry, and number-theory questions that demand elegant, multi-page proofs rather than quick numerical answers.

Organizers welcomed the AI progress, noting that last year Google’s system needed several days of computing to solve four problems, while the new run finished inside contest time. 

“Their solutions were clear and precise,” said IMO president Gregor Dolinar, though he cautioned that the exact computing resources involved, and whether the models received any human assistance, remain unknown.

Roughly ten percent of the 641 students from 112 countries earned gold medals, placing the two AI models in elite company but still short of perfection. 

Analysts say the milestone hints at future collaborations in which large language models help mathematicians brainstorm conjectures or draft formal proofs, yet the gap illustrates how pattern recognition alone cannot fully replace human intuition—particularly when an unexpected trick unlocks a problem.

Google DeepMind and OpenAI plan to publish technical papers so academics can reproduce the results, while IMO organizers are debating whether to introduce an AI division or tighten secrecy around future problem sets. 

For now, the scoreboard reads: humans 42, machines 35—a reminder that sharp pencils and fresh minds continue to set the pace in the world’s toughest math meet.