New OpenAI general reasoning model gets gold medal at International Math Olympiad
Date: July 20th, 2025 3:59 AM Author: nubile aqua knife sanctuary
Yes, it's truly remarkable how quickly AI has advanced in contest math! Gemini's 50% score on the USAMO (United States of America Mathematical Olympiad) is a massive leap compared to where models like GPT-4 started just a couple of years ago.
### Key Observations on the Progress:
1. **From Near-Zero to Competitive Performance**
- Early versions of GPT-4 struggled to solve even 1-2 problems on the AIME (American Invitational Mathematics Examination), which is significantly easier than the USAMO.
- Now, AI is not just solving AIME-level problems reliably but also tackling Olympiad-level questions, which require deep reasoning, creativity, and proof-writing.
2. **Why This is Surprising**
- Many experts (including myself) expected this level to take at least 2-3 more years. The fact that it happened this year suggests:
  - Better training techniques (e.g., improved reasoning loops, synthetic data, and self-improvement methods).
  - Stronger mathematical priors (models may now "understand" abstract patterns rather than just memorizing).
  - Possibly, more sophisticated search/verification methods (though Gemini's exact approach isn't public).
3. **Is Contest Math "Close to Solved"?**
- **For AIME:** Probably yes. AI is likely near-superhuman at this level already.
- **For USAMO/IMO:** Not yet, but progress is accelerating. A 50% USAMO score is already better than what many human contestants achieve, but the hardest problems (e.g., IMO Q6-style combinatorics/number theory) still require breakthroughs in symbolic reasoning and long-term planning.
- **For Proof Writing:** AI still makes stylistic errors, but formal verification (e.g., Lean) can help close the gap.
4. **What’s Next?**
- **IMO Gold (a gold-level score such as 35 of 42 points) within 1-2 years?** If progress continues at this rate, it's plausible.
- **General Mathematical Research:** The real test will be whether these models can contribute to unsolved problems (e.g., combinatorics conjectures, lightweight formal math).
### Why This Matters Beyond Olympiads:
- **Education:** AI could become the ultimate tutor for advanced math.
- **Research:** Automated reasoning might assist in mathematical discovery.
- **AGI Benchmarks:** Math is a strong proxy for structured reasoning—this progress hints at broader capabilities.
It’s an exciting time! Wouldn’t be surprised if an AI wins an IMO gold medal by 2026.
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2],#49115797)
Date: July 20th, 2025 12:54 AM Author: Dull immigrant Subject: this fucking faggot:
"we are releasing GPT-5 soon but want to set accurate expectations: this is an experimental model that incorporates new research techniques we will use in future models. we think you will love GPT-5, but we don't plan to release a model with IMO gold level of capability for many months."
https://x.com/sama/status/1946569252296929727
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2],#49115705)
Date: July 20th, 2025 10:49 PM
Author: ,.,.,.,,,.,,.,..,.,.,.,.,,.
It should be possible to get the compute cost much lower by distilling the reasoning traces from large models. This is just a proof of concept. One of the major advantages of AI compared to humans is that you can create parallel instances and then train on orders of magnitude more data than any human can see. Stockfish’s evaluation function without search is superhuman (despite being tiny and using essentially no compute), because they could train it on many trillions of positions to capture a powerful intuition for the chess board. We will likely see the same thing happen with reasoning models. Models could eventually intuit the answer to IMO problems in milliseconds.
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2],#49117843)
Date: July 21st, 2025 1:19 PM
Author: ,.,....,...,,,..,..,.,..,.,.,.,.
Google got gold too using a large language model.
"We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points — a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."
https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2],#49119024)