@nasimborazjani
ID: 1796236801826549760
30-05-2024 17:46:53
7 Tweets
36 Followers
12 Following
6 days ago
🚨 OpenAI's new o1 model scores only 38.2% in correctness on our new benchmark of combinatorial problems, SearchBench (arxiv.org/abs/2406.12172), while 57.1% is possible with GPT-4 and A* MSMT prompting! 🚨