Strawberry AI Is Here: OpenAI Releases ‘o1’ Advanced Reasoning Models



After months of anticipation, OpenAI has unveiled its new series of advanced reasoning models, previously known as Strawberry AI, under the ‘o1’ name. The new models include OpenAI o1, OpenAI o1-preview, and OpenAI o1-mini. The preview and mini models are now available to ChatGPT Plus subscribers, with OpenAI o1-mini expected to be accessible to free ChatGPT users at a later date.

According to OpenAI, the o1 models take additional time to generate responses, but they excel at “reasoning through complex tasks” and tackling difficult problems in fields like math, science, and coding. The company also claims that these new reasoning models perform on par with PhD students on challenging benchmark tasks in physics, chemistry, and biology.


For comparison, the OpenAI o1 model correctly solved 83% of the problems on a qualifying exam for the International Mathematical Olympiad (IMO), whereas GPT-4o solved only 13%. In Codeforces programming competitions, the o1 model reached the 89th percentile, while GPT-4o was at the 11th percentile.


The OpenAI o1 model scored 92.3% on the MMLU benchmark and 94.8% on the MATH benchmark. OpenAI asserts that in tasks requiring extensive reasoning, the o1 model’s performance closely matches that of human experts, which is a notable achievement.

The o1 models use a chain-of-thought technique enhanced by reinforcement learning. This approach involves breaking a task down into simpler steps and trying different strategies on each step until the correct conclusion is reached. Currently, o1 models support only text input and cannot browse the web or analyze files and images.
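As a rough analogy for the "break it into steps, try strategies until one works" idea, here is a toy Python sketch. It is not OpenAI's implementation: o1's reasoning happens inside a trained model, while this is an ordinary loop. The task (finding integer square roots) and the names `guess_half`, `linear_search`, `solve_step`, and `solve` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple

# A strategy proposes a candidate answer for one step, or None if it has no idea.
Strategy = Callable[[int], Optional[int]]


def guess_half(target: int) -> Optional[int]:
    # Deliberately weak strategy: guess that the root is half the target.
    return target // 2


def linear_search(target: int) -> Optional[int]:
    # Slower but reliable strategy: scan candidates upward.
    for x in range(target + 1):
        if x * x == target:
            return x
    return None


def solve_step(target: int, strategies: List[Strategy]) -> Tuple[int, str]:
    # Try each strategy in turn and keep the first answer that passes the check.
    for strategy in strategies:
        candidate = strategy(target)
        if candidate is not None and candidate * candidate == target:
            return candidate, strategy.__name__
    raise ValueError(f"no strategy solved step {target}")


def solve(targets: List[int], strategies: List[Strategy]) -> List[str]:
    # Treat the overall task as a list of simpler steps and record a trace.
    trace = []
    for target in targets:
        answer, name = solve_step(target, strategies)
        trace.append(f"sqrt({target}) = {answer} via {name}")
    return trace


if __name__ == "__main__":
    for line in solve([4, 49], [guess_half, linear_search]):
        print(line)
```

In this sketch the first step is solved by the quick guess, while the second falls back to the slower search after the guess fails its check. That fallback is the only part meant to mirror the description above.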

