Introducing OpenAI Strawberry šŸ“ o1-preview

Shravan Kumar
4 min read · Sep 13, 2024


OpenAI has started releasing a new series of reasoning models aimed at solving hard, complex problems.

Now OpenAI Strawberry šŸ“ (o1) is out!

What are OpenAI's new o1 models, and how do they "think"? OpenAI yesterday released two preview models, o1-preview and o1-mini, designed to spend more time "thinking" before responding, which OpenAI claims improves their reasoning on complex tasks. Unlike older models, o1 pauses to reason before it answers; that extra moment is about delivering thoughtful, accurate responses, especially for tough questions in math, science, and coding.

After the initial buzz about a limited rollout, o1 is now available, at least in part. OpenAI is collecting user feedback and refining the model on real-world interactions, so expect continued improvements as adoption grows. This model isn't just about generating text; it's about reasoning. With o1, AI is stepping into more advanced problem-solving roles across industries, paving the way for innovative breakthroughs.

šŸ“ Is test-time compute all you need šŸ¤£? This new OpenAI o1 is here, reportedly surpassing human PhD-level accuracy on benchmarks in physics, biology, and chemistry!

šŸ§  The model uses a "hidden" chain-of-thought process, letting it think through problems in a more human-like way (whatever that means šŸ˜…).
šŸ•°ļø This deeper reasoning significantly boosts test-time performance: prolonged processing (10–20 seconds) yields better, more accurate results.
ā›³ The more time the model spends analyzing a task, the stronger and more precise its outcomes tend to be.
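One popular way to turn extra test-time compute into accuracy is self-consistency: sample several reasoning chains and majority-vote on the final answer. OpenAI has not said this is what o1 does; the sketch below, with a faked `solve_once` standing in for a real sampled model call, only illustrates the general idea:

```python
from collections import Counter

def solve_once(problem: str, seed: int) -> str:
    # Stand-in for one sampled reasoning chain; a real system would
    # call an LLM with temperature > 0. Here we fake noisy runs where
    # two out of every three samples land on the right answer.
    fake_chains = {0: "42", 1: "42", 2: "41"}
    return fake_chains[seed % 3]

def solve_with_more_compute(problem: str, n_samples: int) -> str:
    # More test-time compute = more sampled chains; a majority vote
    # over the final answers picks the consensus.
    answers = [solve_once(problem, s) for s in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(solve_with_more_compute("What is 6 * 7?", 9))  # prints "42"
```

With 9 samples the occasional wrong chain ("41") is outvoted, which is the intuition behind "more thinking time, stronger outcomes."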

šŸ“ˆ Performance and Benchmarks
ā›³ Programming: Ranked in the 89th percentile on Codeforces, showcasing advanced problem-solving and coding abilities. It's not just generating code; it's tackling complex problems like a pro. Imagine having an AI partner with real-world problem-solving skills!
ā›³ Mathematics: Placed in the top 500 in the USA Math Olympiad qualifier, solving 74% of problems and exceeding GPT-4o's performance.
ā›³ Science: Surpassed PhD-level experts on physics, biology, and chemistry benchmarks (GPQA).

šŸ’° Pricing alert: For developers, accessing o1 via the API costs $15 per 1 million input tokens and $60 per 1 million output tokens. Why so steep? It's specialized for complex problem-solving; think of it as paying for premium AI intelligence.

šŸ”’ "Reasoning output tokens" are hidden from the user in ChatGPT and the API, but they are still billed (you pay for tokens you never see).
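To make the billing concrete, here is a tiny cost estimator at the published o1-preview rates. The token counts in the example are made up; the key point is that hidden reasoning tokens are billed at the output rate:

```python
# Published o1-preview API rates (per 1M tokens).
INPUT_PER_M = 15.00   # $ per 1M input tokens
OUTPUT_PER_M = 60.00  # $ per 1M output tokens

def o1_cost(input_tokens: int, visible_output: int, reasoning_tokens: int) -> float:
    # Hidden reasoning tokens are billed as output tokens even though
    # the UI never shows them to you.
    billed_output = visible_output + reasoning_tokens
    return (input_tokens / 1e6) * INPUT_PER_M + (billed_output / 1e6) * OUTPUT_PER_M

# Hypothetical request: 2,000 prompt tokens, 500 visible answer tokens,
# and 8,000 hidden reasoning tokens.
print(round(o1_cost(2_000, 500, 8_000), 2))  # prints 0.54
```

Note that in this made-up example the hidden reasoning accounts for most of the bill: $0.48 of the $0.54 total.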

šŸš« At present there is no support for system prompts, streaming, tool use, batch calls, or image inputs.

šŸ’° API access limited to high-tier accounts (min. $1,000 spent)

šŸ“Š Increased output token limits (32,768 for o1-preview, 65,536 for o1-mini), likely to accommodate the hidden reasoning tokens.

OpenAI has also introduced o1-mini, a smaller, faster, and more affordable version of the o1-preview model that's especially good at coding tasks. It's 80% cheaper, making it a great option for developers who need powerful reasoning abilities without breaking the bank.

A good visualization from Tom Yeh, shown below, addresses the question: how might OpenAI have trained the Strawberry šŸ“ (o1) model to spend more time thinking? This is illustrative guesswork on Tom's part about how the model could have been trained, but I believe it was done in a similar way.

šŸ’” In RLHF + CoT, the CoT tokens are also fed to the reward model, which scores them to update the LLM for better alignment; in traditional RLHF, only the prompt and response are fed to the reward model to align the LLM.
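Sketched as code, following Tom Yeh's guess rather than any confirmed OpenAI detail, the difference is simply what text the reward model gets to see. The `reward_model` here is a dummy word-count stand-in, not a real reward model:

```python
def reward_model(text: str) -> float:
    # Dummy scorer for illustration only; a real reward model is a
    # trained network that predicts human preference.
    return float(len(text.split()))

def rlhf_score(prompt: str, response: str) -> float:
    # Traditional RLHF: the reward model scores only prompt + response.
    return reward_model(prompt + "\n" + response)

def rlhf_cot_score(prompt: str, cot: str, response: str) -> float:
    # RLHF + CoT: the chain-of-thought tokens are scored too, so the
    # policy is rewarded for *how* it thinks, not just what it answers.
    return reward_model(prompt + "\n" + cot + "\n" + response)
```

The training loop around this (PPO-style policy updates against the score) would be the same in both cases; only the reward model's input changes.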

šŸ’” At inference time, the model has learned to always start by generating CoT tokens, which can take up to 30 seconds, before starting to generate the final response. That's how the model spends more time thinking!
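In pseudocode terms, this inference loop might look like the sketch below. The `think` and `respond` helpers are hypothetical stubs standing in for token-by-token decoding, not OpenAI's actual implementation:

```python
def think(prompt: str) -> str:
    # Hidden "reasoning" phase: the model emits CoT tokens first.
    # This phase can be long (the 10-30 second pause users observe).
    return "step 1: parse the question; step 2: work it out"

def respond(prompt: str, cot: str) -> str:
    # Visible phase: the final answer is conditioned on the CoT.
    return "final answer"

def generate(prompt: str) -> tuple[str, str]:
    cot = think(prompt)
    answer = respond(prompt, cot)
    return cot, answer

cot, answer = generate("hard question")
print(answer)  # only the answer is shown; the CoT stays hidden (but is billed)
```

This two-phase structure is also why output token limits are so large: the hidden phase consumes output tokens before the visible answer even begins.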

Other important technical details are still missing, such as how the reward model was trained and how human preferences for the "thinking process" were elicited.

Finally, as a disclaimer: this animation represents Tom Yeh's best educated guess, and we can't verify its accuracy at present. We hope someone from OpenAI will step in to correct the chart, because if they do, we'll all learn something useful! šŸ™Œ

Shravan Kumar

Indian | AI Leader | Associate Director @ Novartis | Alumnus, IIT Madras & IIM Bangalore. Follow me for more on AI and Data Science.