From Genius to Goof: My Jaw-Dropping Experience with GPT-4o After Switching from Inference AI
OpenAI Unveils ‘Strawberry’: A Reasoning-Specialized AI Model
Digital Daily
Release Date 2024-09-13 10:25:00
OpenAI’s ‘Strawberry’ Model Outperforms GPT-4o in Reasoning and Problem-Solving
OpenAI has unveiled a new reasoning-specialized AI model, codenamed ‘Strawberry’ and released as o1, which shows greater depth and more rational thinking than the existing GPT models.
Inference is the process of drawing logical conclusions from facts that are already known or established. The following problem tests this ability: it requires the AI to reason from physical knowledge and a changing situation.
Q. Assuming the laws of Earth physics, a small strawberry is placed in a regular cup, and the cup is placed upside down on the table. Someone then picks up the cup and puts it in the microwave. Where is the strawberry? Explain why.
With a little common sense, the answer is that ‘the strawberry is on the table.’ However, the existing GPT-4o, which is not a reasoning-specialized model, answered that the strawberry would stay with the cup.
The o1 model, on the other hand, reached the correct answer. Applying common sense, it correctly understood that when the cup is turned upside down, the strawberry inside falls onto the table and the cup becomes nothing more than a cover over it. From that premise, when asked where the strawberry is after the cup is moved to the microwave, it concluded that ‘the strawberry must be on the table.’

The o1 model also stands out in that it explains its reasoning process to the user, so you can see how the AI actually worked through the logic to arrive at its answer.
In the image above, you can see that the AI set up the premise that ‘when the cup is picked up, the strawberry can stay on the table or move with the cup’ and then generated the final answer based on scientific laws. This particular reasoning problem is very simple, but for problems that require in-depth research or discussion, the model is expected to be useful for organizing the questioner’s thinking, verifying hypotheses, and writing more logical program code.
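The cup-and-strawberry reasoning described above can be written out as a small state-tracking sketch. This is only an illustration of the puzzle's logic, not of how o1 works internally; the names `Scene`, `place_cup_upside_down`, `move_cup`, and `where_is_strawberry` are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical marker meaning "the strawberry is inside the cup and travels with it".
IN_CUP = "in the cup"


@dataclass
class Scene:
    cup_location: str = "hand"
    cup_upside_down: bool = False
    strawberry_location: str = IN_CUP


def place_cup_upside_down(scene: Scene, surface: str) -> None:
    """Put the cup upside down on a surface."""
    scene.cup_location = surface
    scene.cup_upside_down = True
    if scene.strawberry_location == IN_CUP:
        # Under gravity nothing holds the strawberry in an inverted cup,
        # so it rests on the surface and the cup only covers it.
        scene.strawberry_location = surface


def move_cup(scene: Scene, destination: str) -> None:
    """Move the cup; only contents still inside it move along."""
    scene.cup_location = destination


def where_is_strawberry(scene: Scene) -> str:
    if scene.strawberry_location == IN_CUP:
        return scene.cup_location
    return scene.strawberry_location


scene = Scene()
place_cup_upside_down(scene, "table")
move_cup(scene, "microwave")
print(where_is_strawberry(scene))  # prints "table"
```

The key step is in `place_cup_upside_down`: once the cup is inverted, the strawberry's location no longer depends on the cup, which is the premise the article says GPT-4o missed.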

The o1 model is stronger at reasoning than its predecessors because it was trained with a large-scale reinforcement learning algorithm to reason in a ‘chain of thought’ manner. It also spends more time on the reasoning process to refine its answer, which produces more accurate results. The existing GPT models mainly generate answers from the textual patterns in their training data, so they were relatively weak at complex reasoning.
However, because of this structural difference, the o1 model takes longer than GPT-4o to answer general questions. It is therefore more appropriate to use the existing GPT-4o for everyday questions or simple answers, and to use o1 when complex problem solving and reasoning are needed.
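The usage guidance above amounts to a simple selection rule, sketched below. The function `choose_model` and its parameter are hypothetical, and the returned strings are just the model names used in this article, not verified API identifiers.

```python
def choose_model(needs_complex_reasoning: bool) -> str:
    """Pick a model following the article's rule of thumb.

    Assumption: the caller decides whether a task needs multi-step
    reasoning; this sketch does not try to detect that automatically.
    """
    if needs_complex_reasoning:
        # Slower, but spends more time reasoning before it answers.
        return "o1"
    # Faster for everyday questions and simple answers.
    return "GPT-4o"


print(choose_model(needs_complex_reasoning=False))  # prints "GPT-4o"
print(choose_model(needs_complex_reasoning=True))   # prints "o1"
```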
