OpenAI launched o3-pro, a new version of its most advanced model aimed at delivering more reliable, thoughtful responses across complex tasks. Now available to Pro and Team users in ChatGPT and via API, o3-pro replaces the earlier o1-pro.
Based on the o3 architecture, o3-pro maintains access to tools like Python, file analysis, web browsing, and image interpretation, allowing it to tackle multifaceted problems. The model is designed for users who prioritize correctness and depth over speed. OpenAI cautions that o3-pro responses may take longer to generate than those from lighter models.
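For readers who want to try the model programmatically, a minimal sketch is below. It assumes o3-pro is reachable through the official `openai` Python package's Responses API under the model name `o3-pro`, and that an `OPENAI_API_KEY` environment variable is set; the long client-side timeout reflects the slow-response caveat above.

```python
# Minimal sketch of querying o3-pro via the API.
# Assumptions: the `openai` package (pip install openai), an OPENAI_API_KEY
# in the environment, and the model name "o3-pro" as announced.
import os


def build_request(prompt: str) -> dict:
    """Assemble the request payload for an o3-pro query."""
    return {
        "model": "o3-pro",  # model name as reported in the announcement
        "input": prompt,
    }


def ask_o3_pro(prompt: str) -> str:
    """Send the prompt to o3-pro and return the text of its answer."""
    from openai import OpenAI

    # o3-pro responses can take a long time, so use a generous
    # client-side timeout (in seconds) to avoid premature failures.
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"], timeout=600)
    response = client.responses.create(**build_request(prompt))
    return response.output_text


# Example (requires a valid API key):
# print(ask_o3_pro("Summarize the trade-offs of consensus protocols."))
```

The request is split into `build_request` and `ask_o3_pro` only to keep the payload inspectable separately from the network call; in practice the two can be inlined.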
Expert and academic evaluations show improvements. OpenAI reports that in “4/4 reliability” testing—where a model must answer the same question correctly four times in a row—o3-pro outperformed both o1-pro and base o3. It also scored higher in clarity, instruction-following, and domain-specific strength, particularly in STEM, writing, and business contexts.
Source: help.openai.com
Some users see o3-pro as a practical upgrade. One comment summed it up:
This is just going to be like o1-pro except for o3… not game-changing, but it might cross the threshold on tasks where it previously fell just short, which can lead to big productivity gains.
However, early testers also raised concerns. Slower performance is one drawback:
It is doing alright with algorithmic questions, but it is taking an awfully long time… the Android and macOS apps time out a lot.
Others doubted that the hallucination problem has been addressed:
For me, full o3 was blowing my mind for a while, but recently I have realized how much it hallucinates, and that has become a big problem. I doubt o3-pro solves it. My custom instructions for ChatGPT say to always cite sources when making a claim, including a direct quote, because I hoped this would cut down on hallucinations, but it does not. I am often querying about medical things, and it will very often simply make up numbers or a direct quote that does not exist.
This frustration was echoed in a broader critique:
At this point, I do not need smarter general models for my work. I need models that do not hallucinate, that are faster/cheaper, and that have better taste in specific domains. I think that is where we are going to see improvements moving forward.
Notably, o3-pro does not currently support image generation, Canvas, or temporary chats due to technical limitations. These features remain accessible via other models like GPT-4o and o4-mini.