Artificial intelligence (AI) is not just evolving: it is taking off. In just two and a half years we have gone from GPT-3.5 to GPT-4o, and anyone who has tried both knows that the difference in the conversation experience is enormous. GPT-3.5 marked a before and after by inaugurating the ChatGPT era, but today hardly anyone would go back to it when more advanced models are available.
Now, what does it mean for a model to be more advanced? The answer is complex. We talk about larger context windows (that is, the ability to read and process more information at once), more elaborate results and, in theory, fewer errors. But one point remains thorny: hallucinations. And on that front, progress is not always in the right direction.
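To make the "context window" idea concrete, here is a minimal sketch of how it works in practice: the window is simply a token budget, and you can estimate whether a document fits before sending it to a model. This assumes the tiktoken library; the 128,000-token limit and the function name are illustrative, not tied to any specific model.

```python
# Minimal sketch: a context window is a token budget.
# Assumes the `tiktoken` library is installed; the 128,000-token limit
# below is illustrative, not a claim about any particular model.
import tiktoken

def fits_in_context(text: str, max_tokens: int = 128_000) -> bool:
    """Return True if `text` fits inside a hypothetical context window."""
    enc = tiktoken.get_encoding("cl100k_base")  # a tokenizer used by recent OpenAI models
    n_tokens = len(enc.encode(text))
    print(f"Document uses {n_tokens} tokens out of a budget of {max_tokens}")
    return n_tokens <= max_tokens

if __name__ == "__main__":
    sample = "A long report pasted into the chat... " * 1000
    print(fits_in_context(sample))
```

A larger window simply raises that budget, which is why newer models can ingest entire reports or codebases in a single prompt.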
What are hallucinations? In AI, hallucinating means making things up: answers that sound plausible, even convincing, but that are false. The model does not lie on purpose; it simply generates text based on patterns. If it does not have enough data, it fills in the gaps. And that can go unnoticed. That is the risk.
o3 and o4-mini: more reasoning, more errors. In September last year the so-called reasoning models arrived. They represented an important leap: they introduced a kind of chain of thought that improved their performance on complex tasks. But they were not perfect. o1-pro was more expensive than o3-mini, and not always more effective. Even so, this whole line came with a promise: fewer hallucinations.
The problem is that, according to OpenAI's own data, that is not happening. TechCrunch cites a technical report from the company acknowledging that o3 and o4-mini hallucinate more than their predecessors. Literally. In internal tests with PersonQA, o3 hallucinated in 33% of its answers, roughly double the rate of o1 and o3-mini. o4-mini did even worse: 48%.
Other analyses, such as those from the independent laboratory Transluce, show that o3 even invented actions: it claimed to have executed code on a MacBook Pro outside ChatGPT and then copied the results back. Something it simply cannot do.
A challenge that is still pending. The idea of models that do not hallucinate sounds fantastic: it would be the definitive step toward fully trusting their answers. In the meantime, we have to live with the problem, especially when we use AI for delicate tasks: summarizing documents, looking up data, preparing reports. In those cases, everything should be double-checked.
Because serious mistakes have already happened. The best-known was that of a lawyer who submitted documents generated by ChatGPT to a judge. They were convincing, yes, but also fictional: the model invented several legal cases. AI will keep advancing, but critical judgment, for now, remains our job.
Images | WorldOfSoftware with ChatGPT | OpenAI
In WorldOfSoftware | Some users are using OpenAI o3 and o4-mini to find out where photos were taken: it is a nightmare for privacy
In WorldOfSoftware | If you’ve ever been afraid of a robot chasing you, China has organized a half marathon to put your mind at ease