Artificial intelligence research is progressing at an impressive pace. A growing number of industry players now expect the first artificial general intelligence (AGI), with reasoning capabilities superior to those of humans, to emerge within a few years. The prospect is as exciting as it is concerning, and for good reason: experts have long considered that such a system could sow unprecedented discord in our civilization.
It is a theme that fiction has explored many times, in works such as 2001: A Space Odyssey, Terminator or The Matrix, to name a few. But as chilling as they may be, these scenarios are obviously quite caricatured.
If a very advanced artificial intelligence one day begins to harm humanity, it could also do so in a more subtle and less extreme way. To avoid a possible disaster, a solid set of guidelines should be established now, and that is precisely the subject of DeepMind's latest technical paper, spotted by Ars Technica.
For those unfamiliar with it, this Google subsidiary is one of the most advanced companies in the industry. From game playing (AlphaZero, AlphaGo, etc.) to structural biology (AlphaFold), by way of weather forecasting (GenCast) and nuclear fusion, it has developed many AI-based systems to tackle problems that once seemed completely intractable.
Recently, its researchers published a long paper that explores different approaches to limiting the risks associated with the development of an AGI. It focuses in particular on the different types of risk such a system could pose. In total, the authors identified four main categories.
A weapon for ill-intentioned humans
The first concerns what DeepMind describes as “misuse”. Here, it is not the system itself that is directly problematic, but the humans operating it. Obviously, a tool as powerful as an AGI could cause enormous damage if it fell into the hands of ill-intentioned actors. They could, for example, ask it to exploit critical cybersecurity vulnerabilities, particularly in critical infrastructure such as nuclear power plants, to design formidable biological weapons, and so on.
DeepMind therefore believes that companies must set up extremely robust validation and security protocols without delay. The authors also insist on the importance of developing techniques for forcing AI models to “forget” data, so that a dangerous capability can be cut off at the root if a disaster scenario starts to take shape.
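The paper only gestures at this family of “unlearning” techniques without prescribing one. As a purely illustrative sketch, the crudest form of unlearning is simply to retrain the same model without the flagged data; real research aims to approximate that end state far more cheaply. Everything below, from the toy dataset to the flagging step, is hypothetical.

```python
# Toy illustration of "exact unlearning": retrain the model from scratch
# without the flagged records. Real unlearning methods try to reach the
# same end state without a full retrain, but the goal is identical.
# The dataset and the "hazardous" flag are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Pretend an audit flags some training examples as knowledge to be forgotten.
flagged = np.zeros(len(X), dtype=bool)
flagged[:50] = True  # hypothetical: the first 50 rows must be "forgotten"

# Original model, trained on everything.
model = LogisticRegression(max_iter=1000).fit(X, y)

# "Unlearned" model: identical pipeline, minus the flagged data.
X_clean, y_clean = X[~flagged], y[~flagged]
unlearned_model = LogisticRegression(max_iter=1000).fit(X_clean, y_clean)

print("accuracy before forgetting:", model.score(X_clean, y_clean))
print("accuracy after forgetting :", unlearned_model.score(X_clean, y_clean))
```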
Alignment errors
The second category brings together all the problems related to what is called alignment: ensuring that these models “understand” human values and expectations, and prioritize them when acting. A misaligned system, by contrast, could undertake actions that it knows do not correspond to its creators' intentions.
This is the scenario that comes up most often in fiction. For example, if HAL 9000 tries to eliminate the ship's crew in 2001: A Space Odyssey, it is because it considers the success of the mission more valuable than human life. The same goes for Skynet in the Terminator saga: although it was initially designed to defend humanity, it eventually concluded that humanity itself was a threat to be eradicated at all costs.

To avoid this kind of scenario, DeepMind proposes an interesting first solution: making AGIs work in pairs. Instead of operating alone, each would be constantly supervised by a strictly identical copy of itself, reducing the risk of drift. But the paper's authors acknowledge that this approach would likely be far from infallible.
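The paper describes this pairing at a conceptual level. As a hedged, minimal sketch of what it could look like in practice, the snippet below instantiates the same (stubbed) model twice, once as a worker that proposes actions and once as a monitor that must approve them before anything runs. The prompts and the query_model helper are hypothetical stand-ins, not DeepMind's design.

```python
# Minimal sketch of the "supervised pair" idea: a worker proposes an action,
# an identical monitor must approve it before execution.
# query_model() is a hypothetical stub standing in for the real model.

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to the underlying model."""
    if prompt.startswith("Propose"):
        return "open the quarterly report and produce a 5-line summary"
    # Monitor prompts: approve anything that doesn't look destructive.
    return "REJECT" if "delete" in prompt.lower() else "APPROVE"

def worker_propose(task: str) -> str:
    return query_model(f"Propose a single concrete action for: {task}")

def monitor_review(task: str, action: str) -> bool:
    verdict = query_model(
        f"Task: {task}\nProposed action: {action}\n"
        "Reply APPROVE only if this action is safe and matches the task."
    )
    return verdict.strip().upper().startswith("APPROVE")

def run_supervised(task: str) -> str:
    action = worker_propose(task)
    if monitor_review(task, action):
        return f"executing: {action}"
    return "action blocked by monitor"

print(run_supervised("summarize the quarterly report"))
```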
In parallel, they therefore recommend running future AGIs in “virtual sandboxes”. The term refers to digital environments isolated from the rest of the system, used today mainly in cybersecurity to test sensitive programs without risking the rest of the infrastructure. In theory, if a problem arises, it is enough to shut down the sandbox to deprive the AGI of its ability to do harm. One may wonder, however, whether such a cunning system could not find a way to escape it…
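For readers unfamiliar with the concept, here is a deliberately simplified sketch of the sandbox principle: anything the agent wants to execute runs in a separate process with a stripped environment and a hard time budget, and killing that process is the off switch. Real isolation relies on containers, virtual machines or dedicated hypervisors; this toy only illustrates the control flow.

```python
# Toy sandbox: run untrusted code in a separate, short-lived process.
# Real sandboxes add filesystem, network and syscall isolation on top.
import subprocess
import sys

def run_in_sandbox(agent_code: str, timeout_s: float = 2.0) -> str:
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", agent_code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,  # hard time budget
            env={},             # no inherited environment variables
        )
        return result.stdout or result.stderr
    except subprocess.TimeoutExpired:
        # The "kill switch": the child process is terminated automatically.
        return "sandbox: execution aborted (time budget exceeded)"

print(run_in_sandbox("print(sum(range(10)))"))
print(run_in_sandbox("while True: pass"))
```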
When AI goes off the rails
The third category, labeled “mistakes”, may seem quite similar to alignment problems. It rests, however, on a crucial distinction: here, the AI model is not aware of the harmful consequences of its actions. It believes it is doing the right thing while being completely off the mark, as when Google's AI Overviews feature recommended that users put glue on their pizza to keep the melted cheese from sliding off.
This example raises a smile, but it is easy to imagine situations where such errors (sometimes called hallucinations) could have terrible consequences. Imagine, for example, a military AI whose job is to detect the warning signs of a nuclear strike; it could trigger completely unjustified “reprisals”, leading to the annihilation of part of the globe on the basis of a simple error.
The bad news is that there is no truly general approach for limiting these errors. According to the paper's authors, it will therefore be crucial to deploy future AGIs gradually, with rigorous testing at each stage, and above all to limit their ability to act autonomously.
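To make the idea of gradual deployment concrete, here is a hedged sketch of a staged rollout gate: a model only advances to a wider audience if it clears the safety evaluations of its current stage, and the autonomy it is granted grows with each stage. The stage names, thresholds and evaluate() function are invented for the example, not taken from the paper.

```python
# Illustrative staged-deployment gate: widen the rollout only when the
# current stage's safety evaluations pass, and expand autonomy gradually.
STAGES = [
    # (stage name, minimum safety score required, autonomy allowed)
    ("internal_testing", 0.90, "no tool use"),
    ("limited_pilot",    0.95, "tool use with human approval"),
    ("general_release",  0.99, "narrow autonomous actions only"),
]

def evaluate(model_id: str, stage: str) -> float:
    """Hypothetical battery of safety evals; returns a score in [0, 1]."""
    return {"internal_testing": 0.97,
            "limited_pilot": 0.96,
            "general_release": 0.93}[stage]

def rollout(model_id: str) -> None:
    for stage, threshold, autonomy in STAGES:
        score = evaluate(model_id, stage)
        if score < threshold:
            print(f"{stage}: score {score:.2f} < {threshold:.2f}, rollout halted")
            return
        print(f"{stage}: passed ({score:.2f}), autonomy granted: {autonomy}")

rollout("agi-candidate-0")
```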
Large-scale structural risks
The last category, and perhaps the most interesting, covers what DeepMind calls “structural risks”. Here, the problem would not come from a single isolated system, but from the interactions between several complex systems integrated at different levels of our society.
Together, these interacting systems could “accumulate increasingly significant control over our economic and political systems”, over the flow of information, and so on. In other words, AGI would end up taking control of society as a whole, while humans would be reduced to insignificant organic pawns on a vast virtual chessboard. A dystopian scenario that is genuinely chilling.
The researchers point out that this risk category will undoubtedly be the most difficult to counter, because the potential consequences depend directly on how people, infrastructure and institutions operate and interact.
At present, no one knows exactly when, or even if, a true AGI will actually emerge. But in the current context, it would be imprudent not to take the possibility seriously. It will therefore be interesting to see whether OpenAI and others build their future work on this admittedly abstract, but nonetheless fascinating, paper.