Safeguards to ensure appropriate use of AI tools have “improved” but vulnerabilities remain, according to a comprehensive test from a government-appointed body.
The AI Security Institute (AISI) tested the safeguarding features of the most advanced AI systems as part of its landmark Frontier AI Trends Report.
The organisation found that while strides have been made to improve safety, every system tested remains “vulnerable” to some form of bypass, and the exact level of protection varies across companies.
The time the AISI needed to find a “universal jailbreak”, a way of getting around a model’s safety rules, rose from minutes in previous tests to several hours, a roughly 40-fold improvement.
“This report offers the most robust public evidence from a government body so far of how quickly frontier AI is advancing,” said Jade Leung, CTO of the AISI and AI adviser to the prime minister.
“Our job is to cut through speculation with rigorous science. These findings highlight both the extraordinary potential of AI and the importance of independent evaluation to keep pace with these developments.”
The analysis also looked at autonomous capabilities. No models tested showed signs of harmful or spontaneous behaviour. However, the report concluded that tracking early signs of such behaviour now is essential as these systems continue to develop.
“This report shows how seriously the UK takes the responsible development of AI. That means making sure protections are robust, and working directly with developers to test leading systems, find vulnerabilities and fix them before they are widely used,” said AI Minister Kanishka Narayan.
“Through the world-leading AI Security Institute, we are building scientific capability inside government to understand these systems as they evolve, not after the fact, and to raise standards across the sector.
“This report puts evidence, not speculation, at the heart of how we think about AI, so we can unlock its benefits for growth, better public services and national renewal while keeping trust and safety front and centre.”
