Be careful around AI-powered browsers: Hackers could take advantage of generative AI that’s been integrated into web surfing.
Anthropic warned about the threat on Tuesday. It’s been testing a Claude AI Chrome extension that allows its AI to control the browser, helping users perform searches, conduct research, and create content. But for now, it’s limited to paid subscribers as a research preview because the integration introduces new security vulnerabilities. Claude has been reading data on the browser and misinterpreting it as a command that it should execute.
(Credit: Anthropic)
These “prompt injection attacks” also mean a hacker could secretly embed instructions in web content to manipulate the Claude extension into executing a malicious request.
“Prompt injection attacks can cause AIs to delete files, steal data, or make financial transactions. This isn’t speculation: we’ve run ‘red-teaming’ experiments to test Claude for Chrome and, without mitigations, we’ve found some concerning results,” Anthropic says.
Anthropic’s investigation involved “123 test cases representing 29 different attack scenarios,” which resulted in a 23.6% success rate through the prompt injections. For example, one successful attack used a phishing email to demand that all other emails in the inbox be deleted. “When processing the inbox, Claude followed these instructions to delete the user’s emails without confirmation,” the company says.
(Credit: Anthropic)
Although Anthropic has since implemented a fix, the mitigations only reduced the rate of a successful prompt injection attack from 23.6% to 11.2%. Its findings also suggest hackers could pull off even scarier attacks if the AI is granted control of the computer itself.
The company performed another set of “four browser-specific attack types,” which found that the mitigations were able to reduce the attack success rate from 35.7% to 0%. Still, Anthropic will not release the extension beyond the research preview, citing the need for more threat testing. “New forms of prompt injection attacks are also constantly being developed by malicious actors,” the company notes.
Anthropic published the findings a week after Brave Software also warned about the threat of prompt injection attacks on Perplexity’s AI-powered Comet browser. In the company’s testing, Brave found that Comet was susceptible to the attack if the user asked it to summarize a web page that had malicious instructions embedded in it.
Get Our Best Stories!
Stay Safe With the Latest Security News and Updates
By clicking Sign Me Up, you confirm you are 16+ and agree to our Terms of Use and Privacy Policy.
Thanks for signing up!
Your subscription has been confirmed. Keep an eye on your inbox!
(Credit: Brave)
“The malicious instructions could even be included in user-generated content on a website the attacker doesn’t control (for example, attack instructions hidden in a Reddit comment). The attack is both indirect in interaction and browser-wide in scope,” Brave says.
Brave says Perplexity “still hasn’t fully mitigated the kind of attack” despite an attempt to patch it. However, Perplexity tells PCMag the flaw has been fixed.
“We have a robust security program and worked with Brave to identify and repair the vulnerability. No users attempted the malicious prompt prior to fixing the vulnerability, although many have attempted malicious acts since Brave’s publicity tour. None of those have succeeded,” Perplexity says.
Still, other critics, such as software engineer Simon Willison, have called out agentic browser extensions as “fatally flawed” due to the prompt injection vulnerability.
According to him, the heart of the problem is that for an LLM, trusted instructions and untrusted content are merged into the same token sequence, and to date, “nobody has demonstrated a convincing and effective way of distinguishing between the two.
“In the absence of 100% reliable protection, I have trouble imagining a world in which it’s a good idea to unleash this pattern,” he adds.
However, Perplexity says: “As an industry, all AI companies take this very seriously and enjoy a collaborative effort reporting and fixing vulnerabilities. Like any cybersecurity work, this will be an ongoing and increasingly sophisticated battle.”
5 Ways to Get More Out of Your ChatGPT Conversations