A new white paper out today from Microsoft Corp.’s AI red team details findings on the safety and security challenges posed by generative artificial intelligence systems, along with strategies to address emerging risks.
Microsoft’s AI red team was established in 2018 to address the evolving landscape of AI safety and security risks. The team focuses on identifying and mitigating vulnerabilities by combining traditional security practices with responsible AI efforts.
The new white paper, titled “Lessons from Red Teaming 100 Generative AI Products,” found that generative AI amplifies existing security risks and introduces new vulnerabilities, requiring a multifaceted approach to risk mitigation. The paper emphasizes the importance of human expertise, continuous testing and collaboration to address challenges ranging from traditional cybersecurity flaws to novel AI-specific threats.
The report details three main takeaways. The first is that generative AI systems amplify existing security risks and introduce new ones: The report finds that generative AI models open novel cyberattack vectors while making existing vulnerabilities worse.
Traditional security risks, such as outdated software components or improper error handling, were found to remain critical concerns in generative AI systems. In addition, model-level weaknesses, such as prompt injections, create challenges unique to AI systems.
In one case study, the red team found that an outdated FFmpeg component in a video-processing AI app allowed a server-side request forgery attack, demonstrating how legacy issues persist in AI-powered solutions. “AI red teams should be attuned to new cyberattack vectors while remaining vigilant for existing security risks,” the report states. “AI security best practices should include basic cyber hygiene.”
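To make the server-side request forgery risk concrete, here is a minimal Python sketch of one common style of guard: refusing to fetch a user-supplied URL when it resolves to an internal address. It is an illustration only, not the fix described in the Microsoft paper, which the article does not detail. The function names, the allowed schemes and the injected `resolve` callback are all assumptions made for this example, and a check like this complements, rather than replaces, updating the outdated component.

```python
import ipaddress
from urllib.parse import urlparse

# Assumption for this sketch: only plain web URLs are ever fetched.
ALLOWED_SCHEMES = {"http", "https"}


def is_address_blocked(address: str) -> bool:
    """Return True if the IP address points somewhere a media fetcher
    should not reach (loopback, private ranges, link-local, and so on)."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def is_url_allowed(url: str, resolve) -> bool:
    """Decide whether a user-supplied URL may be fetched.

    `resolve` is a hypothetical callback that maps a hostname to a list
    of IP address strings. In production it would wrap a DNS lookup;
    injecting it keeps this sketch testable without network access.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        return False
    try:
        addresses = resolve(parsed.hostname)
    except OSError:
        # Treat resolution failures as a refusal rather than a pass.
        return False
    # Every resolved address must be acceptable; one internal address
    # is enough to reject the request.
    return bool(addresses) and not any(is_address_blocked(a) for a in addresses)
```

A caller would run `is_url_allowed` before handing a URL to any downstream media tool. A real deployment would also need to connect to the exact address that was validated, since a hostname can resolve differently between the check and the fetch.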
The second takeaway is that humans are at the center of improving and securing AI. Although automation tools are useful for creating prompts, orchestrating cyberattacks and scoring responses, the paper found that red teaming can’t be automated entirely and relies heavily on human expertise.
The white paper argues that subject matter experts play a crucial role in AI red teaming by evaluating content in specialized fields such as medicine, cybersecurity and chemical, biological, radiological and nuclear contexts, where automation often falls short. Although language models can identify generic risks such as hate speech or explicit content, the paper notes, they struggle to assess nuanced domain-specific issues, making human oversight vital to comprehensive risk evaluations.
AI models trained predominantly on English-language data often failed to capture risks and sensitivities in diverse linguistic or cultural settings. Similarly, probing for psychosocial harms, such as a chatbot’s interaction with users in distress, was found to require human judgment to understand the broader implications and potential impacts of such engagements.
The third takeaway is that defense in depth is key to keeping AI systems safe. The paper found that mitigating risks in generative AI requires a layered approach that combines continuous testing, robust defenses and adaptive strategies.
The paper notes that though mitigations can reduce vulnerabilities, they cannot eliminate risks entirely, making ongoing red teaming a critical component in strengthening AI systems. The Microsoft researchers state that by repeatedly identifying and addressing vulnerabilities, organizations can increase the cost of attacks and, in doing so, deter adversaries and raise the overall security posture of their AI systems.