A team of researchers has invented a new approach to developing language models that can improve the privacy of training data.
The researchers, who work at the Allen Institute for Artificial Intelligence, detailed the technology on Wednesday. They’ve named it FlexOlmo.
The better the dataset with which a neural network is trained, the better its output. One way to increase the quality of a training dataset is to augment it with information from multiple organizations. Two medical research institutes, for example, could pool clinical records in a single repository and use it to power a joint AI training project.
In practice, such data sharing isn’t always possible. Regulatory restrictions and cybersecurity challenges often make it impractical to move training data outside a company’s network. FlexOlmo is designed to address that limitation.
According to the technology’s developers, it allows multiple companies to jointly train an AI model without making their respective datasets accessible to one another. FlexOlmo “achieves performance very close” to AI models trained using a single, unified dataset, the researchers wrote in a blog post.
The starting point of a FlexOlmo project is a so-called anchor AI model. Every organization that participates in the project creates its own copy of the anchor AI model and trains it on its internal data. Then, the customized models produced through this process are combined into a single algorithm.
“This design allows data owners to contribute asynchronously without sharing their data,” the researchers explained in a paper.
An AI that comprises multiple neural networks is known as a MoE, or mixture of experts, model. Such models include a component known as a router. When a user enters a prompt, the router determines which of the MoE model’s neural networks is best suited to generate an answer.
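The routing step can be illustrated with a toy example. The sketch below is not FlexOlmo's code. It assumes a simple scheme in which the router holds one weight vector per expert, scores the input against each vector and picks the highest-scoring experts. The names `route`, `router_weights` and `top_k` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token, router_weights, top_k=1):
    """Return (expert_index, probability) pairs for the top_k experts.

    token: list of floats standing in for an input representation.
    router_weights: one weight vector per expert (an assumed layout).
    """
    # Score each expert with a dot product between its vector and the input.
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:top_k]]


# Three hypothetical experts, each with a two-dimensional router vector.
router_weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, -1.0],
]
token = [0.2, 0.9]
chosen = route(token, router_weights, top_k=2)
print(chosen)  # expert 1 is ranked first, expert 0 second
```

Production MoE models typically make this choice per token inside each layer rather than once per prompt, but the selection logic follows the same pattern.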
Training a MoE model’s neural networks on different datasets, the approach taken by FlexOlmo, can decrease the performance of its router. To address that issue, the technology assigns each of the neural networks its own router. When the algorithms are merged into a single MoE model, their routers are merged as well. This arrangement avoids the technical issues that could otherwise emerge.
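One way to picture the merge is as stacking. The sketch below is an assumption about how such a design could work, not the researchers' implementation: each participant ships an expert together with its own router vector, and the combined router is simply the list of those vectors. `ExpertModule`, `merge` and `run` are hypothetical names, and the experts are stand-in functions rather than neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExpertModule:
    """An expert plus the router vector trained alongside it (assumed layout)."""
    name: str
    router_vector: List[float]
    forward: Callable[[List[float]], List[float]]


def merge(modules):
    """Combine independently produced modules into one MoE-style model.

    The merged router is the stack of per-expert router vectors, so no
    joint retraining on pooled data is needed in this toy setup.
    """
    router_matrix = [list(m.router_vector) for m in modules]
    experts = [m.forward for m in modules]
    return router_matrix, experts


def run(token, router_matrix, experts):
    """Send the input to the expert whose router vector scores highest."""
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_matrix]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, experts[best](token)


# Two hypothetical contributors, each trained without seeing the other's data.
public = ExpertModule("public", [1.0, 0.0], lambda t: [2 * x for x in t])
medical = ExpertModule("medical", [0.0, 1.0], lambda t: [-x for x in t])

router_matrix, experts = merge([public, medical])
index, output = run([0.1, 0.8], router_matrix, experts)

# A third contributor can be added later by merging again.
legal = ExpertModule("legal", [1.0, 1.0], lambda t: list(t))
router_matrix3, experts3 = merge([public, medical, legal])
index3, output3 = run([0.1, 0.8], router_matrix3, experts3)
```

Because each router vector travels with its expert, adding the third contributor in this sketch only appends a row to the router. That mirrors the asynchronous contribution the researchers describe.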
The researchers tested whether a hacker could extract training data from a FlexOlmo model’s constituent neural networks. “Our analysis found a low extraction rate of 0.7%,” they wrote. “For comparison, a model overfitted on a small math subset for 100 epochs yielded a 60% extraction rate.”
To evaluate FlexOlmo’s practical applications, the researchers used it to train several AI models with up to 37 billion parameters. During testing, they determined that the technology facilitates 10.1% better model performance than earlier approaches to merging neural networks.
Image: Allen Institute for Artificial Intelligence