OpenAI is examining whether Chinese artificial intelligence (AI) startup DeepSeek improperly obtained data from its models to build a popular new AI assistant, a spokesperson confirmed to The Hill.
The ChatGPT maker said it is “reviewing indications that DeepSeek may have inappropriately distilled” its models. Distillation is a technique for transferring the knowledge of a large model to a smaller one, typically by training the smaller model on the larger model’s outputs.
“We know that groups in the [People’s Republic of China] are actively working to use methods, including what’s known as distillation, to try to replicate advanced U.S. AI models,” the spokesperson said in a statement.
“We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here,” they added.
Distillation does not expose a model’s inner workings and can be used by developers to improve their applications, the spokesperson noted. However, OpenAI’s terms of service bar users from using data obtained through distillation to build competing AI products.
DeepSeek sent shock waves through the American AI industry with the release of its R1 open-source reasoning model last week.
The Chinese startup claims its model performs on par with OpenAI’s latest model and cost just $5.6 million to train with a couple thousand reduced-capacity chips. DeepSeek’s app now tops Apple’s App Store charts after overtaking OpenAI’s ChatGPT.
Microsoft, a close partner of OpenAI, is reportedly also investigating the issue after its researchers noticed individuals potentially linked to DeepSeek extracting large amounts of data through OpenAI’s application programming interface (API) last fall, according to Bloomberg.
White House AI and crypto czar David Sacks claimed Tuesday that there is “substantial evidence” that DeepSeek used distillation to pull information from OpenAI’s models.
“I don’t think OpenAI is very happy about this,” he told Fox News. “I think one of the things you’re going to see over the next few months is our leading AI companies taking steps to try and prevent distillation.”
Commerce secretary nominee Howard Lutnick also accused DeepSeek of ripping off U.S. tech firms and violating U.S. export bans on chips to build its model.
“We need to drive our innovation and we need to stop helping them. You know, open platforms — Meta’s open platform let DeepSeek rely on it. Nvidia’s chips, which they bought tons of, and they found their ways around, drive their DeepSeek model. It’s got to end,” Lutnick told the Senate Commerce Committee during his Wednesday confirmation hearing.