OpenAI believes its data was used to train DeepSeek’s R1 large language model, multiple publications reported today.
DeepSeek is a Chinese artificial intelligence provider that develops open-source LLMs. R1, the latest addition to the company’s model lineup, debuted last week. Its release triggered a broad selloff in AI stocks that sent Nvidia Corp.’s shares plummeting 17% on Monday and dragged down many other technology stocks.
LLMs such as R1 are developed using large amounts of training data. OpenAI told the Financial Times today that it believes some of this data was generated by its LLMs. That would breach OpenAI’s terms of service, which prohibit customers from using its LLMs to train competing models.
According to Bloomberg, the potential data misuse was first spotted by Microsoft Corp. in the fall. The company’s cybersecurity researchers determined that “individuals they believe may be linked to DeepSeek” downloaded large amounts of data via OpenAI’s application programming interface. Microsoft notified the ChatGPT developer, which subsequently blocked the API access of the users in question.
OpenAI said in a statement that it believes R1 may have been trained using a method known as distillation.
With distillation, developers enter prompts into an LLM and use its output to train a new model. This process transfers the first LLM’s knowledge to the second. Training models in this manner can significantly reduce the costs associated with AI projects.
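The distillation workflow described above can be sketched in a few lines. This is an illustrative toy, not DeepSeek’s or OpenAI’s actual pipeline: `teacher_model` is a hypothetical stand-in for API calls to a large teacher LLM, and the “training” step is reduced to collecting prompt-output pairs.

```python
# Illustrative distillation sketch: a teacher model's outputs become
# the supervised training data for a student model.

def teacher_model(prompt: str) -> str:
    # Hypothetical stand-in for an API call to a large teacher LLM.
    return f"answer to: {prompt}"

def build_distillation_dataset(prompts):
    # Each (prompt, teacher output) pair becomes a supervised example
    # for the student, transferring the teacher's knowledge.
    return [(p, teacher_model(p)) for p in prompts]

dataset = build_distillation_dataset(["What is 2+2?", "Define MoE."])
for prompt, target in dataset:
    # In a real pipeline, the student would be fine-tuned to map
    # prompt -> target; here we only show the data format.
    print(prompt, "->", target)
```

The appeal of the approach is visible even in this sketch: generating the dataset requires only inference calls to the teacher, which is far cheaper than assembling human-labeled training data from scratch.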
In a research paper, DeepSeek researchers stated that R1 took $5.6 million worth of graphics processing unit hours to train. It’s believed this sum doesn’t include the GPUs’ purchase price and certain model development expenses. Nevertheless, Nvidia’s shares dropped 17% on concerns that R1 and future LLMs may not require as many graphics cards to run as previously expected.
R1 is based on the industry-standard transformer LLM design and uses a processing approach called mixture of experts, or MoE. Standard LLMs activate all their parameters when they process a prompt. MoE models like R1 activate only a small fraction of their parameters, which significantly reduces their hardware usage and thereby lowers inference costs.
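The routing idea behind MoE can be illustrated with a toy example. This is a minimal sketch, not R1’s architecture: the gating function and “experts” are trivial stand-ins, and the expert count and top-k value are arbitrary illustrative choices.

```python
import math

# Toy mixture-of-experts router: only the top-k scoring experts run
# for a given input, so most parameters stay inactive.

NUM_EXPERTS = 8   # illustrative; real MoE models vary
TOP_K = 2         # experts activated per token

def router_scores(token: float):
    # Stand-in for a learned gating network: one score per expert.
    return [math.sin(token * (i + 1)) for i in range(NUM_EXPERTS)]

def moe_forward(token: float):
    scores = router_scores(token)
    # Select the k highest-scoring experts; the rest are skipped entirely.
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    # Each expert here is a trivial stand-in computation.
    outputs = [scores[i] * token for i in top]
    return sum(outputs), top

value, active = moe_forward(0.5)
print(f"activated {len(active)} of {NUM_EXPERTS} experts")
```

Because only `TOP_K` of the `NUM_EXPERTS` expert computations run per input, compute per token scales with the active fraction rather than the full parameter count, which is the source of the inference savings described above.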
R1 was trained using a different approach than many earlier transformer-based LLMs.
Usually, researchers train LLMs with two techniques called reinforcement learning and supervised fine-tuning. While developing R1, DeepSeek scaled back its use of the latter method and mainly used reinforcement learning to equip the model with reasoning skills.
In reinforcement learning projects, developers give an LLM a set of training tasks similar to the ones it will perform in production. When the model completes a training task successfully, it receives points. Incorrect responses cause the LLM to lose points. This feedback enables the model to gradually learn the correct way of completing the tasks it receives.
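The point-based feedback loop described above can be sketched with a toy example. This is a deliberately simplified bandit-style illustration, not DeepSeek’s actual RL setup: the “model” is just a score table, and the task is reduced to picking the right one of two actions.

```python
import random

# Toy reward-driven training loop: the agent earns a point for the
# correct action, loses one otherwise, and its running scores steer
# it toward what works. Purely illustrative.

random.seed(42)
scores = {"A": 0, "B": 0}   # running points per action
CORRECT = "B"               # the training task's right answer

for step in range(50):
    if random.random() < 0.2:                    # occasionally explore
        action = random.choice(list(scores))
    else:                                        # otherwise exploit the best score
        action = max(scores, key=scores.get)
    scores[action] += 1 if action == CORRECT else -1

print("learned preference:", max(scores, key=scores.get))
```

Even in this stripped-down form, the dynamic matches the description: wrong choices are penalized, correct ones are rewarded, and over many training tasks the feedback alone pushes the agent toward correct behavior without explicit supervised labels.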
DeepSeek optimized R1 for reasoning tasks such as generating code and solving math problems. OpenAI offers its own reasoning-optimized LLM series headlined by o3, a model it previewed last month. In internal tests, o3 set records across several of the world’s most difficult AI benchmarks.
“As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models,” OpenAI said in a statement. “We believe as we go forward that it is critically important that we are working closely with the U.S. government to best protect the most capable models from efforts by adversaries and competitors to take U.S. technology.”