DeepSeek today released a new large language model family, the R1 series, that’s optimized for reasoning tasks.
The Chinese artificial intelligence developer has made the models' source code available on Hugging Face.
The LLM lineup is headlined by two algorithms called R1 and R1-Zero. According to DeepSeek, the former model outperforms OpenAI’s o1 across several reasoning benchmarks. R1-Zero, meanwhile, is less capable but represents a potentially significant advancement in machine learning research.
Both LLMs feature a mixture-of-experts, or MoE, architecture with 671 billion parameters. An MoE model comprises multiple neural networks that are each optimized for a different set of tasks. When the model receives a prompt, a mechanism known as a router sends the query to the neural network best equipped to process it.
The main benefit of the MoE architecture is that it lowers inference costs. When users enter a prompt into an MoE model, the query doesn’t activate the entire AI but only the specific neural network that will generate the response. As a result, R1 and R1-Zero activate less than one tenth of their 671 billion parameters when answering prompts.
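The sketch below illustrates the routing idea in simplified form: a small router scores a handful of toy experts and only the top-scoring ones run for each token. The layer sizes, expert count and top-k value are illustrative assumptions, not details of DeepSeek's 671-billion-parameter design.

```python
# A minimal sketch of sparse mixture-of-experts routing, not DeepSeek's
# actual implementation. Sizes and top-k are illustrative assumptions.
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                            # x: (tokens, d_model)
        scores = self.router(x)                      # (tokens, n_experts)
        weights, picked = scores.softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run, so most parameters stay inactive
        # for any given token -- the source of MoE's inference savings.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = picked[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```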
DeepSeek trained R1-Zero using a different approach than the one researchers usually take with reasoning models.
Reasoning-optimized LLMs are typically trained using two methods known as reinforcement learning and supervised fine-tuning. The former technique teaches an AI model to perform a task through trial and error. Supervised fine-tuning, in turn, boosts the AI’s output quality by providing it with examples of how to carry out the task at hand.
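In rough terms, the two training signals look quite different in code. The toy example below contrasts them: supervised fine-tuning nudges the model toward a worked example token by token, while reinforcement learning samples an answer, scores it and reweights its probability. The tensor shapes, vocabulary size and reward check are illustrative assumptions, not DeepSeek's training code.

```python
# A toy contrast of the two training signals; not DeepSeek's training code.
# Tensor shapes, vocabulary size and the reward check are illustrative.
import torch
import torch.nn.functional as F

vocab = 100
logits = torch.randn(1, 5, vocab, requires_grad=True)  # (batch, sequence, vocab) from some model

# Supervised fine-tuning: match a worked example token by token.
demonstration = torch.randint(0, vocab, (1, 5))         # tokens of a human-written solution
sft_loss = F.cross_entropy(logits.view(-1, vocab), demonstration.view(-1))

# Reinforcement learning: sample an attempt, score it, reweight its log-probability.
dist = torch.distributions.Categorical(logits=logits)
attempt = dist.sample()                                  # the model's own answer
reward = float(attempt[0, -1] == 42)                     # toy check: "is the final token right?"
rl_loss = -reward * dist.log_prob(attempt).sum()

print(float(sft_loss), float(rl_loss))
```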
While training R1-Zero, DeepSeek skipped the supervised fine-tuning stage. Nevertheless, the company managed to equip the model with reasoning skills such as the ability to break down complex tasks into simpler sub-steps.
“It is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT,” DeepSeek researchers detailed. “This breakthrough paves the way for future advancements in this area.”
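The researchers don't spell out in the article how those reasoning skills were rewarded, but a purely RL-based setup of this kind typically relies on simple programmatic checks rather than human-labeled demonstrations. The function below is one plausible, deliberately simplified reward; the tag format and scoring values are assumptions for illustration only.

```python
# One plausible rule-based reward for RL-only training of a reasoning model.
# The tag format and scoring values are assumptions for illustration; the
# article does not describe DeepSeek's actual reward design.
import re

def reward(completion: str, reference_answer: str) -> float:
    score = 0.0
    # Format check: did the model separate its reasoning from its final answer?
    if re.search(r"<think>.*</think>", completion, re.DOTALL):
        score += 0.1
    # Accuracy check: does the final answer match the known reference?
    final = completion.split("</think>")[-1].strip()
    if final == reference_answer.strip():
        score += 1.0
    return score

print(reward("<think>2 + 2 is 4</think> 4", "4"))  # 1.1
```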
Although R1-Zero has an advanced feature set, its output quality is limited. The model's responses sometimes suffer from “endless repetition, poor readability and language mixing,” DeepSeek's researchers wrote. The company created R1 to address those limitations.
R1 is an enhanced version of R1-Zero that was developed using a modified training workflow. This workflow makes use of supervised fine-tuning, the technique that DeepSeek left out during the development of R1-Zero. The company says that this change helped significantly boost output quality.
DeepSeek compared R1 against four popular LLMs using nearly two dozen benchmark tests. According to the company, its model outperformed OpenAI's reasoning-optimized o1 LLM on several of the benchmarks. In most of the benchmarks where o1 scored higher, R1 trailed it by less than 5%.
One of the benchmarks in which R1 outperformed o1 is LiveCodeBench. It’s a collection of programming tasks that is regularly updated with new practice problems. This makes it less likely that AI models will find ready-made answers to the problems on the public web.
Alongside R1 and R1-Zero, DeepSeek today open-sourced a set of less capable but more hardware-efficient models. Those models were “distilled” from R1, which means that some of the LLM’s knowledge was transferred to them during training.
The distilled models range in size from 1.5 billion to 70 billion parameters. They’re based on the Llama and Qwen open-source LLM families. DeepSeek says that one of the distilled models, R1-Distill-Qwen-32B, outperforms the scaled-down OpenAI-o1-mini version of o1 across several benchmarks.
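Conceptually, that distillation step means using R1's own answers as training targets for the smaller models. The sketch below shows the idea with a hypothetical teacher_generate() helper standing in for the full R1 model; DeepSeek's actual data mix and fine-tuning recipe aren't described in the article.

```python
# A minimal sketch of response-based distillation. teacher_generate() is a
# hypothetical stand-in for querying the large R1 teacher model.
def teacher_generate(prompt: str) -> str:
    # Placeholder: in practice this would call R1 and return its full
    # response, reasoning steps included.
    return "<think>step-by-step reasoning...</think> final answer"

def build_distillation_set(prompts: list[str]) -> list[tuple[str, str]]:
    # The teacher's outputs become supervised targets for the smaller student.
    return [(p, teacher_generate(p)) for p in prompts]

pairs = build_distillation_set(["Prove that ...", "Write a function that ..."])
# A Qwen- or Llama-based student would then be fine-tuned on these
# (prompt, target) pairs with an ordinary supervised objective.
for prompt, target in pairs:
    print(prompt, "->", target[:40])
```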