World of Software
Copyright © All Rights Reserved. World of Software.

Agentica Project’s Open Source DeepCoder Model Outperforms OpenAI’s O1 on Coding Benchmarks

News Room
Last updated: 2025/06/17 at 9:15 AM
News Room Published 17 June 2025

The Agentica Project and Together AI have released DeepCoder-14B-Preview, an open source AI coding model based on DeepSeek-R1-Distill-Qwen-14B. The model achieves a 60.6% pass rate on LiveCodeBench, outperforming OpenAI’s o1 model and matching the performance of o3-mini.
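
For readers unfamiliar with pass-rate metrics, here is a minimal sketch of the widely used unbiased pass@k estimator, of which Pass@1 is the k = 1 case. The article does not say exactly how the 60.6% figure was computed, so treat this as an illustration of the metric in general; the function names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed all tests."""
    if n - c < k:
        # Fewer than k failing samples: every draw of k samples contains a pass.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Average pass@1 over problems; each entry is (samples drawn, samples correct)."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, averaged over the benchmark.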

DeepCoder-14B-Preview is fine-tuned from the DeepSeek model on a dataset of 24K coding problems using reinforcement learning (RL). The developers modified the verl distributed RL framework, improving end-to-end training efficiency by 2x. They released all artifacts associated with creating the model: code, data, training logs, and their improvements to verl. They evaluated the model on several coding benchmarks, including LiveCodeBench, Codeforces, and HumanEval, and on the math benchmark AIME 2024. DeepCoder performed strongly on all of them, with scores “comparable” to or even better than those of closed source reasoning models such as o1 and o3-mini. According to the project team:

Our goal is to democratize RL training for LLMs…By fully sharing our dataset, code, and training recipe, we empower the community to reproduce our work and make RL training accessible to all. We believe advancing RL scaling is a collective, community-driven endeavor, and we welcome open-source contributions and sponsorships. Let’s work together to push the frontiers of RL for LLM reasoning—and beyond!

The DeepCoder team published details about their training process and the problems they overcame. The first was a lack of “high-quality, verifiable” training data for coding problems: several popular datasets were “noisy or contained unverifiable problems,” or were simply too easy for models to solve. To create a training dataset, the team developed an automated pipeline that keeps only problems with a verifiable solution and at least five unit tests.
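
The filtering rule described above can be sketched roughly as follows. This is not the team's pipeline: the `Problem` structure, the in-process test format, and the function names are all assumptions made for illustration. Only the two criteria (a solution that verifies, and at least five unit tests) come from the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MIN_TESTS = 5  # threshold stated in the article


@dataclass
class Problem:
    # Hypothetical record layout, not the project's actual schema.
    prompt: str
    reference_solution: Optional[Callable[..., object]]
    tests: list[tuple[tuple, object]] = field(default_factory=list)  # (args, expected)


def solution_passes(problem: Problem) -> bool:
    """Return True only if the reference solution passes every unit test."""
    if problem.reference_solution is None:
        return False
    for args, expected in problem.tests:
        try:
            if problem.reference_solution(*args) != expected:
                return False
        except Exception:
            # A crashing reference solution makes the problem unverifiable.
            return False
    return True


def filter_problems(problems: list[Problem]) -> list[Problem]:
    """Keep problems with enough tests and a solution that verifies against them."""
    return [p for p in problems if len(p.tests) >= MIN_TESTS and solution_passes(p)]
```

A real pipeline would run untrusted solutions in a sandbox with time limits rather than calling them in-process as this sketch does.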

They also addressed an RL training bottleneck in “sampling,” i.e. running inference on the model being trained. The solution was to pipeline the process: run training and inference in parallel, and use the inference output for the next batch of training. This reduced the training iteration time by 1.4x.
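
The idea of overlapping sampling with training can be illustrated with a small producer-consumer sketch. It is a toy built on Python threads and a bounded queue, not the verl implementation; `run_pipelined`, `sample_batch`, and `train_step` are hypothetical names.

```python
import queue
import threading
from typing import Callable


def run_pipelined(
    num_iterations: int,
    sample_batch: Callable[[int], object],
    train_step: Callable[[object], object],
) -> list:
    """Train on batch i while a background thread samples batch i + 1."""
    batches: queue.Queue = queue.Queue(maxsize=1)  # buffer at most one batch ahead

    def sampler() -> None:
        for i in range(num_iterations):
            batches.put(sample_batch(i))  # blocks while the buffer is full
        batches.put(None)  # sentinel: no more batches

    worker = threading.Thread(target=sampler, daemon=True)
    worker.start()

    history = []
    while (batch := batches.get()) is not None:
        history.append(train_step(batch))
    worker.join()
    return history
```

In a setup like this the sampler runs ahead of the trainer, so a batch may be generated before the most recent weight update has been applied. The sketch does not model weights at all, and in a real system sampling and training would run on separate accelerators rather than Python threads.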

LiveCodeBench Pass@1 Accuracy vs Model Size. Image Source: Together AI Blog

In a Reddit discussion about the model, one user wrote:

I just gave the q4 quant of the 14b version on ollama a try and I have to say that I’m very impressed. It’s definitely the best model I’ve tried in this size. I’d need more testing to conclude if it’s really as good as o3-mini low (particularly as I only have ever tested o3-mini medium), but it definitely feels like it’s beyond 4o in my initial testing on my day-to-day tasks.

Andrew Ng’s newsletter The Batch praised DeepCoder, saying:

Applying reinforcement learning to coding works, but it has two big issues: (i) Training examples of verifiable code are relatively scarce and (ii) computing reward signals for code is time-consuming, since it requires evaluating many test cases. DeepCoder-14B-Preview’s optimizations reduced this complexity, shrinking RL training from months to weeks. Those optimizations are built into Verl-pipeline, an open source RL library from Together.AI and Agentica, giving developers a powerful tool for model training.


Kudos to the DeepCoder team for open sourcing their reasoning recipe! A handful of companies have developed the know-how to execute RL well, but many teams still have trouble implementing successfully. Open recipes for RL training methods and data curation techniques are important to move the field forward.

The DeepCoder-14B-Preview training code is available on GitHub. Model files can be downloaded from Hugging Face.
