Real-World Code Performance: Multi-Token Finetuning on CodeContests | HackerNoon

Last updated: 2025/07/22 at 12:12 PM

News Room Published 22 July 2025

Table of Links

Abstract and 1. Introduction

2. Method

3. Experiments on real data

4. Ablations on synthetic data

5. Why does it work? Some speculation

6. Related work

7. Conclusion, Impact statement, Environmental impact, Acknowledgements and References

A. Additional results on self-speculative decoding

B. Alternative architectures

C. Training speeds

D. Finetuning

E. Additional results on model scaling behavior

F. Details on CodeContests finetuning

G. Additional results on natural language benchmarks

H. Additional results on abstractive text summarization

I. Additional results on mathematical reasoning in natural language

J. Additional results on induction learning

K. Additional results on algorithmic reasoning

L. Additional intuitions on multi-token prediction

M. Training hyperparameters

F. Details on CodeContests finetuning

We use the Python subset of the CodeContests (Li et al., 2022) train split with reward annotations (“correct” / “incorrect”) and condition on correct solutions at evaluation time. For evaluation, we generate 1000 samples per problem from the test split for each temperature T ∈ {0.5, 0.6, 0.7, 0.8, 0.9}, and compute the unbiased estimator for pass@k from Chen et al. (2021) for each value of k and T. Models pretrained with different losses may have different optimal temperatures for pass@k, so we compute and show k ↦ max_T pass_at(k, T) in Figure 4. In other words, we grant pass@k access to a temperature oracle. For small values of k, pass@k measures the capability of understanding and solving tasks, while for large k it additionally favors diversity in outputs. According to the results in Figure 4, multi-token prediction pretraining leads to finetuned models that are better on both axes.
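The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: it assumes the unbiased estimator of Chen et al. (2021) in its closed form 1 - C(n - c, k) / C(n, k), where n is the number of samples per problem and c the number of correct ones. The function names and the `correct_counts` layout (temperature mapped to per-problem correct counts) are hypothetical choices made for this example.

```python
from math import comb
from statistics import mean


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of generated samples, c: number of correct samples,
    k: sample budget. Computes 1 - C(n - c, k) / C(n, k).
    """
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Fewer than k incorrect samples: every k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def oracle_pass_at_k(correct_counts: dict[float, list[int]], n: int, k: int) -> float:
    """Temperature-oracle pass@k, i.e. max over T of pass_at(k, T).

    correct_counts maps each temperature T to a list holding, for every
    test problem, the number of correct samples out of n (hypothetical layout).
    """
    return max(
        mean(pass_at_k(n, c, k) for c in counts)
        for counts in correct_counts.values()
    )
```

With the setup in this section, n would be 1000 and `correct_counts` would have one entry per temperature in {0.5, 0.6, 0.7, 0.8, 0.9}. Taking the maximum separately for each k lets the best temperature differ between small and large k, which is what the temperature oracle amounts to.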

Authors:

(1) Fabian Gloeckle, FAIR at Meta, CERMICS Ecole des Ponts ParisTech and Equal contribution;

(2) Badr Youbi Idrissi, FAIR at Meta, LISN Université Paris-Saclay and Equal contribution;

(3) Baptiste Rozière, FAIR at Meta;

(4) David Lopez-Paz, FAIR at Meta and a last author;

(5) Gabriel Synnaeve, FAIR at Meta and a last author.
