The Impact Of Parameters On LLM Performance | HackerNoon

Last updated: 2025/03/07 at 1:22 AM

News Room Published 7 March 2025

Authors:

(1) Wanyun Cui, Shanghai University of Finance and Economics, with equal contribution;

(2) Qianle Wang, Shanghai University of Finance and Economics, with equal contribution.

Table of Links

Abstract and 1 Introduction

2 Related Work

3 Quantifying the Impact of Parameters on Model Performance & 4 Unified Mixed-Precision Training

5 Prevalence of Parameter Heterogeneity in LLMs

6 Quantization Experiments and 6.1 Implementation Details

6.2 Effect of Base LLM Quantization

6.3 Effect of Chat LLM Quantization

6.4 Comparison of Parameter Selection Criteria, Conclusion, & References

3. Quantifying the Impact of Parameters on Model Performance

4. Unified Mixed-Precision Training

The insights gained from Figure 1 highlight the heterogeneity in model parameters. The cherry parameters, despite constituting less than 1% of the total parameter count, exert a substantial influence on the model. Indiscriminately quantizing these cherry parameters alongside the normal parameters may lead to a significant deterioration in model performance.

To mitigate the impact of cherry parameters on quantization, we propose to preserve their high-precision values during the quantization process. By maintaining the fidelity of these critical parameters, we ensure that the essential information they capture is not compromised.
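As a rough illustration of keeping cherry parameters in high precision, the sketch below quantizes a flat list of weights while leaving the highest-impact entries untouched. It is not the authors' implementation. The function names, the per-parameter `impact` scores (a stand-in for whatever impact measure is used to identify cherry parameters), and the simple min-max uniform grid are all assumptions made for this example.

```python
def uniform_quantize(w, bits, lo, hi):
    """Round w to the nearest point of a uniform grid with 2**bits levels on [lo, hi]."""
    levels = 2 ** bits - 1
    if hi == lo:
        return lo
    scale = (hi - lo) / levels
    q = round((w - lo) / scale)
    q = max(0, min(levels, q))  # clamp to the valid integer range
    return lo + q * scale


def quantize_keep_cherry(weights, impact, cherry_fraction=0.01, bits=3):
    """Quantize all weights except the highest-impact ones (hypothetical helper).

    weights: list of floats
    impact:  list of floats, one assumed impact score per weight
    Returns the mixed-precision weights and the set of cherry indices.
    """
    n = len(weights)
    k = max(1, int(n * cherry_fraction))
    # Cherry parameters: the k entries with the largest impact score.
    ranked = sorted(range(n), key=lambda i: impact[i], reverse=True)
    cherry = set(ranked[:k])

    normal = [weights[i] for i in range(n) if i not in cherry]
    if not normal:
        return list(weights), cherry

    # The grid is fitted to the normal parameters only, so an outlying
    # cherry value does not stretch the quantization range.
    lo, hi = min(normal), max(normal)
    mixed = [
        w if i in cherry else uniform_quantize(w, bits, lo, hi)
        for i, w in enumerate(weights)
    ]
    return mixed, cherry


weights = [0.1, -0.2, 5.0, 0.3]
impact = [0.1, 0.2, 9.0, 0.05]
mixed, cherry = quantize_keep_cherry(weights, impact, cherry_fraction=0.25, bits=2)
```

In this toy input the third weight has by far the largest score, so it is kept at its original value while the other three are snapped to a 2-bit grid.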

Optimizing mixed-precision parameters in LLMs presents a unique challenge. The widely adopted GPTQ approach [8], which falls under the Post-Training Quantization (PTQ) framework [14], struggles to simultaneously optimize high-precision cherry parameters and low-precision normal parameters. This is because updating the cherry parameters during the PTQ process significantly affects the model, causing the optimal values of the normal parameters to shift. However, in the PTQ framework, once parameters are quantized, they cannot be updated further. This limitation prevents the parameters quantized at an early stage from reaching their optimal values. On the other hand, if we do not allow updates to the cherry parameters during the PTQ process [17], the quantized model loses the flexibility provided by these critical parameters.

To address this challenge, we propose a novel approach that unifies the optimization of mixed-precision parameters. Our method leverages a Quantization-Aware Training (QAT) framework, which allows for the simultaneous optimization of both cherry parameters and normal parameters. During backpropagation, the high-precision cherry parameters are updated using standard gradient descent, while the low-precision normal parameters employ the Straight-Through Estimator (STE) trick [3] for low-precision gradient descent. This unified backpropagation enables end-to-end optimization of both cherry parameters and normal parameters, enhancing the overall optimization effect. We show the quantization in Algorithm 1.
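The unified update described above can be sketched on a toy linear model. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1. The names `fake_quantize`, `effective_weights`, and `train_step`, the fixed grid `step`, the mean-squared-error loss, and the toy data are all hypothetical choices made for the example. Every weight has a full-precision latent value; the forward pass rounds the normal weights to the grid and uses the cherry weights as they are, and the backward pass applies the gradient with respect to the rounded weights directly to the latent values, which is the STE approximation.

```python
def fake_quantize(w, step):
    """Round w to the nearest multiple of step (used in the forward pass only)."""
    return round(w / step) * step


def effective_weights(latent, cherry, step):
    """Weights seen by the forward pass: cherry as-is, normal rounded to the grid."""
    return [w if i in cherry else fake_quantize(w, step) for i, w in enumerate(latent)]


def train_step(latent, cherry, step, xs, ys, lr):
    """One SGD step on mean squared error for the linear model y = w . x."""
    w_eff = effective_weights(latent, cherry, step)
    n = len(xs)
    grads = [0.0] * len(latent)
    loss = 0.0
    for x, y in zip(xs, ys):
        pred = sum(w * xi for w, xi in zip(w_eff, x))
        err = pred - y
        loss += err * err / n
        for j, xj in enumerate(x):
            grads[j] += 2.0 * err * xj / n  # gradient w.r.t. the effective weight
    # STE: treat d(fake_quantize)/d(latent) as 1, so the gradient computed for
    # the rounded weight is applied unchanged to the latent weight. Cherry
    # weights receive an ordinary gradient step because they were never rounded.
    new_latent = [w - lr * g for w, g in zip(latent, grads)]
    return new_latent, loss


# Toy data (an assumption): one-hot inputs, so each weight is fitted independently.
xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ys = [1.37, 0.5, -0.25]   # targets; the last two lie on the 0.25 grid
cherry = {0}              # index 0 is kept in high precision
latent = [0.0, 0.0, 0.0]
for _ in range(100):
    latent, loss = train_step(latent, cherry, step=0.25, xs=xs, ys=ys, lr=0.3)
w_final = effective_weights(latent, cherry, step=0.25)
```

Both kinds of parameter are updated in the same backward pass, so the normal weights can still move to a different grid point after the cherry weight changes, which is not possible once a parameter has been frozen by a post-training scheme.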
