Copyright © All Rights Reserved. World of Software.

GNU C Library Sees Up To 12.9x Improvement With New Generic FMA Implementation

News Room
Last updated: 2025/11/27 at 6:38 AM
News Room Published 27 November 2025
Just a few days ago I wrote about the Glibc math code seeing a 4x improvement on AMD Zen from changing which FMA implementation is used. Merged overnight was a new generic FMA implementation for the GNU C Library that yields up to a 12.9x throughput improvement on AMD Zen 3.

Adhemerval Zanella contributed this new generic FMA implementation to the GNU C Library. In the patch that landed this new generic fused multiply-add (FMA) implementation, Zanella explained:

“The current implementation relies on setting the rounding mode for different calculations (first to FE_TONEAREST and then to FE_TOWARDZERO) to obtain correctly rounded results. For most CPUs, this adds a significant performance overhead since it requires executing a typically slow instruction (to get/set the floating-point status), it necessitates flushing the pipeline, and breaks some compiler assumptions/optimizations.

This patch introduces a new implementation originally written by Szabolcs for musl, which utilizes mostly integer arithmetic. Floating-point arithmetic is used to raise the expected exceptions, without the need for fenv.h operations.

I added some changes compared to the original code:

* Fixed some signaling NaN issues when the 3-argument is NaN.

* Use math_uint128.h for the 64-bit multiplication operation. It allows the compiler to use 128-bit types where available, which enables some optimizations on certain targets (for instance, MIPS64).

* Fixed an arm32 issue where the libgcc routine might not respect the rounding mode. This can also be used on other targets to optimize the conversion from int64_t to double.

* Use -fexcess-precision=standard on i686.”
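
To give a rough idea of what "mostly integer arithmetic" can look like, here is a minimal C sketch. It is not the glibc or musl code, and every name in it (`u128_parts`, `unpacked_double`, `unpack_double`, `mul64`, `mul64_portable`, `multiply_exact`) is hypothetical. It unpacks two finite doubles into integer fields and multiplies their 53-bit significands exactly, using a 128-bit integer type where the compiler provides one (detected here via the GCC/Clang `__SIZEOF_INT128__` macro) and a portable 32-bit schoolbook fallback otherwise. Rounding the result, adding the third operand, and handling Inf/NaN are all left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for this sketch; not taken from glibc or musl. */
typedef struct { uint64_t hi, lo; } u128_parts;

typedef struct {
    int sign;          /* 1 if negative, else 0 */
    int exponent;      /* value == mantissa * 2^exponent */
    uint64_t mantissa; /* integer significand, implicit bit restored */
} unpacked_double;

typedef struct { int sign; int exponent; u128_parts mantissa; } exact_product;

/* Split a finite double into integer fields without any FP arithmetic. */
static unpacked_double unpack_double(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);          /* well-defined type pun */

    int biased = (int)((bits >> 52) & 0x7FF);
    assert(biased != 0x7FF);                 /* Inf/NaN are out of scope here */

    unpacked_double u;
    u.sign = (int)(bits >> 63);
    u.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    if (biased != 0)
        u.mantissa |= UINT64_C(1) << 52;     /* normal: restore implicit 1 */
    else
        biased = 1;                          /* subnormal: same scale as e=1 */
    u.exponent = biased - 1075;              /* 1023 bias + 52 fraction bits */
    return u;
}

/* Portable fallback: schoolbook multiplication on 32-bit halves. */
static u128_parts mul64_portable(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;

    /* Middle column plus carry from the low product; fits in 64 bits. */
    uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t)ll;
    r.hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return r;
}

/* Full 64x64 -> 128-bit product, using a native 128-bit type if present. */
static u128_parts mul64(uint64_t a, uint64_t b)
{
#ifdef __SIZEOF_INT128__
    unsigned __int128 p = (unsigned __int128)a * b;
    u128_parts r = { (uint64_t)(p >> 64), (uint64_t)p };
    return r;
#else
    return mul64_portable(a, b);
#endif
}

/* Exact a*b for finite doubles: the 106-bit product is never rounded. */
static exact_product multiply_exact(double a, double b)
{
    unpacked_double ua = unpack_double(a), ub = unpack_double(b);
    exact_product p;
    p.sign = ua.sign ^ ub.sign;
    p.exponent = ua.exponent + ub.exponent;
    p.mantissa = mul64(ua.mantissa, ub.mantissa);
    return p;
}
```

Because the product of two 53-bit significands needs at most 106 bits, the 128-bit result holds it exactly, so no rounding-mode changes are needed at this stage. A real FMA would still have to add the third operand and round once at the end.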

This new musl libc-based implementation shows some “large improvements” in tests carried out by Adhemerval Zanella:

[Image: New FMA implementation benchmarks]

In another commit, Adhemerval Zanella summed up the recent math improvements made for Glibc 2.43 as:

“* Additional optimized and correctly rounded mathematical functions have been imported from the CORE-MATH project, in particular acosh, asinh, atanh, erf, erfc, lgamma, and tgamma.

* Optimized implementations for remainder, remainderf, frexpf, frexp, frexpl (binary128), and frexpl (intel96) have been added.

* The SVID handling for acosf, acoshf, asinhf, atan2f, atanhf, coshf, lgammaf/lgammaf_r, log10f, sinhf, sqrtf, tgammaf, y0/j0, y1/j1, and yn/jn were moved to compat symbols, allowing improvements in performance.”

Look for these improvements and more in Glibc 2.43, due for release in February.
