Linux Patches To Unconditionally Enable Architecture-Optimized BLAKE2s Support

Last updated: 2025/08/28 at 7:29 AM

News Room Published 28 August 2025

While Linus Torvalds rarely likes new kernel options being enabled by default, one area where doing so has proven beneficial is the architecture-optimized crypto algorithm implementations, which those configuring their own kernel builds often overlook. Some enable support for a kernel crypto algorithm but forget, or are unaware, that CPU architecture-specific implementations exist and can typically be enabled for much better performance than the common code. Google engineer Eric Biggers has been cleaning this up, and BLAKE2s is the latest algorithm to receive the treatment.

Eric Biggers, known for his relentless kernel crypto optimizations over the years, sent out a new patch series on Wednesday to clean up the ChaCha and BLAKE2s code. The series also enables the architecture-optimized BLAKE2s code by default, similar to the process other crypto algorithms have gone through.

The most interesting patch in the series is the one that always enables the arch-optimized BLAKE2s code. There he argues:

“When support for a crypto algorithm is enabled, the arch-optimized implementation of that algorithm should be enabled too. We’ve learned this the hard way many times over the years: people regularly forget to enable the arch-optimized implementations of the crypto algorithms, resulting in significant performance being left on the table.

Currently, BLAKE2s support is always enabled (‘obj-y’), since random.c uses it. Therefore, the arch-optimized BLAKE2s code, which exists for ARM and x86_64, should be always enabled too. Let’s do that.

Note that the effect on kernel image size is very small and should not be a concern. On ARM, enabling CRYPTO_BLAKE2S_ARM actually *shrinks* the kernel size by about 1200 bytes, since the ARM-optimized blake2s_compress() completely replaces the generic blake2s_compress(). On x86_64, enabling CRYPTO_BLAKE2S_X86 increases the kernel size by about 1400 bytes, as the generic blake2s_compress() is still included as a fallback; however, for context, that is only about a quarter the size of the generic blake2s_compress(). The x86_64 optimized BLAKE2s code uses much less icache at runtime than the generic code.”

On x86_64, the optimized BLAKE2s code allows SSSE3 and AVX-512 to be used for faster BLAKE2s cryptographic hashing.
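
For readers curious which of these paths their own machine could take, here is a minimal Python sketch that reads the CPU feature flags Linux exposes in /proc/cpuinfo and picks the best candidate, falling back to generic code when neither instruction set is present. The function names and the selection order are assumptions made for illustration, not the kernel's actual logic; the kernel performs its own feature checks at boot, and the exact set of flags it requires for the AVX-512 path may differ from the two tested here.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; every core normally reports the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_blake2s_impl(flags):
    """Hypothetical selection order: AVX-512, then SSSE3, then generic fallback."""
    # Assumption: avx512f plus avx512vl stands in for "AVX-512 is usable".
    if {"avx512f", "avx512vl"} <= flags:
        return "avx512"
    if "ssse3" in flags:
        return "ssse3"
    # Mirrors the idea of generic code remaining available as a fallback.
    return "generic"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or /proc is unavailable: treat as no known flags.
        text = ""
    print(pick_blake2s_impl(parse_cpu_flags(text)))
```

Running the script on a Linux system prints one of avx512, ssse3, or generic. It only indicates what the hardware advertises, not which code path a given kernel build actually uses.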

