
ARM NEON Accelerated CRC64 Optimization Shows Nearly 6x Improvement

News Room
Last updated: 2026/03/17 at 6:01 AM
News Room Published 17 March 2026

A patch posted today to the Linux kernel mailing list adds an ARM64-optimized CRC64-NVMe implementation, delivering a nearly 6x speed-up over the generic implementation in the developer's Cortex-A72 benchmark.

Open-source developer Demian Shulhan wrote this NEON-optimized CRC64 implementation, which joins the other architecture-specific CRC64 implementations such as those for x86_64 and RISC-V. The speed-up is intended to benefit NVMe and other storage subsystems, where the generic CRC64 code is a bottleneck.

Shulhan explained in the patch that the nearly 6x gain was measured on an Arm Cortex-A72 SoC. He wrote:

“Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR software implementation is slow, which creates a bottleneck in NVMe and other storage subsystems.

The acceleration is implemented using C intrinsics (arm_neon.h) rather than raw assembly for better readability and maintainability.

Key highlights of this implementation:
– Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency spikes on large buffers.
– Pre-calculates and loads fold constants via vld1q_u64() to minimize register spilling.
– Benchmarks show the break-even point against the generic implementation is around 128 bytes. The PMULL path is enabled only for len >= 128.
– Safely falls back to the generic implementation on Big-Endian systems.

Performance results (kunit crc_benchmark on Cortex-A72):
– Generic (len=4096): ~268 MB/s
– PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)”
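For readers unfamiliar with the "generic shift-and-XOR" approach mentioned in the quote, the following is a minimal portable sketch of a bit-at-a-time CRC-64/NVME computation. It is not the kernel's generic code and not part of the patch: the function names are hypothetical, and the parameters (reflected polynomial 0x9A6C9329AC4BC9B5, all-ones initial value and final XOR) are the commonly published CRC-64/NVME parameters rather than anything taken from the patch text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC-64/NVME polynomial 0xAD93D23594C93659. */
#define CRC64_NVME_POLY_REFLECTED 0x9A6C9329AC4BC9B5ULL

/*
 * Hypothetical helper: update the raw CRC state one bit at a time.
 * Each input byte is XORed into the low bits, then the state is shifted
 * right eight times, XORing in the polynomial whenever a 1 bit falls out.
 */
static uint64_t crc64_nvme_update_bitwise(uint64_t state, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);
    while (len--) {
        state ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            state = (state >> 1) ^ ((state & 1) ? CRC64_NVME_POLY_REFLECTED : 0);
    }
    return state;
}

/*
 * Hypothetical wrapper: applies the all-ones initial value and final XOR.
 * Pass crc = 0 to start a new checksum, or a previous return value to
 * continue over the next piece of a buffer.
 */
static uint64_t crc64_nvme_bitwise(uint64_t crc, const void *buf, size_t len)
{
    return ~crc64_nvme_update_bitwise(~crc, buf, len);
}
```

Processing one bit per loop iteration is what makes this style of implementation slow. A PMULL-based version instead folds 16-byte blocks with carry-less multiplications, which is where the quoted speed-up comes from.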

It’s surprising that it took until now for the Linux kernel to see an ARM64/NEON-optimized CRC64 implementation, especially as the patch is just a little more than one hundred lines of code.
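The control flow described in the patch highlights (a 128-byte threshold and 4KB chunking) can be illustrated with a rough sketch. This is an assumption-laden illustration, not the patch itself: all names are hypothetical, the "fast" routine is a placeholder that simply reuses the portable bitwise code so the example runs on any machine, and the kernel's scoped_ksimd() section is represented only by a comment and a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PMULL_MIN_LEN 128u   /* break-even length quoted in the patch description */
#define SIMD_CHUNK    4096u  /* 4KB chunk size quoted in the patch description */

/* Portable stand-in for the generic path (bit-at-a-time, reflected polynomial). */
static uint64_t crc64_generic(uint64_t crc, const uint8_t *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1) ? 0x9A6C9329AC4BC9B5ULL : 0);
    }
    return crc;
}

/*
 * Placeholder for the accelerated routine. A real version would fold the
 * data with PMULL instructions; this sketch just calls the generic code.
 */
static uint64_t crc64_fast_placeholder(uint64_t crc, const uint8_t *p, size_t len)
{
    return crc64_generic(crc, p, len);
}

/*
 * Hypothetical dispatcher. Short buffers take the generic path. Longer
 * buffers are processed in chunks of at most 4KB, so that each SIMD
 * section stays short. *simd_sections counts the sections entered.
 */
static uint64_t crc64_dispatch(uint64_t crc, const uint8_t *p, size_t len,
                               unsigned *simd_sections)
{
    assert(p != NULL || len == 0);

    if (len < PMULL_MIN_LEN)
        return crc64_generic(crc, p, len);

    while (len > 0) {
        size_t chunk = len < SIMD_CHUNK ? len : SIMD_CHUNK;

        /* In the kernel, this call would sit inside its own SIMD section. */
        crc = crc64_fast_placeholder(crc, p, chunk);
        if (simd_sections)
            (*simd_sections)++;
        p += chunk;
        len -= chunk;
    }
    return crc;
}
```

Because a CRC state can be carried from one chunk to the next, splitting a large buffer this way does not change the result. It only bounds how long each accelerated section runs, which is the stated reason for the chunking.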

[Image: benchmark results of the NEON CRC64 implementation]

The patch is now out for review on the Linux kernel mailing list.
