World of Software
Copyright © All Rights Reserved. World of Software.

ARM NEON Accelerated CRC64 Optimization Shows Nearly 6x Improvement

News Room
Last updated: 2026/03/17 at 6:01 AM
News Room Published 17 March 2026

A patch posted today to the Linux kernel mailing list adds an ARM64-optimized CRC64-NVMe implementation that runs nearly 6x faster than the generic code in the author's Cortex-A72 benchmark.

Open-source developer Demian Shulhan wrote this NEON-optimized CRC64 implementation, which joins the other architecture-specific CRC64 implementations such as those for x86_64 and RISC-V. The speed-up is intended to benefit NVMe and other storage subsystems, where the generic CRC64 code is a bottleneck.

Shulhan explained the approach in the patch, noting that the nearly 6x gain was measured on an Arm Cortex-A72 SoC. He wrote:

“Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR software implementation is slow, which creates a bottleneck in NVMe and other storage subsystems.

The acceleration is implemented using C intrinsics (arm_neon.h) rather than raw assembly for better readability and maintainability.

Key highlights of this implementation:
– Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency spikes on large buffers.
– Pre-calculates and loads fold constants via vld1q_u64() to minimize register spilling.
– Benchmarks show the break-even point against the generic implementation is around 128 bytes. The PMULL path is enabled only for len >= 128.
– Safely falls back to the generic implementation on Big-Endian systems.

Performance results (kunit crc_benchmark on Cortex-A72):
– Generic (len=4096): ~268 MB/s
– PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)”
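
To make the highlights above more concrete, here is a minimal user-space sketch in C of the kind of logic the patch describes: a bit-at-a-time shift-and-XOR CRC64-NVMe update, a dispatcher that only takes the fast path for len >= 128, and 4KB chunking of large buffers. This is not the code from the patch. The function names are hypothetical, the "fast" routine is a stand-in that simply reuses the generic loop so the example runs on any machine, and the kernel's generic implementation may be organized differently. The polynomial constant is the reflected form of the standard CRC-64/NVME polynomial.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC-64/NVME polynomial 0xAD93D23594C93659. */
#define CRC64_NVME_POLY_REFLECTED 0x9A6C9329AC4BC9B5ULL

/* Thresholds quoted in the patch description. */
#define PMULL_MIN_LEN 128u   /* break-even point against the generic code */
#define SIMD_CHUNK    4096u  /* work done per SIMD section */

/* Shift-and-XOR update, one bit at a time, on the raw CRC state. */
uint64_t crc64_nvme_update_generic(uint64_t crc, const uint8_t *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* -(crc & 1) is all ones when the low bit is set, else zero. */
            crc = (crc >> 1) ^ (CRC64_NVME_POLY_REFLECTED & -(crc & 1));
        }
    }
    return crc;
}

/*
 * HYPOTHETICAL stand-in for the PMULL folding routine. A real version
 * would use NEON intrinsics from arm_neon.h; this one reuses the generic
 * loop so that the sketch stays portable and its results can be checked.
 */
uint64_t crc64_nvme_update_fast(uint64_t crc, const uint8_t *p, size_t len)
{
    return crc64_nvme_update_generic(crc, p, len);
}

/* Dispatcher: short buffers stay generic, long ones are split into chunks. */
uint64_t crc64_nvme_update(uint64_t crc, const uint8_t *p, size_t len)
{
    if (len < PMULL_MIN_LEN)
        return crc64_nvme_update_generic(crc, p, len);

    while (len > 0) {
        size_t n = len < SIMD_CHUNK ? len : SIMD_CHUNK;

        /*
         * In the kernel, each chunk would run inside its own SIMD
         * section, so preemption is only held off for 4KB of work.
         * How the real code handles a short final chunk is not stated
         * in the article; here the stand-in accepts any length.
         */
        crc = crc64_nvme_update_fast(crc, p, n);
        p += n;
        len -= n;
    }
    return crc;
}

/* One-shot CRC-64/NVME: all-ones initial value, inverted on output. */
uint64_t crc64_nvme(const uint8_t *p, size_t len)
{
    return ~crc64_nvme_update(~0ULL, p, len);
}
```

Calling crc64_nvme() on the nine ASCII bytes "123456789" should return 0xAE8B14860A799888, the published check value for CRC-64/NVME. Because CRC state carries across calls, splitting a buffer into 4KB pieces gives the same result as processing it in one pass, which is what makes the chunking safe.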

It’s surprising it took until now to see an ARM64/NEON-optimized CRC64 implementation for the Linux kernel, given that it comes to just a little more than one hundred lines of code.

[Image: benchmark results of the NEON CRC64 implementation]

The patch is now out for review on the Linux kernel mailing list.
