Open-Source & Rust-Written Burn MATMUL Kernels Can Compete With NVIDIA's CUDA/cuBLAS

The developers of Burn, the open-source, Rust-based deep learning framework from Tracel AI, have shared that their open-source matrix multiplication kernels can compete with and even outperform NVIDIA CUDA cuBLAS. Plus Burn isn’t limited to NVIDIA GPUs: it can run on most hardware and drivers, including through a Vulkan back-end.

On Friday the Burn developers published a lengthy blog post going over their MATMUL kernel performance relative to NVIDIA CUDA cuBLAS/CUTLASS and showing some really splendid results for this cross-platform, open-source Rust deep learning framework.

For those wanting to get straight to the exciting part:

“On CUDA, our Simple algorithm is remarkably fast and stable, nearly always outperforming the cuBLAS/CUTLASS reference. However, the MultiRow variant truly stands out in the end; it is also the top performer across the board on Vulkan.”

Some really enticing data. Those wanting to learn more about the Burn MATMUL kernel performance can see the Burn.dev blog post.

I hadn’t looked at Burn until a Phoronix reader pointed it out, but I’ll be checking out their open-source software, namely burn-bench, for use in possible future benchmarks.
