Released last year shortly after the EPYC 9005 “Turin” processor launch was ZenDNN 5.0 for Zen 5-optimized CPU inferencing with the likes of PyTorch and TensorFlow. ZenDNN 5.0 delivers up to a 400% performance uplift according to AMD engineers. Out today is ZenDNN 5.0.1 with further optimizations, particularly around recommendation engines and large language models (LLMs).
ZenDNN 5.0.1 is a small update focused on further pushing the performance envelope for recommender systems and large language models on AMD EPYC CPUs. There is now support for INT8 and INT4 quantized deep learning recommendation models (DLRM) for faster inferencing and lower memory use compared to the BF16 precision previously favored by ZenDNN.
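The source doesn't show ZenDNN's actual quantization API, but the memory argument behind the INT8/INT4 support is straightforward: BF16 stores two bytes per weight, while INT8 stores one (and INT4 packs two weights per byte). A minimal, hypothetical sketch of symmetric per-tensor INT8 quantization illustrates the trade-off; the function names here are illustrative, not ZenDNN's.

```python
# Illustrative sketch only -- not ZenDNN's API. Shows why INT8 quantization
# halves weight memory versus BF16 using simple symmetric per-tensor scaling.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization: w ~= scale * q."""
    scale = float(np.abs(weights).max()) / 127.0  # map max magnitude to int8 range
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# BF16 needs 2 bytes per value; INT8 needs 1 (INT4 would pack 2 values/byte).
bf16_bytes = w.size * 2
int8_bytes = q.nbytes
print(f"memory: {bf16_bytes} B (BF16) -> {int8_bytes} B (INT8), "
      f"{bf16_bytes / int8_bytes:.0f}x smaller")
print(f"max abs quantization error: {np.abs(w - w_hat).max():.4f}")
```

The accuracy cost is the rounding error, bounded by half the quantization step, which is why quantized DLRM inference can stay close to BF16 quality while cutting memory traffic.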
That’s it in terms of the listed changes for ZenDNN 5.0.1. There weren’t any published benchmark numbers from AMD showing the impact of this ZenDNN point release, but going from BF16 to INT4/INT8 can yield a significant boost. In any event, those wanting to use ZenDNN 5.0.1 for optimized CPU-based inferencing with PyTorch and TensorFlow on AMD Zen processors can find the new version via GitHub. The ZenDNN library remains available under the Apache 2.0 open-source license.