New FFmpeg AVX-512 Optimizations Hit Up To 36x The Performance Of Plain C Code

Commits merged today to FFmpeg Git add more hand-tuned assembly code that uses AVX-512 on capable Intel and AMD processors.

Open-source multimedia developer Niklas Haas today upstreamed additional AVX2 and AVX-512 tuning to FFmpeg, on top of the multimedia library’s already vast array of hand-tuned code leveraging Advanced Vector Extensions.

FFmpeg’s avfilter scene_sad code now has an AVX-512 implementation that runs at 36.31x the speed of the plain C code, according to benchmarks run by Niklas Haas. An existing AVX2 path already achieved about 25x the performance of the common C code; the new AVX-512 path pushes that past 36x.
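For context on what is being vectorized, a sum of absolute differences (SAD) over two 8-bit frames can be sketched in plain C as below. This is only an illustrative scalar reference, not FFmpeg's actual code: the function name `sad_8bit_ref` and its parameter list are hypothetical, and the benchmarked C baseline may be structured differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical scalar reference, not FFmpeg's real function:
 * sum of absolute differences between two 8-bit planes.
 * Strides are given in bytes and may exceed the visible width.
 */
uint64_t sad_8bit_ref(const uint8_t *a, ptrdiff_t a_stride,
                      const uint8_t *b, ptrdiff_t b_stride,
                      int width, int height)
{
    uint64_t sum = 0;

    for (int y = 0; y < height; y++) {
        /* One byte pair per iteration; SIMD paths process many per instruction. */
        for (int x = 0; x < width; x++)
            sum += (uint64_t)abs(a[x] - b[x]);
        a += a_stride;
        b += b_stride;
    }
    return sum;
}
```

The inner loop handles one pixel pair at a time, which is why wide vector instructions can help so much here: a 512-bit register holds 64 such 8-bit values at once.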

Another commit added high bit depth AVX2 and AVX-512 versions of the scene_sad avfilter code. There, the AVX2 path is around 11x faster than the common C code, and the AVX-512 path around 22x.

AVX-512 continues to pay off particularly with the latest AMD Zen 4 / Zen 5 and recent Intel Xeon processors.
