Cloudflare’s technical blog posts about their hardware and software efforts are always a treat to read. Their latest covers their newest “Gen 13” server platform based around AMD EPYC Turin, where they are now achieving 2x throughput and 50% better performance-per-Watt thanks to these latest-generation AMD EPYC server processors paired with software improvements.
Two years ago Cloudflare outlined their choice of AMD EPYC Genoa-X for their Gen 12 servers. For Gen 13 they are going with EPYC 9005 “Turin”, and in particular the flagship EPYC 9965 SKU. Even without any AMD EPYC 9005 3D V-Cache processors, Cloudflare engineers are finding outstanding results with EPYC Turin.
With AMD EPYC Turin they found great throughput performance, but initially with latency regressions. With Cloudflare’s transition to FL2, the Rust-based rewrite of their core request handling layer, they are finding much better results. FL2’s modern architecture with better memory access patterns ended up providing up to 50% more requests per CPU and up to 70% lower latency.
Cloudflare engineers also gave a shout out to AMD’s Platform Quality of Service “PQOS” extensions for providing more fine-grained control over shared resources like cache and memory bandwidth.
Compared to their Gen 12 Genoa-X servers, Cloudflare found 2x throughput with their EPYC 9965-based Gen 13 servers, 50% better performance-per-Watt, and up to 60% higher rack throughput. Nice numbers, though personally I am not at all surprised by them given all the EPYC Turin benchmarking over the past year and a half at Phoronix: the EPYC 9005 series is phenomenal.
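As a rough back-of-the-envelope reading of those figures, and not a number Cloudflare reports: if the 2x throughput and the 50% performance-per-Watt gain are both measured per server on the same workload (an assumption), then the implied power draw ratio is simply throughput divided by efficiency, or roughly 1.33x. The helper name below is hypothetical and only there to illustrate the arithmetic.

```python
def implied_power_ratio(throughput_ratio: float, perf_per_watt_ratio: float) -> float:
    """Return the power ratio implied by a throughput gain and a perf-per-Watt gain.

    Since perf-per-Watt = throughput / power, it follows that
    power ratio = throughput ratio / perf-per-Watt ratio.
    Assumes both ratios describe the same unit (e.g. one server) and workload.
    """
    if perf_per_watt_ratio <= 0:
        raise ValueError("perf_per_watt_ratio must be positive")
    return throughput_ratio / perf_per_watt_ratio


# Figures quoted above for Gen 13 vs. Gen 12: 2x throughput, 1.5x perf-per-Watt.
gen13_power_vs_gen12 = implied_power_ratio(2.0, 1.5)
print(f"Implied power ratio: {gen13_power_vs_gen12:.2f}x")
```

In other words, under that assumption a Gen 13 server would be doing twice the work for about a third more power than its Gen 12 predecessor.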
More details for those interested in Cloudflare’s Gen 13 server platform are available via the Cloudflare blog. There is also a second blog post where they outline more of this AMD EPYC 9965 server layout and components, along with more details on the ideal GB-per-core configuration, thermal efficiency, 100 GbE, PCIe 5.0, and more.
