The growing complexity and scale of AI workloads has forced the industry to build increasingly powerful storage solutions. Western Digital has taken an important step in this direction with the OpenFlex Data24 4000 Series NVMe-oF storage platform, equipped with KIOXIA CM7-V Series NVMe SSDs and the PEAK:AIO data server.
This platform has demonstrated high scalability and ease of use without sacrificing performance in AI-related tasks. To back this up, Western Digital validated the platform's performance with MLPerf Storage V2, a benchmark regarded as the industry's gold standard for comparing the performance of storage solutions under AI workloads.
The results show that this architecture not only delivers high performance at scale but also maintains a high degree of efficiency and economy in a practical deployment, without a software-defined storage (SDS) layer.
MLPerf Storage uses GPU client nodes, that is, systems that simulate the behavior of an AI server accessing storage during training or inference. These nodes generate the I/O load patterns typical of real-world AI workloads, so the benchmark can evaluate the performance a storage platform delivers in scenarios distributed across multiple concurrent GPU clients.
The AI training tests in the MLPerf Storage suite measure how effectively the system serves AI workloads that stress different aspects of storage I/O, including throughput and concurrency, across several deep learning models. Two key MLPerf workloads were used:
3D U-Net workloads
This is a deep learning model used for medical imaging and volumetric segmentation. It places a much heavier load on storage systems because of its large 3D input datasets and intense streaming read patterns. As such, it is a stricter reference point for demonstrating sustained high-bandwidth, low-latency performance in multi-node workflows. In this model:
Western Digital's OpenFlex Data24 achieved a sustained read throughput of 106.5 GB/s, saturating 36 simulated NVIDIA H100 GPUs across three physical client nodes, which demonstrates the ability of the EBOF (Ethernet Bunch of Flash) to easily handle highly parallel, bandwidth-intensive training tasks.
With the PEAK:AIO data server, OpenFlex Data24 reached 64.9 GB/s, saturating 22 simulated NVIDIA H100 GPUs from a single head server and a single client node.
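To put the two 3D U-Net figures on a common footing, the reported throughput can be divided by the number of simulated GPUs. The following is a minimal sketch that uses only the numbers quoted above; the function name `per_gpu_throughput` is illustrative, and the even split of bandwidth across GPUs is an assumption, not something the benchmark report states.

```python
# Figures quoted in the article for the 3D U-Net workload.
RESULTS = {
    "OpenFlex Data24, three client nodes": (106.5, 36),
    "OpenFlex Data24 + PEAK:AIO, one client node": (64.9, 22),
}


def per_gpu_throughput(throughput_gb_s: float, simulated_gpus: int) -> float:
    """Average read throughput per simulated GPU in GB/s.

    Assumes the aggregate bandwidth is shared evenly across all GPUs
    (an illustrative simplification).
    """
    if simulated_gpus <= 0:
        raise ValueError("simulated_gpus must be positive")
    return throughput_gb_s / simulated_gpus


for name, (throughput, gpus) in RESULTS.items():
    rate = per_gpu_throughput(throughput, gpus)
    print(f"{name}: {rate:.2f} GB/s per simulated GPU")
```

Under that assumption, both configurations work out to just under 3 GB/s per simulated H100, which suggests the per-GPU demand of the workload is similar in the two setups and that the difference in totals comes from how many GPUs each configuration can feed.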
ResNet50 workloads
This is a convolutional neural network widely used for image classification. It serves as a reference point for training performance, since it represents a balanced combination of data movement and computation. With both random and sequential I/O patterns and medium-sized image reads, it is useful for assessing a system's ability to handle high-frequency access to smaller files and fast iteration cycles. In this model:
Western Digital's OpenFlex Data24 delivered optimal performance across 186 simulated NVIDIA H100 GPUs and three client nodes, with an outstanding GPU-to-drive ratio that reflects the platform's efficient use of physical media.
With the PEAK:AIO data server, OpenFlex Data24 was able to saturate 52 simulated NVIDIA H100 GPUs from a single head server and a single client node.
OpenFlex Data24 uses Western Digital RapidFlex network adapters, which allow up to 12 hosts to be connected without a switch. Kurt Chan, vice president and general manager of Western Digital's Platforms Business, commented:
“These results validate Western Digital's disaggregated architecture as a powerful enabler and cornerstone of next-generation AI infrastructure, maximizing GPU utilization while minimizing footprint, complexity, and total cost of ownership. The OpenFlex Data24 4000 Series NVMe-oF storage delivers near-saturation performance in demanding AI benchmarks, both standalone and with a single PEAK:AIO AI Data Server appliance, which translates into faster results and reduced infrastructure sprawl.”