AMD GPU Managed Memory Support Merged For The GCC 16 Compiler


When it comes to compiler support for AMD Radeon/Instinct GPUs, much of the emphasis is on the LLVM/Clang compiler stack, with its official AMDGPU LLVM shader compiler back-end as well as the AOMP downstream compiler fork and the like. But the GNU Compiler Collection “GCC” does continue to allow targeting AMD GPUs with its “AMDGCN” back-end, using the likes of the OpenMP API. New AMD GPU activity is not seen too often in GCC, but merged today is support for managed memory.

BayLibre compiler engineer Andrew Stubbs merged the managed memory support into the AMDGCN back-end and the libgomp OpenMP library. This provides managed memory for AMD GPUs when using OpenMP with GCC, and it builds upon the CUDA Managed Memory support recently added to the NVPTX target in GCC.

The functionality is akin to AMD HIP’s hipMallocManaged function, but GCC does not rely on the HIP library itself. It has been tested across a variety of GPUs, from Vega/GFX9 and CDNA through RDNA3 graphics hardware.

Managed memory, as offered by hipMallocManaged and similarly by CUDA Managed Memory, allows data to be shared between the CPU and GPU and accessed concurrently through a single pointer. This builds upon the Heterogeneous Memory Management (HMM) functionality within the Linux kernel.

The updated GCC OpenMP documentation explains the AMD managed memory support as follows:

“This memory is accessible by both the host and the device at the same address, so it need not be mapped with map clauses. Instead, use the is_device_ptr clause or has_device_addr clause to indicate that the pointer is already accessible on the device. The ROCm runtime will automatically handle data migration between host and device as needed. Not all AMD GPU devices support this feature, and many that do require that -mxnack=on is configured at compile time. If managed memory is not supported by the default device, as configured at the moment the allocator is called, then the allocator will use the fall-back setting. If the default device is configured differently when the memory is freed, via omp_free or omp_realloc, the result may be undefined. If the current device does not support Unified Shared Memory (or it is not enabled with HSA_XNACK=1) then Managed Memory might still work, but allocations may only be visible to a single device (whichever was the default device when the first allocation was made).”
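To illustrate the usage pattern the documentation describes, here is a minimal sketch in C. It allocates a buffer, fills it on the host, and then updates it in a target region with is_device_ptr instead of a map clause. Several names here are assumptions rather than anything stated in the article: the allocator is assumed to be exposed as ompx_gnu_managed_mem_alloc in GCC 16, USE_GNU_MANAGED_ALLOC is a hypothetical opt-in macro for this sketch, and scale_and_sum is an illustrative function name. Without the macro the sketch falls back to plain malloc, which is only valid when the target region ends up running on the host.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Assumption: GCC 16 exposes the managed allocator under the name
   ompx_gnu_managed_mem_alloc. USE_GNU_MANAGED_ALLOC is a hypothetical
   macro for this sketch; define it only with a compiler that provides
   that allocator. The malloc fallback is only valid when the target
   region falls back to running on the host. */
#if defined(_OPENMP) && defined(USE_GNU_MANAGED_ALLOC)
#define SHARED_ALLOC(bytes) omp_alloc((bytes), ompx_gnu_managed_mem_alloc)
#define SHARED_FREE(ptr)    omp_free((ptr), ompx_gnu_managed_mem_alloc)
#else
#define SHARED_ALLOC(bytes) malloc(bytes)
#define SHARED_FREE(ptr)    free(ptr)
#endif

/* Fill a buffer on the host, scale it in a target region, and return
   the sum of the scaled values. Returns -1.0 if allocation fails. */
double scale_and_sum(int n, double factor)
{
    double *data = SHARED_ALLOC((size_t)n * sizeof *data);
    if (data == NULL)
        return -1.0;

    /* The host writes through the pointer directly. */
    for (int i = 0; i < n; i++)
        data[i] = (double)i;

    /* No map clause for data: is_device_ptr states that the pointer is
       already usable on the device at the same address. */
    #pragma omp target teams distribute parallel for is_device_ptr(data)
    for (int i = 0; i < n; i++)
        data[i] *= factor;

    /* The host reads the results back through the same pointer. */
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += data[i];

    SHARED_FREE(data);
    return sum;
}
```

Calling scale_and_sum(4, 2.0) scales the values 0, 1, 2, 3 to 0, 2, 4, 6 and returns their sum. On real AMD hardware the code would be built with -fopenmp and AMD GCN offloading enabled, and per the documentation quoted above many devices additionally need -mxnack=on at compile time.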

With the AMDGCN managed memory now in GCC Git, it will be part of next year’s GCC 16.1 stable release.
