As part of its Project Battlematrix effort, Intel has been improving its Linux graphics driver for multi-device scenarios, with the goal of supporting up to eight Intel Arc Pro graphics cards per system for AI LLMs and other large workloads. The latest code posted by Intel engineers is an initial implementation of multi-device Shared Virtual Memory (SVM) support.
Initial patches posted this weekend begin laying the foundation for multi-device SVM handling, aimed at multi-device GPU compute and similar workloads. This follows other Intel Xe kernel driver preparations for multi-device pinned device memory, along with other Intel multi-device GPU driver patches in recent months.
This initial multi-device SVM support is built around PCI Express Peer-To-Peer (P2P) functionality. Intel engineer Thomas Hellström explained in Saturday’s 15-patch series:
“This series aims at providing an initial implementation of multi-device SVM, where communication with peers (migration and direct execution out of peer memory) uses some form of fast interconnect. In this series we’re using pcie p2p.
In a multi-device environment, the struct pages for device-private memory (the dev_pagemap) may take up a significant amount of system memory. We therefore want to provide a means of revoking / removing the dev_pagemaps not in use. In particular when a device is offlined, we want to block migrating *to* the device memory and migrate data already existing in the device’s memory to system. The dev_pagemap then becomes unused and can be removed.
Removing and setting up a large dev_pagemap is also quite time-consuming, so removal of unused dev_pagemaps only happens on system memory pressure using a shrinker.”
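To make the lifecycle described in that quote concrete, here is a minimal user-space sketch in Python. It is not the Xe driver code and does not use any kernel API: the DevPagemap and PagemapCache classes, their fields, and the page counts are all hypothetical names invented for illustration. It only models the three steps from the cover letter: blocking migration to an offlined device, moving its data back to system memory, and deferring the costly pagemap removal until a shrinker-style callback runs under memory pressure.

```python
from dataclasses import dataclass, field


@dataclass
class DevPagemap:
    """Toy stand-in for one device's dev_pagemap (hypothetical model)."""
    device: str
    pages_in_use: int = 0            # data currently resident in device memory
    accepts_migration: bool = True   # False once the device is offlined

    @property
    def unused(self) -> bool:
        # Unused means: nothing resident and nothing allowed to migrate in.
        return self.pages_in_use == 0 and not self.accepts_migration


@dataclass
class PagemapCache:
    """Keeps unused pagemaps around until memory pressure forces removal."""
    live: dict = field(default_factory=dict)

    def add(self, pagemap: DevPagemap) -> None:
        self.live[pagemap.device] = pagemap

    def offline(self, device: str) -> int:
        """Block migration *to* the device, then evict its data to system memory.

        Returns the number of pages moved back to system memory.
        """
        pagemap = self.live[device]
        pagemap.accepts_migration = False
        migrated = pagemap.pages_in_use
        pagemap.pages_in_use = 0
        return migrated

    def shrink(self) -> list:
        """Shrinker-style callback: drop only pagemaps that are unused."""
        removed = [name for name, pm in self.live.items() if pm.unused]
        for name in removed:
            del self.live[name]
        return removed
```

In this model, calling offline("gpu1") leaves the gpu1 pagemap in place, and it is only dropped by a later shrink() call. That mirrors the stated design choice: because tearing down and recreating a large dev_pagemap is slow, removal is postponed until the memory is actually needed.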
The code is now out for review. Intel plans to wrap up most of the Project Battlematrix software features in Q4.
