Intel engineers today posted Linux kernel patches introducing a brand new Error Detection And Correction (EDAC) driver for the next-generation memory controller design debuting with Xeon Diamond Rapids.
This new driver is “imh_edac” and is being developed as a separate driver rather than tacking onto the existing Intel EDAC driver code, given several key differences with Diamond Rapids. Namely, the memory controllers are exposed through MMIO-based memory spaces rather than as PCI devices to the operating system, and a separate driver avoids the need to re-test/verify existing Xeon hardware support when making these changes. The patch series explained:
“Add a new EDAC driver for Intel Diamond Rapids CPUs. The reasons for a separate driver instead of building on top of previous EDAC driver are as follows:
1) The memory controllers of Intel Diamond Rapids server CPUs are not presented as PCI devices to the OS, unlike previous generations. The enumeration and all memory controller registers have been transitioned to MMIO-based memory spaces.
2) Modifications to previous EDAC driver for Diamond Rapids CPUs would require extensive validation checks against multiple platforms, including Ice Lake, Sapphire Rapids, Emerald Rapids, Granite Rapids, Sierra Forest, and Grand Ridge.
3) Future Intel CPUs will likely only need patches on top of this new EDAC driver. Validation can be limited to Diamond Rapids servers and future Intel CPU generations.”
IMH in the context of the new “imh_edac” and “imh_base” driver code stands for Integrated Memory and I/O Hub. The memory controllers within the Intel IMHs will be exposed as memory stacks to the processor.
The new patches also bump the maximum number of row bits for DRAM chips from 18 to 19 for Diamond Rapids. There is also two-level memory configuration detection for enabling ADXL 2-level memory error decoding.
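To put the row-bit bump in concrete terms, here is a minimal arithmetic sketch showing how many DRAM rows 18 versus 19 row address bits can address. It is not taken from the patches: the helper names `max_rows` and `row_fits` are hypothetical and exist only for illustration.

```python
# Illustrative arithmetic only; not code from the imh_edac patches.
# The function names below are hypothetical.

OLD_MAX_ROW_BITS = 18  # previous maximum, per the article
NEW_MAX_ROW_BITS = 19  # new maximum for Diamond Rapids, per the article


def max_rows(row_bits: int) -> int:
    """Number of distinct rows addressable with `row_bits` row address bits."""
    return 1 << row_bits


def row_fits(row: int, row_bits: int) -> bool:
    """True if a row index can be represented with `row_bits` bits."""
    return 0 <= row < max_rows(row_bits)


if __name__ == "__main__":
    print(max_rows(OLD_MAX_ROW_BITS))  # 262144
    print(max_rows(NEW_MAX_ROW_BITS))  # 524288
    # A row index above 2**18 - 1 only fits once the limit is 19 bits.
    print(row_fits(300_000, OLD_MAX_ROW_BITS))  # False
    print(row_fits(300_000, NEW_MAX_ROW_BITS))  # True
```

Each extra row bit doubles the addressable row count, so a decoder capped at 18 bits could not represent the upper half of the rows on a 19-bit part.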
See this patch series for the Intel IMH EDAC driver now undergoing review for supporting the next-generation Xeon “Diamond Rapids” processors.
