Userspace Memory Failure Recovery Policy via memfd
- Author
Jiaqi Yan <jiaqiyan@google.com>
Motivation
When a userspace process is able to recover from memory failures (MF) caused by uncorrected memory errors (UE) in the DIMM, especially when it is able to avoid consuming known UEs, keeping the memory page mapped and accessible is beneficial to the owning process for a couple of reasons:
The memory pages affected by a UE can have a large minimum granularity, for example a 1G hugepage, while the actually corrupted portion of the page is only several cachelines. Losing the entire hugepage of data is unacceptable to the application.
In addition to keeping the data accessible, the application still wants to access it with a large page size for the fastest virtual-to-physical translations.
Memory failure recovery for 1G HugeTLB is a good example. With memfd, a userspace process can control whether the kernel hard offlines the hugepages that back the in-RAM file created by memfd.
User API
int memfd_create(const char *name, unsigned int flags)
MFD_MF_KEEP_UE_MAPPED
When the MFD_MF_KEEP_UE_MAPPED bit is set in flags, MF recovery in the kernel does not hard offline memory due to UE until the returned memfd is released. In other words, the HWPoison-ed memory remains accessible via the returned memfd or the memory mapping created with the returned memfd. Note that the affected memory will be immediately isolated and prevented from future use once the memfd is closed. By default MFD_MF_KEEP_UE_MAPPED is not set, and the kernel hard offlines memory having UEs.
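As a rough illustration of requesting this policy, the following C sketch creates a HugeTLB-backed memfd with the flag set. It sets MFD_HUGETLB as well, because the flag currently only works for HugeTLB. The numeric value given to MFD_MF_KEEP_UE_MAPPED here is an assumption used only when the installed headers do not define it; the authoritative value comes from the kernel's uapi headers. The helper names keep_ue_flags_valid and create_keep_ue_memfd are hypothetical and not part of any API.

```c
#define _GNU_SOURCE
#include <sys/mman.h>   /* memfd_create(), MFD_HUGETLB (glibc >= 2.27) */

#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif

/* ASSUMPTION: placeholder value for headers that predate this flag.
 * Take the real definition from the kernel's uapi headers. */
#ifndef MFD_MF_KEEP_UE_MAPPED
#define MFD_MF_KEEP_UE_MAPPED 0x0020U
#endif

/* Hypothetical helper: returns 1 if the flag combination satisfies the
 * documented rule that MFD_MF_KEEP_UE_MAPPED requires MFD_HUGETLB,
 * and 0 otherwise. */
int keep_ue_flags_valid(unsigned int flags)
{
    if ((flags & MFD_MF_KEEP_UE_MAPPED) && !(flags & MFD_HUGETLB))
        return 0;
    return 1;
}

/* Hypothetical helper: create a HugeTLB memfd whose pages stay mapped
 * after a UE. Returns the new file descriptor, or -1 with errno set
 * (for example EINVAL on a kernel that does not know the flag). */
int create_keep_ue_memfd(const char *name, unsigned int extra_flags)
{
    unsigned int flags = MFD_HUGETLB | MFD_MF_KEEP_UE_MAPPED | extra_flags;

    return memfd_create(name, flags);
}
```

A caller would typically check the returned descriptor, size the file with ftruncate(), and mmap() it; because the policy lasts only until the memfd is released, the descriptor and its mappings should be kept for as long as the data is needed.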
Notes about the behavior and limitations
Even if the page affected by UE is kept, a portion of the (huge)page is already lost due to hardware corruption, and the size of that portion is the smallest page size that the kernel uses to manage memory on the architecture, i.e. PAGESIZE. Accessing a virtual address within any of these lost portions results in a SIGBUS; accessing a virtual address outside them is fine until it is corrupted by a new memory error.
MFD_MF_KEEP_UE_MAPPED currently only works for HugeTLB, so MFD_HUGETLB must also be set when setting MFD_MF_KEEP_UE_MAPPED. Otherwise memfd_create returns EINVAL.
UEK8 currently loses the entire UE-impacted hugepage once the memfd is closed.
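The first note above implies some simple arithmetic: a UE at a given byte offset within a kept hugepage costs the application one PAGESIZE-aligned chunk, and the rest of the hugepage stays usable. The C sketch below works that out. All function names are hypothetical, and it assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: start offset (within the hugepage) of the
 * PAGESIZE-sized chunk that contains the UE at ue_offset. */
size_t lost_chunk_start(size_t ue_offset, size_t page_size)
{
    /* Assumes page_size is a non-zero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return ue_offset & ~(page_size - 1);
}

/* Hypothetical helper: returns 1 if an access at offset would fall in
 * the chunk lost to the UE at ue_offset (and so raise SIGBUS), else 0. */
int offset_is_lost(size_t offset, size_t ue_offset, size_t page_size)
{
    size_t start = lost_chunk_start(ue_offset, page_size);

    return offset >= start && offset - start < page_size;
}

/* Hypothetical helper: bytes of the hugepage that remain accessible
 * after lost_chunks distinct PAGESIZE chunks have been lost. */
size_t usable_bytes(size_t hugepage_size, size_t page_size,
                    size_t lost_chunks)
{
    return hugepage_size - lost_chunks * page_size;
}
```

For example, with a 4096-byte PAGESIZE, a UE at offset 0x12345 of a 1G hugepage makes offsets 0x12000 through 0x12FFF inaccessible, while the remaining 1G minus 4096 bytes can still be read. In a real program the page size would come from sysconf(_SC_PAGESIZE).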