VMware Cloud Community
doro516
Contributor

How to inject memory errors

Normally I test RAS features on CentOS 9. There I enable a module called einj (exposed under the path /sys/kernel/debug/apei/einj).

Using the einj module, I can inject memory errors into the system.

Now I am using VMware ESXi and have created a VM running CentOS 9, but in that VM I am not able to enable the einj module. Is that a restriction?

 

thanks!

 

3 Replies
bluefirestorm
Champion

There are two problems:

(1) According to https://docs.kernel.org/firmware-guide/acpi/apei/einj.html
the virtual firmware needs to support EINJ. There are no options within the virtual EFI related to WHEA or error detection/correction. Looking at the output of sudo dmesg | grep ACPI (on an openSUSE Tumbleweed VM), there is no entry for EINJ. Entries exist for RSDP, XSDT, SRAT, etc., but not for EINJ.

(2) Would memory error detection not require ECC RAM? If so, the virtual RAM of a VMware VM does not have ECC, even if the host RAM does.

Given these two issues, I doubt you can do EINJ testing using VMs.
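The dmesg check in (1) can also be done by looking at the ACPI tables the kernel exposes in sysfs. The following is a minimal sketch, assuming a Linux guest that lists its ACPI tables under /sys/firmware/acpi/tables; the function names acpi_tables and has_einj are made up for illustration and are not part of any tool mentioned in this thread.

```python
import os

# Assumption: the Linux kernel exposes the firmware's ACPI tables here.
ACPI_TABLES_DIR = "/sys/firmware/acpi/tables"


def acpi_tables(tables_dir=ACPI_TABLES_DIR):
    """Return the sorted names of ACPI table files, or [] if the directory is unreadable."""
    try:
        entries = os.listdir(tables_dir)
    except OSError:
        return []
    # Skip subdirectories (e.g. "data", "dynamic"); keep only table files.
    return sorted(
        name for name in entries
        if os.path.isfile(os.path.join(tables_dir, name))
    )


def has_einj(tables_dir=ACPI_TABLES_DIR):
    """True if the firmware published an EINJ table (hypothetical helper)."""
    return "EINJ" in acpi_tables(tables_dir)


if __name__ == "__main__":
    print("ACPI tables:", " ".join(acpi_tables()) or "(none found)")
    print("EINJ present:", has_einj())
```

If EINJ is missing from the list inside the VM, that matches the dmesg observation above: the virtual firmware does not advertise error injection, so the einj module has nothing to bind to.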

 

doro516
Contributor

Thanks for the answer.

I have another question.

Is there a way to translate a physical address inside the VM to a physical address on the VMware ESXi host?

thanks!

 

bluefirestorm
Champion

If your question is about translating the virtual address of a process in a VM to a physical address in the host RAM, I don't know the answer.

My best guess is that you likely can, as the virtual RAM is also managed by the CPU through Intel EPT; it is no longer managed using VMware software. You could probably use a debugger and translate the virtual address to a VM physical address by looking up the PTEs/PFNs. Whether the translated address is the same as the host physical address, I don't know. There is likely a problem if the host has multiple CPUs with NUMA (i.e. each CPU is a node with its own RAM) while the VM only sees a single virtual socket.
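For the first half of that translation (process virtual address to VM physical address), a debugger is not strictly needed on a Linux guest. Below is a minimal sketch that reads /proc/<pid>/pagemap instead, assuming the documented entry layout (one 64-bit entry per page, bits 0-54 holding the PFN, bit 63 set when the page is present) and root privileges, since the kernel reports a zero PFN to unprivileged readers. The function names are hypothetical. The result is a guest physical address only; it says nothing about where ESXi placed that page in host RAM.

```python
import os
import struct

PAGEMAP_ENTRY_BYTES = 8        # one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1       # bits 0-54: page frame number
PAGE_PRESENT = 1 << 63         # bit 63: page is present in RAM


def decode_pagemap_entry(entry, vaddr, page_size):
    """Turn a raw pagemap entry into a guest physical address, or None."""
    if not entry & PAGE_PRESENT:
        return None            # page not resident (never touched or swapped out)
    pfn = entry & PFN_MASK
    if pfn == 0:
        return None            # PFN hidden: reader lacks the needed privileges
    return pfn * page_size + (vaddr % page_size)


def virt_to_guest_phys(vaddr, pid="self"):
    """Look up vaddr in /proc/<pid>/pagemap (Linux guest only, run as root)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * PAGEMAP_ENTRY_BYTES
    with open(f"/proc/{pid}/pagemap", "rb") as pagemap:
        pagemap.seek(offset)
        raw = pagemap.read(PAGEMAP_ENTRY_BYTES)
    if len(raw) != PAGEMAP_ENTRY_BYTES:
        return None            # address lies outside the mapped range
    (entry,) = struct.unpack("=Q", raw)   # native byte order
    return decode_pagemap_entry(entry, vaddr, page_size)
```

Calling virt_to_guest_phys(addr) as root inside the VM gives the address the guest kernel believes is physical. Mapping that on to a host physical address would need information from the hypervisor side, which this sketch does not attempt.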
