VMware Cloud Community
mummonth
Enthusiast

HA restarts VMs with esx.problem.vm.kill.unexpected.forcefulpageretire event

Hello!

Four VMs were recently restarted by HA. In the VM events I see this:

The VM using the config file {1} contains the host physical page {2} which was scheduled for immediate retirement. To avoid system instability the VM is forcefully powered off.

I suppose ESXi tried to swap out a RAM page belonging to these VMs, failed due to a timeout, and then decided to restart the VMs on another host. Am I right?

How can we prevent these issues from happening again? Could this be HW related?


Accepted Solutions
mummonth
Enthusiast

It turned out to be a RAM module error, which showed up in iDRAC 5 hours after the incident. We ordered a replacement from Dell.
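
In case it helps anyone triaging the same event, here is a minimal sketch, assuming the rendered event messages have already been exported as plain text (the export step is not shown). The regex is built from the message template quoted in the question; the function name, sample config paths, and page values are made up for illustration. It groups the retired host pages by VM config file, so you can see at a glance which VMs were affected and compare that against the hardware logs.

```python
import re
from collections import defaultdict

# Regex derived from the event message template quoted in the question:
# {1} (config file) and {2} (host physical page) become named groups.
MESSAGE_RE = re.compile(
    r"The VM using the config file (?P<config>.+?) "
    r"contains the host physical page (?P<page>\S+) "
    r"which was scheduled for immediate retirement"
)


def group_retired_pages(messages):
    """Map each VM config file to the retired host pages reported for it."""
    pages_by_vm = defaultdict(list)
    for message in messages:
        match = MESSAGE_RE.search(message)
        if match:  # ignore unrelated event text
            pages_by_vm[match.group("config")].append(match.group("page"))
    return dict(pages_by_vm)


# Hypothetical sample input; the paths and page values are invented.
sample = [
    "The VM using the config file /vmfs/volumes/ds1/app01/app01.vmx "
    "contains the host physical page 0x1a2b3c which was scheduled for "
    "immediate retirement. To avoid system instability the VM is "
    "forcefully powered off.",
    "Some unrelated event text",
]
affected = group_retired_pages(sample)
```

If several VMs on the same host show up at around the same time, that would be consistent with a host memory fault like the one found here, so the hardware logs (iDRAC in this case) are worth checking even if they look clean at first.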

5mall5nail5
Enthusiast

Just curious: was this a result of Proactive HA?

abhimanyu7
VMware Employee

I also faced this issue in one of our environments, where a VM was restarted by HA on another host.

I checked and found that an application monitor was running in the guest OS. When an application crashed, it sent an NMI to the guest via VMware Tools to restart it and avoid corruption. A guest OS dump would be needed to dig deeper into this issue.

(Not a hypervisor/VMware issue.)
