VMware Cloud Community
acollado175
Contributor

FT secondary vm power on failure

I had a VM running with FT in a two-node HA cluster. To demonstrate failover, I simulated a power failure on the node running the FT VM (by holding the power button), and voilà, the VM failed over and didn't skip a beat.

Apparently, power failure is a bad way to demonstrate FT, since I destroyed my ESXi host and had to repair it. The repair went well: I rejoined my cluster and reconfigured HA, but FT will not configure properly. I have turned FT off and back on for my VM, but when I go to power on, the primary powers up fine while the secondary VM gives this error:

"Unable to access file <unspecified file> since it is locked."

My Fault Tolerance Status shows "Not protected. Need secondary VM"

I'm running in eval mode on an NFS share.

Has anyone seen this behavior? Why can't vCenter report which file is locked?


admin
Immortal

Check out this KB: 10051 Virtual machine does not power on because of missing or locked files.

Rick Blythe

Social Media Specialist

VMware Inc.
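
For readers who hit the same "locked" error on an NFS datastore: to my knowledge, ESX/ESXi hosts mark locks on NFS with hidden files whose names begin with `.lck` inside the VM's directory. The following is a minimal sketch, not an official tool, that lists such entries. It assumes the NFS export is also mounted on a machine with Python available; the function name `find_nfs_lock_files` and the example path are hypothetical.

```python
import os


def find_nfs_lock_files(vm_dir):
    """Return sorted paths of entries under vm_dir whose names start with '.lck'.

    Assumption: lock markers on NFS datastores use the '.lck' name prefix.
    This only lists candidates; it does not remove or modify anything.
    """
    hits = []
    for root, dirs, files in os.walk(vm_dir):
        # Check both directories and files, since either may carry the prefix.
        for name in dirs + files:
            if name.startswith(".lck"):
                hits.append(os.path.join(root, name))
    return sorted(hits)
```

Calling `find_nfs_lock_files("/mnt/nfs_datastore/myvm")` (a hypothetical mount point) returns the candidate lock files so you can see which ones exist before deciding what to do. Only consider removing a lock file once you are sure no host still has the VM running.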

acollado175
Contributor

I went through that doc, and since I'm running ESXi I didn't have vmware-cmd, so I skipped to the part where I reboot my hosts. Both hosts (ESXi 4.0.0, 171294) fail to boot with "PANIC: Error while reading file: -5, aam.vgz." This is the same problem I ran into after killing power to one of the hosts, but why is it happening to the other one, which didn't go down hard? If I repair using the ESXi install media, I can boot; but if I take the hosts down again, I get the same PANIC error.

I am assuming aam is the Automated Availability Manager part of HA, so I decided that once my hosts were back online, I'd remove them from my HA cluster (thereby unconfiguring HA) and then reboot them to see if that would remove the offending module. It didn't.

Is aam.vgz part of the HA agent installation?

Am I looking at rebuilding my hosts completely?

admin
Immortal
Accepted solution

Removing the host from the cluster doesn't remove aam.vgz (the HA agent). However, removing the host from VC altogether should uninstall aam as well as vpxa. Try that and then re-add the host back to VC.

acollado175
Contributor

Success! My hosts reboot without PANIC attacks and I can enable FT on my VMs.
