VMware Cloud Community
AsokaRoy
Contributor

HA not working with hung host

Hi, I have 6 servers in my cluster running ESX 4.0. So far I have had two host crashes, and HA didn't work either time. I have a few VMs enabled with FT, and I think FT should work properly even if HA didn't, but both times the affected host didn't have any FT-enabled VMs.

What happened was that the host died slowly. That is, vCenter showed it as disconnected, but I could still SSH to the server. For some time I could see the VMs were working, and some time later they also showed as disconnected.

When I restarted the host from the ESX console, all the VMs started to appear again. The other time it crashed, though, I had to reinstall ESX. Luckily, when the ESX host started up, its VMs appeared without any problem. It didn't allow me to remove it from the inventory and re-add it.

Why didn't HA move the VMs to another host when the host crashed? I have many questions from the customer.

Any advice will be highly appreciated.

Regards

1 Solution

Accepted Solutions
JDLangdon
Expert

The heartbeat usually involves the ESX server pinging its gateway. If this hasn't been changed in your environment, you need to pull the management NICs.

________________________________

Jason D. Langdon

6 Replies
Troy_Clavell
Immortal

What happened was that the host died slowly. That is, vCenter showed it as disconnected, but I could still SSH to the server. For some time I could see the VMs were working, and some time later they also showed as disconnected.

If you could SSH to it, then there was no HA event. HA monitors the heartbeat of your ESX host(s). A host showing up in vCenter as disconnected is not an HA event; it may be a vpxa (vCenter management agent) issue, and that is not an HA issue either. If the host is online, meaning you can SSH to it or ping it, no HA event will be triggered.
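
The distinction above can be sketched as a small decision function. This is a toy model for illustration only, not the actual logic of VMware's HA agent; the names `HostStatus` and `ha_failover_expected` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostStatus:
    """Hypothetical snapshot of what the cluster can observe about one host."""
    heartbeat_received: bool  # HA heartbeat still seen from the host
    vcenter_connected: bool   # vpxa agent is talking to vCenter


def ha_failover_expected(status: HostStatus) -> bool:
    """Return True only when the HA heartbeat is lost.

    A host that shows as disconnected in vCenter but still sends heartbeats
    is treated as alive, so no restart of its VMs is attempted.
    """
    return not status.heartbeat_received


# Host is "disconnected" in vCenter but still reachable: no HA event.
hung_host = HostStatus(heartbeat_received=True, vcenter_connected=False)
# Host has dropped off the network entirely: HA event expected.
dead_host = HostStatus(heartbeat_received=False, vcenter_connected=False)

print(ha_failover_expected(hung_host))
print(ha_failover_expected(dead_host))
```

In this sketch the vCenter connection state is recorded but deliberately ignored by the decision, which mirrors the point that a vpxa problem alone does not trigger HA.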

AsokaRoy
Contributor

Hi, so if the host is having a problem but the heartbeat can still be received, there is no help from HA, and the only solution is to enable FT, isn't it? In my case I lost access to the VMs and eventually the host also died; in that situation there was no help from HA.

JDLangdon
Expert

If you have a host in a "crashed" state but you are able to SSH into the COS, you can disconnect the network cables from the COS to force an HA event. Keep in mind that this will result in a hard down of the VMs, but it should initiate HA to bring them back up on a different host.

________________________________

Jason D. Langdon

AsokaRoy
Contributor

Hi Jason D. Langdon, that's a good suggestion, thanks. I wonder which cable to disconnect or disable at the switch. I have many ports for the LAN-side connection, 2 ports to the SAN, and another 2 ports for management (vCenter). I think disconnecting the vCenter cable won't help. Will disconnecting the LAN side (which the VMs use to connect with the rest of the world) trigger an HA event, or will it be the SAN side?

Regards

JDLangdon
Expert

The heartbeat usually involves the ESX server pinging its gateway. If this hasn't been changed in your environment, you need to pull the management NICs.

________________________________

Jason D. Langdon
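
The gateway check described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real HA implementation: `host_is_isolated`, `make_ping`, and the `GATEWAY` address are hypothetical, and the ping is faked so the example runs anywhere.

```python
from typing import Callable


def host_is_isolated(ping: Callable[[str], bool], isolation_address: str) -> bool:
    """Treat the host as isolated when its isolation address stops answering.

    Per the post above, that address is usually the host's gateway.
    """
    return not ping(isolation_address)


def make_ping(management_link_up: bool) -> Callable[[str], bool]:
    """Build a fake ping that succeeds only while the management NICs are connected."""
    def ping(address: str) -> bool:
        return management_link_up
    return ping


GATEWAY = "192.0.2.1"  # documentation-range placeholder, not a real gateway

# Management NICs connected: gateway answers, host is not isolated.
print(host_is_isolated(make_ping(True), GATEWAY))
# Management NICs pulled: gateway unreachable, host counts as isolated.
print(host_is_isolated(make_ping(False), GATEWAY))
```

In this model, pulling the LAN or SAN cables changes nothing, because only the management link feeds the ping.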

Yan_one
Contributor

Hi

I think the best thing to do when a host in an HA cluster is hung is to restart the host.

I had this situation just a day ago.

When you just disconnect the host from the network, the VMFS file locks are still there, so no other host can bring the VMs up. To release the locks, restart the hung host.

When an HA host actually crashes, most of the time it works OK.

Thank you
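
The locking behaviour described in this reply can be illustrated with a toy model. It is only a sketch of the idea that a lock held by one host blocks power-on elsewhere; `VmFileLock` and the host names are hypothetical and do not reflect how VMFS implements its locks.

```python
from typing import Optional


class VmFileLock:
    """Toy model of a per-VM file lock on shared storage."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None

    def acquire(self, host: str) -> bool:
        """Grant the lock if it is free or already held by the same host."""
        if self.owner is None or self.owner == host:
            self.owner = host
            return True
        return False

    def release(self, host: str) -> None:
        """Drop the lock, but only if the caller is the current owner."""
        if self.owner == host:
            self.owner = None


lock = VmFileLock()
lock.acquire("esx01")               # the hung host is running the VM
blocked = lock.acquire("esx02")     # another host tries to power the VM on
lock.release("esx01")               # restarting the hung host frees the lock
restarted = lock.acquire("esx02")   # now the other host can take over

print(blocked, restarted)
```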
