VMware Cloud Community
rrr171
Contributor

Should/Can HA migrate a VM when ESX host loses all FC storage?

If an ESX host loses all FC storage (say, through human error: both FC connections pulled), will HA migrate a virtual machine residing on that host to another host? If HA does not have this capability by default, are there any methods available to make this happen? We are using NetApp storage.

8 Replies
AndreTheGiant
Immortal

VMware HA only performs network tests.

It does not handle storage problems.

To cover this case, you have to write some scripts and use a vCenter alarm to run them.
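
As a rough illustration of the decision such a script would have to make, here is a minimal Python sketch. It is only an assumption about how one might structure the logic: `HostState`, `live_paths`, and `plan_restarts` are hypothetical names, the path counts are supplied by hand, and nothing here calls a real vCenter or ESX API. A real script would have to collect the path state itself and perform the restart through whatever tooling you use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class HostState:
    """Hypothetical snapshot of one host, filled in by your own tooling."""
    name: str
    # Datastore name -> number of live storage paths this host currently has.
    live_paths: Dict[str, int]
    # VM name -> datastore the VM's files live on.
    vms: Dict[str, str] = field(default_factory=dict)


def find_target(hosts: List[HostState], source: HostState, datastore: str) -> Optional[HostState]:
    """Return the first other host that still has a live path to the datastore."""
    for host in hosts:
        if host is not source and host.live_paths.get(datastore, 0) > 0:
            return host
    return None


def plan_restarts(hosts: List[HostState]) -> List[Tuple[str, str, str]]:
    """List (vm, source_host, target_host) for VMs cut off from their datastore.

    A VM is only planned for restart if some other host can still reach
    its datastore; if the storage is gone everywhere, nothing is planned.
    """
    plan = []
    for source in hosts:
        for vm, datastore in source.vms.items():
            if source.live_paths.get(datastore, 0) > 0:
                continue  # this host can still reach the VM's files
            target = find_target(hosts, source, datastore)
            if target is not None:
                plan.append((vm, source.name, target.name))
    return plan


if __name__ == "__main__":
    hosts = [
        HostState("esx1", {"ds1": 0}, {"vm1": "ds1"}),  # both FC links pulled
        HostState("esx2", {"ds1": 2}),                  # still has two paths
    ]
    print(plan_restarts(hosts))
```

The sketch only produces a plan. It does nothing when every host has lost the datastore, since no other host could open the VM's files in that case.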

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
GregMeathead
Enthusiast

So in reality the VM should be migrated: if FC is pulled, the VM will lose the host. It will have no network, CPU, or memory; it would be nothing but a file on the storage. ESX will have no access to it, and HA will try to restart it.

Isn't this correct?

RParker
Immortal

"ESX will have no access to it, and HA will try to restart it."

If the fiber is gone, there will be no files to access in the first place. HA is for host failures (hence HIGH Availability). If your storage is toast, no OTHER host can reach those files either, so HA will do no good.

GregMeathead
Enthusiast

But what happens if only one host loses FC?

That's the original question: Should/Can HA migrate a VM when ESX host loses all FC storage?

AndreTheGiant
Immortal

Should?

IMHO yes, if other hosts have a working connection with the storage.

Can?

No, because the HA tests are based on the network heartbeat and do not consider the storage "health".

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
joshp
Enthusiast

Though not tested, it seems that if you are using "VM Monitoring" (VMware tools heartbeats), and you lose all FC storage on a particular host, the VM heartbeats will fail (because the VM will be down), and HA will restart the VM on another host in the HA cluster.
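
To make the distinction concrete, here is a deliberately simplified Python model of the two detection mechanisms discussed in this thread: the host-level network heartbeat and the per-VM VMware Tools heartbeat. This is a sketch of the reasoning only, not VMware's actual algorithm. `Action` and `ha_decision` are hypothetical names, and storage state is intentionally not an input, because HA as described here never looks at it directly.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    RESTART_HOST_VMS = "restart all VMs from the failed host elsewhere"
    RESTART_VM = "restart this VM"


def ha_decision(host_network_heartbeat_ok: bool,
                vm_monitoring_enabled: bool,
                tools_heartbeat_ok: bool) -> Action:
    """Simplified model: HA reacts to heartbeats, never to storage paths."""
    if not host_network_heartbeat_ok:
        # Host-level failure detection relies on the network heartbeat.
        return Action.RESTART_HOST_VMS
    if vm_monitoring_enabled and not tools_heartbeat_ok:
        # VM Monitoring only sees that the guest stopped sending heartbeats.
        return Action.RESTART_VM
    return Action.NONE


if __name__ == "__main__":
    # Host lost FC but is still on the network; the guest has hung.
    print(ha_decision(True, False, False))  # VM Monitoring off
    print(ha_decision(True, True, False))   # VM Monitoring on
```

In this model, a host that loses storage but keeps its network connection triggers nothing unless VM Monitoring is enabled and the guest actually stops sending heartbeats.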

VCP 3, 4

www.vstable.com

chrisaug
Contributor

I tend to agree with JoshP. If storage for one host is lost, the VM will cease to work. As long as VM Monitoring is working, it should recognize the down VM and restart it on another host that has storage.

joshp
Enthusiast

I had a chance to test this scenario today, and what I found is interesting. After I disconnected all fiber to one ESXi cluster host, the host and the running VMs appeared to continue to run. The only events logged were that all paths to the datastores were gone. After about 3 minutes, the VM logged an event:

"Migration from ****.***.gov to ****.****.gov and resource pool 4-NonProduction: No guest OS heartbeats are being received. Either the guest OS is not responding or VMware Tools is not configured correctly."

So it appears that VM Monitoring detected the VM problem (well, the host storage problem) and started an attempt to migrate the VM. Unfortunately, the migration task got stuck at 10% for well over 25 minutes, and the VM was down during this time. I also noticed that none of the datastores could be browsed, as if there were some type of lock on ALL the datastores. As soon as I plugged the fiber back into the host I was testing with, all the datastores immediately became available across the cluster, and the VM whose migration had been stuck at 10% finished migrating and began servicing requests.

So the short answer is that it appears VM Monitoring will handle the case where all storage paths are down on a host. However, I am not sure what caused the datastores across the cluster to appear locked and the migration to get stuck at 10%. I will be testing this again at a later date to see if I can obtain a conclusive answer.

VCP 3, 4

www.vstable.com
