Hello All,
I have 3 hosts with VSAN 6.2.
When host1 is network partitioned, the VM does not restart on the other hosts. Wasn't the VM supposed to be killed, as it has no quorum, and restarted on host2 or host3?
Hello,
Are ALL of the test VM Objects (Disks + Namespace + any present snapshots) using a Storage Policy with FTT=1?
If so, are they compliant with this policy?
How are you isolating just one host?
Does the VM you are testing have any individual host isolation response overrides configured?
Do you have HA configured and enabled on this cluster?
Did you reconfigure HA after making any changes to the Networking in the cluster?
More useful information regarding HA in vSAN from depping:
http://www.yellow-bricks.com/2013/09/19/isolation-partition-scenario-with-vsan-cluster-handled/
(Relatively old article but I can't see anything obvious that has changed since)
Bob
-o- If you found this comment useful please click the 'Helpful' button and/or select as 'Answer' if you consider it so, please ask follow-up questions if you have any -o-
Hello,
thanks for replying.
Please see my response below.
Are ALL of the test VM Objects (Disks + Namespace + any present snapshots) using a Storage Policy with FTT=1? Yes
If so, are they compliant with this policy? Yes
How are you isolating just one host? I am creating a network partition rather than a host isolation, by disconnecting the NICs used for the vSAN connection.
Does the VM you are testing have any individual host isolation response overrides configured? The VM uses the cluster default, i.e. 'Leave powered on'. I get the same result with 'Shut down' as the isolation response, since the scenario is a network partition, not an isolation.
Do you have HA configured and enabled on this cluster? Yes
Did you reconfigure HA after making any changes to the Networking in the cluster? Yes
In the scenario with 4 hosts, as per http://www.yellow-bricks.com/2013/09/19/isolation-partition-scenario-with-vsan-cluster-handled, in the third scenario the VM would restart on host 3 or 4. I was expecting the same type of behaviour with 3 hosts as well. But the VM does not power on on host 2 or 3; it just goes unresponsive and keeps pinging on the network.
Okay, thanks for clarifying all of that.
I think the point here might be the difference between the definitions of 'isolation' and 'partition'.
depping clarifies this better here:
https://communities.vmware.com/thread/514497
Thus I don't think in this scenario it will power off the VM.
Have you tested if the VM restarts after killing it?
By the way, a better test for pulling vSAN traffic than removing the NICs is to simply untick the 'vSAN Traffic' box on the configured vmkernel interface.
Bob
Hi,
I understand the difference between isolation and partition.
I just wanted to know whether this is expected behaviour.
As per Yellow Bricks, it looked like the VM was supposed to be restarted on host 2 or 3, as that partition had more of the VM's components.
Hello,
Yes, looks like it is expected behaviour:
"Note that the VM in Partition-1 will not be powered off, even if you have configured the isolation response to do so"
Isolation / Partition scenario with VSAN cluster, how is this handled?
Bob
Hello,
But the VM was also supposed to restart on host 2 or 3, as they had quorum. Instead, the VM did not start anywhere.
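To illustrate the quorum argument, here is a minimal Python sketch. It assumes a typical FTT=1 layout on 3 hosts (one replica on each of two hosts and a witness on the third) and treats every component as one vote. The component placement and the function name are assumptions for illustration, not something taken from my environment, and real vSAN also needs a full data replica in the partition, which this ignores.

```python
def partition_has_quorum(component_hosts, partition):
    """Return True if the partition holds more than half of the components.

    component_hosts: dict mapping component name -> host that stores it
                     (simplified to one vote per component).
    partition:       set of host names that can still talk to each other.
    """
    total = len(component_hosts)
    reachable = sum(1 for host in component_hosts.values() if host in partition)
    return reachable * 2 > total


# Assumed FTT=1 placement on a 3-host cluster (hypothetical).
LAYOUT = {"replica-a": "host1", "replica-b": "host2", "witness": "host3"}

print(partition_has_quorum(LAYOUT, {"host1"}))           # host1 alone
print(partition_has_quorum(LAYOUT, {"host2", "host3"}))  # the other two hosts
```

Under these assumptions host1 alone holds 1 of 3 components and loses access to the object, while host2 and host3 together hold 2 of 3 and keep it.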
Hi there,
What are your HA isolation addresses set to?
Hello,
It was the default management gateway, so the host could still ping it, which made this a network partition. I can work with host isolation, but the network partition behaviour confuses me.
You need to configure the HA response for when a host is isolated. The default is 'Leave powered on'.
VMware Virtual SAN & vSphere HA Recommendations - VMware vSphere Blog
this may help you.
When vSAN and HA are enabled in the same cluster, the inter-agent HA heartbeat traffic uses the storage network, not the management network. Therefore I believe your scenario would be an isolation condition rather than a partition condition, as your host would be unable to contact the isolation address, and your VM then behaved accordingly.
You can test this by changing the isolation response to 'Power off and restart' and performing the same test; your VM should be restarted on one of your other hosts.
vM
-----------------------
VCAP-DCD / VCAP-DCA / VCP-CLOUD / VCP-DT / VCP-NV / VCP6 / VCP5 / VCP4
-----------------------
vMustard.com
What does fdm.log say? Does it call it out as an isolation?
Hello,
Thanks all for replying to my doubt.
@martinriley: No separate isolation address has been set to produce a host isolation scenario. Although HA uses the vSAN network in a vSAN cluster, the isolation address is not changed by default. This is a network partition, as the default gateway of the management network is reachable from host1.
@Byounghee: The isolation response setting does not help here, as this is a network partition.
@depping: vCenter reports it as a network partition, and host 1 shows in group 1 while host 2 and host 3 show in group 2.
Below are lines from fdm.log. These lines seem to be about the other 2 hosts in the cluster, as they have 13 and 14 powered-on VMs.
2017-04-19T16:43:40.805Z verbose fdm[FFB13B70] [Originator@6876 sub=Cluster opID=SWI-3ab50c2a] [ClusterManagerImpl::ProcessSlavePowerOnListChanges] host host-13789 listVersion=7084356088372 isolated=false poweredOnVms=14
2017-04-19T16:43:40.805Z verbose fdm[FFB13B70] [Originator@6876 sub=Cluster opID=SWI-3ab50c2a] [ClusterManagerImpl::ProcessSlavePowerOnListChanges] host host-13785 listVersion=7082655673765 isolated=false poweredOnVms=13
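For anyone who wants to pull the `isolated` flag and VM count out of lines like these, here is a small Python sketch. The regular expression is written only against the two lines quoted above, so it is an assumption that other fdm.log lines of this type follow the same layout. The function and field names are my own.

```python
import re

# Matches the ProcessSlavePowerOnListChanges lines quoted above.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+\w+\s+fdm\[.*?\].*"
    r"host (?P<host>\S+) listVersion=(?P<list_version>\d+) "
    r"isolated=(?P<isolated>true|false) poweredOnVms=(?P<powered_on>\d+)"
)


def parse_power_on_line(line):
    """Return a dict with the fields of interest, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "isolated": match.group("isolated") == "true",
        "powered_on_vms": int(match.group("powered_on")),
    }


SAMPLE_LINES = [
    "2017-04-19T16:43:40.805Z verbose fdm[FFB13B70] [Originator@6876 sub=Cluster opID=SWI-3ab50c2a] [ClusterManagerImpl::ProcessSlavePowerOnListChanges] host host-13789 listVersion=7084356088372 isolated=false poweredOnVms=14",
    "2017-04-19T16:43:40.805Z verbose fdm[FFB13B70] [Originator@6876 sub=Cluster opID=SWI-3ab50c2a] [ClusterManagerImpl::ProcessSlavePowerOnListChanges] host host-13785 listVersion=7082655673765 isolated=false poweredOnVms=13",
]

for sample in SAMPLE_LINES:
    print(parse_power_on_line(sample))
```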
For vSAN it shows the partition; you have 3 partitions by the looks of it, in other words each host is isolated from a vSAN standpoint. HA, however, says there is no isolation (per the FDM log), hence the isolation response is not triggered, and this is probably because the gateway is still reachable.
Hello Depping,
From the screenshot, host1 is in group 1 and host 2 and host 3 are in group 2. So there are only two partitions, aren't there?
Sorry, correct: two partitions from a vSAN perspective, no isolation from an HA perspective, hence nothing happened.
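The distinction can be sketched as a toy model in Python. This is a simplification for illustration only, not how the FDM agent is implemented. The class, field names and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostView:
    # Can this host still exchange HA agent traffic with other hosts?
    # (In a vSAN cluster that traffic runs over the vSAN network.)
    sees_other_ha_agents: bool
    # Can this host still ping its isolation address
    # (by default the management network gateway)?
    can_ping_isolation_address: bool


def classify(view):
    """Simplified HA state of a host, from its own point of view."""
    if view.sees_other_ha_agents:
        return "connected"
    if view.can_ping_isolation_address:
        return "partitioned"
    return "isolated"


def isolation_response_triggered(view):
    """The isolation response only applies to a host that is isolated."""
    return classify(view) == "isolated"


# host1 in this thread: vSAN NICs disconnected, management gateway still reachable.
host1 = HostView(sees_other_ha_agents=False, can_ping_isolation_address=True)
print(classify(host1), isolation_response_triggered(host1))
```

In this model host1 ends up partitioned rather than isolated, so the isolation response never fires, whichever response is configured.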
Hello depping
Thanks for update.
HA uses the same network as vSAN in a vSAN cluster, so isolation or partition should look the same from both the HA and the vSAN perspective.
What I want to understand is this: if this is the expected behaviour for a partition in a 3-node cluster, why is the behaviour different in http://www.yellow-bricks.com/2013/09/19/isolation-partition-scenario-with-vsan-cluster-handled/ ? In that case too the VM should not restart anywhere, as the VM is partitioned there as well.
Regards
Ved
It is difficult to say why this is happening from the outside. If you aren't getting what is expected, please contact VMware support and let them analyze the environment and the situation.