Hi all,
I guess this post is 50% question and 50% sharing of information.
I was just casually writing down some failure scenarios for VSAN, and the next thing I knew, it became this monster table.
What I'm still unclear about is the usefulness of Datastore Heartbeating (DH for short) in a VSAN environment (through FC, iSCSI, or NFS datastores in addition to the VSAN datastore).
Sure, I get that it is not required and is not usually implemented in a VSAN cluster, but in all the outcomes in which DH is present, it seems to do more harm than good (outcomes A1, B1, C1, and D1 below).
If anyone has more information on the benefits of DH in a VSAN cluster, please let us know.
I'm specifically referring to this line from the VMware® vSAN™ Design and Sizing Guide (last checked in Mar 2018):
"Heartbeat datastores are not necessary for a vSAN cluster, but like in a non-vSAN cluster, if available, they can provide additional benefits. VMware recommends provisioning Heartbeat datastores when the benefits they provide are sufficient to warrant any additional provisioning costs."
... which is pretty vague. I hope they clarify these "additional benefits" in a future update.
Before you read any further, please be aware of the following assumption:
Let's assume that we have a six-host VSAN cluster. (Again, for this topic it doesn't matter whether it is a local six-host VSAN cluster or a 3+3+1-witness Stretched Cluster; the site-failure scenarios for a Stretched Cluster are a whole other topic.)
Visually, we have: H-H-H-H-H-H
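To make the quorum logic behind the outcomes in the tables concrete, here is a minimal Python sketch. This is my own simplification, not VMware code: it assumes the typical FTT=1 object layout of two data replicas plus one witness component, with one vote per component, and that a partition "owns" an object only with a strict majority of votes.

```python
# Minimal sketch (not VMware code): why a partition can lose quorum for an
# FTT=1 object. Assumed layout: two data replicas plus one witness
# component, each holding one vote.

def has_quorum(components_in_partition, total_components=3):
    """True if this partition holds a strict majority of the object's
    votes (assuming one vote per component, the simple FTT=1 layout)."""
    return components_in_partition > total_components / 2

# FTT=1 object: replica on H1, replica on H4, witness on H5.
# Partition P1 = {H1, H2, H3}, partition P2 = {H4, H5, H6}.
p1_components = 1  # only the H1 replica
p2_components = 2  # the H4 replica plus the H5 witness

print(has_quorum(p1_components))  # False -> outcome A1 for a VM running in P1
print(has_quorum(p2_components))  # True  -> outcome A2 for a VM running in P2
```

This also shows why, with FTT=1, a single isolated host can never hold quorum on its own: at most it holds one of the three votes.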
Outcome | Scenario: Network partition, e.g. H-H-H---x---H-H-H, where the hosts are now split into two groups. Note: everything below is from the perspective of a specific VM and the host it is currently running on. For simplicity, let's say the VM only has a single VMDK.
---|---
A. | FTT=1 (or above), you can end up with either A1 or A2 below.
A1. | The network partition where the VM is running does not have quorum for the VM's VMDK data-components.
A2. | The network partition where the VM is running has quorum for the VM's VMDK data-components.
B. | FTT=0, you can end up with either B1 or B2 below.
B1. | The network partition where the VM is running does not have the VM's VMDK data-component.
B2. | The network partition where the VM is running has the VM's VMDK data-component.
Outcome | Scenario: Host isolation, e.g. H---x---H-H-H-H-H, where a single host is network-isolated from the other hosts.
---|---
C. | FTT=1 (or above), you can only end up with C1 below.
C1. | An isolated host would not have component quorum for any VMDK (reminder: FTT is 1 or above, so each VMDK would have more than one component).
D. | FTT=0, you can end up with either D1 or D2 below.
D1. | A VM's data-components are not located on the isolated host, but the VM is running on that host.
D2. | A VM's data-components are located on the isolated host, and the VM is running on the same host.
Outcome | Scenario: The entire network is down, and all hosts are network-isolated from each other.
---|---
E. | FTT=1 (or above), you can only end up with E1 below.
E1. | An isolated host would not have component quorum for any VMDK (reminder: FTT is 1 or above, so each VMDK would have more than one component).
F. | FTT=0, you can end up with either F1 or F2 below.
F1. | Same as D1, except there will be no HA Master to restart any VMs on any host.
F2. | Same as D2.
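The tables above can be condensed into a small decision function, kind of a programmatic cheat sheet. This is a hedged sketch of my own table logic, not VMware code; the parameter names (`vm_side_has_quorum`, `vm_side_has_component`) are made up here and describe the VM's side of the split, as in the tables.

```python
# Hedged sketch (my own cheat-sheet logic, not VMware code): map the
# tables above to an outcome label, from the perspective of one VM with
# a single VMDK.

def classify(scenario, ftt, vm_side_has_quorum=None, vm_side_has_component=None):
    """scenario is 'partition', 'isolation', or 'all_down'."""
    if scenario == "partition":
        if ftt >= 1:
            return "A2" if vm_side_has_quorum else "A1"
        return "B2" if vm_side_has_component else "B1"
    if scenario == "isolation":
        if ftt >= 1:
            return "C1"  # an isolated host can never hold component quorum
        return "D2" if vm_side_has_component else "D1"
    if scenario == "all_down":
        if ftt >= 1:
            return "E1"  # same as C1, but on every host at once
        return "F2" if vm_side_has_component else "F1"
    raise ValueError(scenario)

print(classify("partition", ftt=1, vm_side_has_quorum=False))    # A1
print(classify("isolation", ftt=0, vm_side_has_component=True))  # D2
```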
First of all, "good" and "bad" are a matter of perception: what may sound like a bad situation to you may not be bad for me. Let's look at the scenarios:
You need to ask yourself first: how is the network connected? And what are the chances that a user can still connect to the VMs running in the partition whose components have become inaccessible? Why is this important? Well, without a heartbeat datastore the VMs may be restarted in the other location, and this could easily lead to a situation where a client is connected to a server and writing data, but that server may never be able to write the data to disk. A very undesirable situation. I don't know if you tested these scenarios, but I ran through many of them in the past, and Windows could easily sit 5-10 minutes without being able to write to disk before blue-screening. And depending on how it fails, the network will also experience some very strange issues with duplicate MAC addresses and duplicate IPs. It will not be pretty; to prevent this, the heartbeat datastore is very valuable!
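The role the heartbeat datastore plays in that restart decision can be sketched as follows. This is a deliberately simplified sketch, not the actual vSphere HA (FDM) implementation: it only captures the idea that a datastore heartbeat lets the master distinguish an isolated-but-alive host from a dead one.

```python
# Hedged sketch (simplified, not the actual vSphere HA implementation):
# how a heartbeat datastore changes the HA master's view of a host whose
# network heartbeats have stopped.

def master_decision(network_heartbeat_ok, heartbeat_ds_configured,
                    heartbeat_ds_updated):
    if network_heartbeat_ok:
        return "host alive, do nothing"
    if heartbeat_ds_configured and heartbeat_ds_updated:
        # Host is isolated/partitioned but still running: restarting its
        # VMs elsewhere risks duplicate MAC addresses and IPs.
        return "host isolated/partitioned, do not restart its VMs"
    return "host presumed dead, restart its VMs elsewhere"

# Without a heartbeat datastore the master cannot tell the cases apart:
print(master_decision(False, False, False))  # restart, possible duplicates
print(master_decision(False, True, True))    # no restart, no duplicate IPs
```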
A1 >> Correct, but see above where this could be a problem
A2 >> Not a problem indeed, correct
B1 >> Correct, but "worse" depends on what your requirements are. I would prefer to avoid duplicate IPs and mac addresses personally
B2 >> Correct
C1 >> Again, I am not sure the situation is worse with a heartbeat datastore. See above
E1 >> This should not trigger the isolation response, as there's no healthy host, but I have not tested this with vSAN, to be honest.
So, for my understanding: what is the purpose of this exercise? What are you trying to design for, or what are you trying to prevent from happening?
I wrote an article on this topic a while back; multiple, actually, if you do a search on my blog.
vSphere HA heartbeat datastores, the isolation address and vSAN - Yellow Bricks
Anyway, the heartbeat datastore can be used during an isolation or a partition to inform the other side what has happened. Datastore heartbeating is supported in a stretched cluster as well; you just need to make sure you have a "shared datastore" local to the location, which is also accessible remotely.
Also good to know: in a stretched cluster, when there's a partition, vSAN can kill the VMs in the particular segment of the partition where ALL components have become inaccessible. It can do the same in an isolation situation. You do not need the Isolation Response configured for that.
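The termination condition described above can be sketched quite simply. This is a minimal sketch, not VMware code, assuming the trigger is just that none of the VM's components are reachable from the segment the VM is running in (the "ghost VM" behavior tied to the VSAN.AutoTerminateGhostVm setting mentioned elsewhere in this thread).

```python
# Hedged sketch (not VMware code): when vSAN would terminate a "ghost"
# VM in a stretched-cluster partition or isolation, assuming the
# trigger is that ALL of the VM's components are inaccessible locally.

def should_terminate_ghost_vm(accessible_components, total_components):
    """Kill the VM only when its segment can reach none of its
    components; HA can then restart it where the data is accessible."""
    return total_components > 0 and accessible_components == 0

print(should_terminate_ghost_vm(0, 3))  # True: all components inaccessible
print(should_terminate_ghost_vm(2, 3))  # False: quorum side keeps running
```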
Hi Duncan,
Thanks for your reply.
I do follow your articles and read your HA Deep Dive. Where else can one get such insightful information on these topics?
This is just me trying to compile everything in a single place, primarily for my own referencing convenience.
I do understand how the Datastore Heartbeat could let the other partition find out more, but if you would be so kind as to read through events A1, B1, C1, and D1 in my post, you could see why I think having Datastore Heartbeat in a VSAN cluster would make things worse. Note: your point about VSAN killing VMs (VSAN.AutoTerminateGhostVm=1) did get a mention in A1.
In fact, you even said so in your own article you linked:
However, the VMs which are running on the isolated host are more or less useless as they cannot write to disk anymore.
... which is the exact point I'm trying to make. Having Datastore Heartbeat in this situation makes the outcome worse: why wouldn't you want the HA Master to restart the VM elsewhere when the original VM has already lost access to its VMDK? For a Windows guest OS, it would likely have BSOD'd within a minute, rendering the original VM useless anyway. (To be clear, imagine a situation where the VM's production network on the ESXi host is also isolated/partitioned, meaning there will be no IP conflict even if the HA Master restarts the VM elsewhere.)
The lack of further elaboration on this point in the VMware vSAN Design and Sizing Guide doesn't help either. It just sort of glosses over Datastore Heartbeat and VSAN in a short, uninformative paragraph.
On a side note, I was hoping you would also shed some light on event E1, specifically whether all hosts would trigger the Isolation Response or none would.
Thank you for your time!
I have just been trying to consolidate every VSAN HA outcome into one small block of condensed information. Kind of like a cheat sheet, if you will.
This could potentially save me and my clients a lot of time during POC exercises, and/or during real VSAN HA scenarios that require me and my team to investigate and write post-mortem reports. And now I hope it can help the others who are reading this thread too.
I do understand and agree with you that the "bad" situations I described, which can be caused by the Datastore Heartbeat, could just as easily demonstrate how valuable it is.
But then again, as you also said, it does depend on how it fails.
That is why I added the line assuming a scenario where the isolated host's VM production network is also isolated along with the VSAN network, which effectively rules out the possibility of an IP/MAC conflict.
Thanks for clarifying E1, I'll update my notes. I haven't gotten around to testing this in VSAN either, so I had my suspicions about whether it would turn out the same as with an FC shared LUN.
Also, thanks for telling us that a Heartbeat Datastore can be used in a Stretched Cluster.
The reason I thought it's not supported in a Stretched Cluster is that the VSAN Admin Guide says:
Configure HA settings for the stretched cluster.
Thank you so much for your time.
I have everything I need on Datastore Heartbeat and VSAN now.
No problem, and it is always good to see people who aim to get to the bottom of things!
Hi Duncan,
For event E1 in my opening post, here is what actually happens:
I'll update the opening post again.