Enthusiast

vSAN Stretched Cluster SFTT=0

Hello, all.

I'm trying to size a VxRail cluster and I need to understand the implications with regard to failures of a 2+2+1 set up. I assume that a 2+2+1 set up means a PFTT=1 (default - and I want that) and an SFTT=0 (??). Correct on the SFTT? If so, then with an SFTT of 0, how are different local failures handled (short of an entire site failure)? I can't seem to find any helpful material from VMware on this scenario, by which I mean a nice deep dive.

Also, VMware doesn't seem to go too deep on failure scenarios in vSAN in general. 

Can anyone help?

VMware Employee

@RSEngineer 
"I assume that a 2+2+1 set up means a PFTT=1 (default - and I want that) and an SFTT=0"
Correct, as a minimum of 3 data nodes per site would be required for placement of SFTT=1, SFTM=RAID1 Objects (and 4 nodes on each side if you wanted to use SFTT=1, SFTM=RAID5).
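As a rough illustration of the host counts above, here is a minimal Python sketch. The function names and method labels are made up for this example and are not a VMware API. The RAID-6 figure and the general RAID-1 formula are the commonly cited minimums for the original vSAN storage architecture, not something stated in this thread, so treat them as assumptions.

```python
# Sketch only: minimum data hosts one site needs for a given SFTT/SFTM.
# Assumption: commonly cited minimums for the original vSAN storage
# architecture. Names here are hypothetical, not a VMware API.

def min_hosts_per_site(sftt, method="RAID1"):
    """Return the minimum number of data hosts a single site needs."""
    if sftt == 0:
        return 1  # one unprotected copy inside the site
    if method == "RAID1":
        return 2 * sftt + 1  # sftt+1 replicas plus sftt witness components
    if method == "RAID5" and sftt == 1:
        return 4  # 3 data + 1 parity
    if method == "RAID6" and sftt == 2:
        return 6  # 4 data + 2 parity
    raise ValueError(f"unsupported combination: SFTT={sftt}, {method}")


def supported(hosts_per_site, sftt, method="RAID1"):
    """True if a site with this many data hosts can place the policy."""
    return hosts_per_site >= min_hosts_per_site(sftt, method)


if __name__ == "__main__":
    for hosts in (2, 3, 4):
        print(
            f"{hosts} hosts/site:",
            "SFTT=0", supported(hosts, 0),
            "| SFTT=1 RAID1", supported(hosts, 1, "RAID1"),
            "| SFTT=1 RAID5", supported(hosts, 1, "RAID5"),
        )
```

With 2 data hosts per site, only the SFTT=0 check passes, which matches the 2+2+1 case in the question.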

 

If a node failed on one site then it would attempt to repair as much of the data from that site as possible (assuming it won't all fit) on the remaining node on that site. If a Disk-Group failed then it would repair the data from it on either the other node on that site or the remaining Disk-Groups on the node with the failure (assuming it has multiple Disk-Groups). It would violate a PFTT=1 Storage Policy to repair any of this data onto the other site in the cluster, as both copies would then be in a single Fault Domain.
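The placement rule described above can be sketched as a simple filter. This is a toy model at host granularity only (it ignores Disk-Groups and free capacity), and the `Host` class, the `repair_targets` function, and the four-host layout are hypothetical names chosen for illustration.

```python
# Toy model: where may vSAN rebuild data that lived on a failed host?
# Assumption: host-level view only; capacity and Disk-Groups are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    site: str  # fault domain the host belongs to
    healthy: bool = True


def repair_targets(hosts, failed):
    """Healthy hosts in the same site (fault domain) as the failed host."""
    return [
        h for h in hosts
        if h.healthy and h.site == failed.site and h.name != failed.name
    ]


if __name__ == "__main__":
    hosts = [
        Host("N1", "A"),
        Host("N2", "A", healthy=False),  # the failed node
        Host("N3", "B"),
        Host("N4", "B"),
    ]
    failed = hosts[1]
    # Hosts on site B are excluded: a rebuild there would put both
    # copies of the data in one fault domain.
    print([h.name for h in repair_targets(hosts, failed)])  # prints ['N1']
```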

 

"Also, VMware doesn't seem to go too deep on failure scenarios in vSAN in general. "
Sorry, but I completely disagree. Not to be mean, but these can be found with very simple Google searches, e.g.:
https://core.vmware.com/resource/vsan-availability-technologies#section13
https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan-monitoring.doc/GUID-35A4B700-6...
https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan.doc/GUID-08911FD3-2462-4C1C-AE...

 

While technically personal blogs, @CHogan and @depping are VMware employees who have been working with vSAN since the start and frequently cover in very deep detail how vSAN reacts in failure scenarios depending on the configuration etc.:
http://www.yellow-bricks.com/
https://cormachogan.com/vsan/

Enthusiast

Thank you very much for the feedback. Let me ask a few things further. 

"If a node failed on one site then it would attempt to repair as much of the data from that site as possible (assuming it won't all fit) on the remaining node on that site, "

For an SFTT of 0, that seems quite "tolerant," no? Let me give a scenario to showcase what I mean. 

Site A has 2 nodes - N1 and N2. 

Site B has 2 Nodes - N3 and N4

A VM is running on N1 and its VMDK is on N2. N2 dies altogether. N1 survives. I would have thought that no attempt to repair or rebuild would take place at Site A because the SFTT=0. I would have thought that the mirrored copy of that VMDK on Site B would be immediately leveraged by vSAN and that the VM in Site A would remain in place (adhering to Site affinity rules) and access I/O across the inter-site link. 

But if repairing and rebuilding DO INDEED take place at Site A, as you describe, and it has a SFTT of 0, then why do we say there is no protection? Can you clarify what exactly - in practice - an SFTT of 0 really means? 

VMware Employee

There's a very extensive stretched cluster guide to be found here:

https://core.vmware.com/resource/vsan-stretched-cluster-guide

It is pretty straightforward:

PFTT = Primary Failures To Tolerate = Protection Across Locations

SFTT = Secondary Failures To Tolerate = Additional Protection Within Locations

With PFTT you ensure a copy of the data is available in both locations; with SFTT you can additionally protect that copy locally against failures. So you have a RAID-1 configuration across locations and, if desired, RAID-1, RAID-5, or RAID-6 within each location.

What is the benefit of SFTT? It adds two things:

1. If a local copy fails, data repair will happen locally

2. It adds an extra level of availability, as you can tolerate more failures before the VM becomes inaccessible

So even if PFTT=1 and SFTT=0 and a disk fails in a location, vSAN will still try to repair the impacted VM/Objects to meet the specified policy!
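For sizing purposes, the combination of RAID-1 across locations and an optional RAID level within each location translates into a raw-capacity multiplier. The sketch below is a back-of-the-envelope calculation, not a sizing tool: the overhead factors (2x per extra mirror, 4/3 for RAID-5, 3/2 for RAID-6) are the commonly cited values for the original vSAN storage architecture and are assumptions here, and the function names are hypothetical. It ignores slack space, witness components, and deduplication.

```python
# Back-of-the-envelope raw capacity for a stretched-cluster policy.
# Assumption: commonly cited overhead factors; names are hypothetical.
from fractions import Fraction

# Capacity multiplier inside ONE site for (method, SFTT).
SITE_OVERHEAD = {
    ("RAID1", 0): Fraction(1),     # single copy
    ("RAID1", 1): Fraction(2),     # two mirrors
    ("RAID1", 2): Fraction(3),     # three mirrors
    ("RAID5", 1): Fraction(4, 3),  # 3 data + 1 parity
    ("RAID6", 2): Fraction(3, 2),  # 4 data + 2 parity
}


def raw_multiplier(pftt, sftt, method="RAID1"):
    """Raw capacity consumed per unit of usable capacity."""
    if pftt not in (0, 1):
        raise ValueError("PFTT must be 0 or 1 in a stretched cluster")
    try:
        per_site = SITE_OVERHEAD[(method, sftt)]
    except KeyError:
        raise ValueError(f"unsupported combination: SFTT={sftt}, {method}")
    return (pftt + 1) * per_site  # PFTT=1 keeps a copy in both sites


if __name__ == "__main__":
    vmdk_gb = 100
    for sftt, method in [(0, "RAID1"), (1, "RAID1"), (1, "RAID5"), (2, "RAID6")]:
        raw = float(vmdk_gb * raw_multiplier(1, sftt, method))
        print(f"PFTT=1 SFTT={sftt} {method}: {raw:.1f} GB raw for {vmdk_gb} GB")
```

Under these assumptions a PFTT=1, SFTT=0 object consumes twice its size in raw capacity, and PFTT=1, SFTT=1 with RAID-1 consumes four times its size.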

Enthusiast

Why do my posts keep getting deleted!?!?!

I HAVE RESPONDED TO DUNCAN EPPING 3 TIMES! AND EVERY TIME I DO, THE POST GETS DELETED!  WHY???

VMware Employee

@RSEngineer, if it helps any, we can see this one.

Enthusiast

Thanks, TheBobkin.

 

Must say...It's annoying. Please forgive my complaining, but this board is flaky. Has so many quirks. I typed a nice, detailed response to Duncan and reposted it 3 times. 

 

OK, so bottom line...

 

I will accept that even with an SFTT of 0, vSAN will try to rebuild WITHIN the site where the failure occurred. But I don't see why, if the SFTT is 0. Zero to me, and to every other engineer I know who works with vSAN and sells it as an SE, means that once node 2 in my example fails, vSAN should fail over to the mirrored VMDK and that's that. If vSAN is still going to try to rebuild at the site where the failure occurred even with an SFTT of 0, then why add extra nodes and waste money? Just do a 2+2+1 as opposed to, say, a 3+3+1. Just add some extra disk on the 2 nodes at each site and let the rebuild occur.

 

In fact, some people think that the site should have failed altogether once node 2 failed with an SFTT of 0 for that site -- and all VMs should fail over to the secondary site. Again, I accept that that is not the case, but it still doesn't make much sense. Think of a 3-node cluster with an FTT of 1 and RAID 1 in place. If 1 node fails, your SPBM policy will be in violation (FTT of 0) AND if another failure occurs, the whole cluster will be down because you will have lost quorum.

VMware Employee

The problem is that people don't understand what vSAN is.

vSAN is a distributed, object-based storage system. Availability is specified on a per-object basis. If you create a policy and that policy states:

1 copy of the data per fault domain (PFTT=1), then that is what you get! Even when a host fails. As long as there is a remaining host in that fault domain, vSAN will try to comply with the policy that you created.

With vSAN there's really no such thing as a failed site. The RAID tree within a location for an object may be inaccessible, but that doesn't render the site failed. Failures are handled on a per-object basis, individually. Even when all hosts within a site are down, what happens is that all components in that site are marked as inaccessible.
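To make the per-object view concrete, here is a toy model of a single PFTT=1, SFTT=0 object: one replica per data site plus a witness component. It assumes each component carries one vote and that the object stays accessible while more than half of the votes and at least one full replica are available. Real vSAN vote assignment is more involved, and the component names are hypothetical.

```python
# Toy model of per-object accessibility for a PFTT=1, SFTT=0 object.
# Assumption: one vote per component; names are hypothetical.

VOTES = {"replica_site_a": 1, "replica_site_b": 1, "witness": 1}
REPLICAS = {"replica_site_a", "replica_site_b"}


def accessible(available):
    """True if the object has a full data copy and a majority of votes."""
    available = set(available)
    have = sum(v for name, v in VOTES.items() if name in available)
    has_data = bool(REPLICAS & available)
    return has_data and 2 * have > sum(VOTES.values())


if __name__ == "__main__":
    scenarios = {
        "all healthy": {"replica_site_a", "replica_site_b", "witness"},
        "site A replica lost": {"replica_site_b", "witness"},
        "site A replica and witness lost": {"replica_site_b"},
    }
    for label, components in scenarios.items():
        print(f"{label}: accessible={accessible(components)}")
```

In this model, losing the site A replica leaves the object accessible through the site B replica and the witness, while the site itself is never "failed" as a whole; each object is evaluated on its own components.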
