Solved: Local failures to tolerate in Stretched Cluster

_steez · ‎04-12-2022

Hello,

I have a question about the Failures to tolerate setting when creating a storage policy for a stretched cluster:

Do I need to tolerate any failures in the Primary site? When testing failure scenarios, a host (containing VM data) was forcibly turned off, and the VMs still restarted on another host in the same site.

What would be the optimal configuration in this scenario (4+4 hosts)? Selecting No data redundancy saves a massive amount of storage space. The only downside I can think of is reduced network performance (as VM data now has to travel between sites).

In what cases would you want to keep Failures to tolerate at RAID1/5 in stretched cluster deployments?

depping · ‎04-19-2022

It depends on your expectations. There's a big benefit to having local redundancy:

If a host fails and your VM is impacted, read IO will still be served from the local fault domain
Rebuilds can happen from the local fault domain
You can withstand two failures across the fault domains, instead of one

But if you feel that you don't need that, or some VMs are less latency sensitive and a prolonged resync time doesn't matter, just go with no redundancy within the fault domain.
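
To put a rough number on the storage side of that trade-off, here is a minimal sketch of the raw capacity each option consumes. It assumes the usual vSAN overheads: site mirroring keeps one full copy per site, local RAID-1 with one failure to tolerate doubles the data again inside each site, and local RAID-5 uses a 3+1 layout (one third extra). The function and dictionary names are made up for illustration, and witness components, swap objects and slack space are ignored.

```python
from fractions import Fraction

# Assumed per-site overhead for each local protection choice (illustrative).
LOCAL_OVERHEAD = {
    "none": Fraction(1),      # no redundancy inside the site
    "raid1": Fraction(2),     # local mirror, 1 failure to tolerate
    "raid5": Fraction(4, 3),  # local erasure coding, assumed 3 data + 1 parity
}

# Site mirroring in a stretched cluster keeps one full copy in each data site.
SITE_COPIES = 2


def raw_capacity_gb(usable_gb, local_protection):
    """Return the approximate raw GB consumed cluster-wide for usable_gb of VM data."""
    return float(usable_gb * SITE_COPIES * LOCAL_OVERHEAD[local_protection])


if __name__ == "__main__":
    for name in LOCAL_OVERHEAD:
        print(f"{name}: {raw_capacity_gb(100, name):.0f} GB raw per 100 GB of VM data")
```

Under these assumptions, local RAID-1 costs twice the raw capacity of no local redundancy, while local RAID-5 sits in between, which may be a reasonable middle ground with 4 hosts per site.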
