_steez
Contributor

Local failures to tolerate in Stretched Cluster


Hello,

I have a question about the Failures to tolerate setting when creating a storage policy for a stretched cluster:

[Screenshot: _steez_0-1649777618893.png]

Do I need to tolerate any failures in the primary site? When testing failure scenarios, a host (containing VM data) was forcibly powered off, and the VMs still restarted on another host in the same site.

What would be the optimal configuration in this scenario (4+4 hosts)? Selecting No data redundancy saves a massive amount of storage space. The only downside I can think of is reduced network performance, as VM data then has to travel between sites.

In which cases would you want to keep Failures to tolerate at RAID-1/5 in stretched cluster deployments?

1 Solution

Accepted Solutions
depping
Leadership

It depends on your expectations. There's a big benefit to having local redundancy:

  • If a host fails and your VM is impacted, read IO will still be served from the local fault domain
  • Rebuilds can happen from within the local fault domain
  • You can withstand two failures across the fault domains instead of one

But if you feel that you don't need that, or some VMs are less latency-sensitive and a prolonged resync time doesn't matter, just go with no redundancy within the fault domain.
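
To put the storage trade-off in numbers, here is a rough sketch of the raw capacity a VM would consume under each local protection choice when the data is also mirrored across both sites. The multipliers are assumptions based on the commonly cited overheads (2x for RAID-1 with one failure tolerated, about 1.33x for a 3+1 RAID-5 layout), and the function and dictionary names are purely illustrative, not part of any vSAN tooling. It ignores witness components, slack space, and deduplication.

```python
from fractions import Fraction

# Assumed raw-capacity multipliers for protection inside one site.
# These are illustrative figures, not values read from a real cluster.
LOCAL_MULTIPLIER = {
    "none": Fraction(1),       # no data redundancy within the site
    "raid1": Fraction(2),      # local mirror, one failure tolerated
    "raid5": Fraction(4, 3),   # assumed 3 data + 1 parity layout
}

# Site mirroring keeps one full copy of the data in each of the two sites.
SITE_COPIES = 2


def raw_capacity_gb(vm_size_gb, local_protection):
    """Estimate raw capacity (GB) consumed across both sites for one VM."""
    return float(vm_size_gb * SITE_COPIES * LOCAL_MULTIPLIER[local_protection])


if __name__ == "__main__":
    for protection in ("none", "raid1", "raid5"):
        used = raw_capacity_gb(100, protection)
        print(f"100 GB VM, local protection {protection}: {used:.1f} GB raw")
```

Under these assumptions, local RAID-1 doubles the footprint compared with no local redundancy, while local RAID-5 adds about a third, so RAID-5 can be a middle ground if each site has enough hosts for it.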
