VMware Cloud Community
nrlparasher
Enthusiast

vSAN "None - Stretched Cluster" Policy

Hi,

I created a vSAN policy in the H5 client with "Site disaster tolerance: None - Stretched Cluster" and "Failures to tolerate: 1 failure - RAID-1 (Mirroring)". As I understand it, all components of every object should then be created in a single site (FD). When I tested the policy, that was mostly true, but sometimes components were created in multiple sites (screenshots attached). For example, the VM Home object's components were created in different sites, with a witness on each site.

Any help to understand the policy will be a great help. Thanks

Attachments: VSAN-TS3.png, vSAN-TS2.png, vSAN-TS1.png

9 Replies
depping
Leadership

You need to specify the locality as well. If you don't specify the locality, components may be placed across fault domains. I discussed this in the following blog post: http://www.yellow-bricks.com/2019/05/28/site-locality-in-a-vsan-stretched-cluster/
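To compare what the placement view shows against this explanation, a small script can group an object's components by site. This is only a sketch: the host names, the `HOST_SITE` mapping, and the sample layout are made up, and the component list is assumed to be copied by hand from the UI rather than fetched through any vSAN API.

```python
from collections import defaultdict

# Hypothetical host-to-site mapping; substitute your own fault domain layout.
HOST_SITE = {
    "esx-a1": "Preferred",
    "esx-a2": "Preferred",
    "esx-b1": "Secondary",
    "esx-b2": "Secondary",
}

def components_by_site(components, host_site=HOST_SITE):
    """Group (component_type, host) pairs by the site that owns the host."""
    grouped = defaultdict(list)
    for comp_type, host in components:
        grouped[host_site[host]].append(comp_type)
    return dict(grouped)

def spans_sites(components, host_site=HOST_SITE):
    """Return True if the object's components live in more than one site."""
    return len(components_by_site(components, host_site)) > 1

# Made-up layout, typed in by hand for illustration.
vm_home = [("Component", "esx-a1"), ("Component", "esx-b1"), ("Witness", "esx-a2")]
print(components_by_site(vm_home))
print(spans_sites(vm_home))
```

An object for which `spans_sites` returns `True` is one whose components were not kept within a single site.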

Vaskomozz
Contributor

Hello depping, thanks for the post and the clarification. I know this thread is from 2019, but I found https://kb.vmware.com/s/article/88358 from 2022, which covers more or less the same thing as your post, so this is probably still valid.

Currently, I am working on saving some space in the vSAN cluster, and I wanted to use the "None - Standard cluster" policy for non-critical (pre-prod) VMs.

I want to ask: why is using this policy with RAID-1 in a stretched cluster so critical?

RAID-1 needs a minimum of 3 hosts, or 3 FDs. Regardless of the number of hosts, a stretched cluster always has 3 FDs, so the placement should be: the first component in the preferred FD, the second component in the secondary FD, and the witness on the witness host (the third FD). I don't believe a data component can be placed on the witness host, because the witness host has very little capacity. I have already applied this storage policy to several VMs and the result is as described above. Maybe for some smaller VMs the witness and the data components could be swapped, but I think that would be very rare.

Best Regards.

pkvmw
VMware Employee

Hi @Vaskomozz,

As Duncan and the KB state, the "None - standard cluster" policy should not be used for stretched clusters, even for non-critical VMs. As the KB explains, it leads to vSAN making the wrong placement decisions, which can end up impacting your entire environment, including your critical VMs.

If you don't need FTT=1 across your sites, you can use "None - stretched cluster" or one of the "None - keep data on [...]" storage policies to pin the data on one specific site.

Regarding the witness: the witness host holds only metadata components, not actual VM data.

- pk

Vaskomozz
Contributor

Hi @pkvmw,

Thanks for your reply!

I'm just thinking logically, and it is not clear to me how vSAN can make the wrong placement decision when the number of FDs is 3. Maybe the concern about read locality is somehow valid, but I don't think wrong placement is possible with 3 FDs.

Moreover, if a policy like this shouldn't be used for a stretched cluster, why is the datastore shown as compatible with this type of policy?

depping
Leadership

The policy is compatible because the compatibility check only assesses whether vSAN can apply the policy, and the answer is yes. That does not mean it is a smart policy to use.

depping
Leadership

With just 3 hosts it is impossible to make a wrong decision. However, if you expand your cluster to 4, 5 or 6 hosts, VMs could be placed in a manner which makes no sense, which could easily lead to data loss in certain failure scenarios. Please just use the policy which makes sense to use: the policy designed for this scenario.
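The difference between 3 hosts and a larger cluster can be illustrated with a toy enumeration. The sketch below is not vSAN's placement algorithm. It assumes, purely for illustration, that every host counts as its own fault domain, that the two RAID-1 replicas land on data hosts only, and that the witness component can go on any remaining host. Host and site names are hypothetical.

```python
from itertools import combinations

def layouts(data_hosts, witness_host):
    """Enumerate toy RAID-1 layouts as (replica_host, replica_host, witness_component_host).

    data_hosts maps host name -> site name. Simplifying assumptions: each host
    is its own fault domain, replicas go only on data hosts, and the witness
    component may go on any host not already holding a replica.
    """
    all_hosts = list(data_hosts) + [witness_host]
    for r1, r2 in combinations(data_hosts, 2):
        for w in all_hosts:
            if w not in (r1, r2):
                yield (r1, r2, w)

def summarize(data_hosts, witness_host):
    """Return (total layouts, layouts with both replicas in the same site)."""
    found = list(layouts(data_hosts, witness_host))
    same_site = [l for l in found if data_hosts[l[0]] == data_hosts[l[1]]]
    return len(found), len(same_site)

small = {"a1": "A", "b1": "B"}                          # 1+1+1
larger = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}   # 2+2+1

print(summarize(small, "w"))   # (1, 0)
print(summarize(larger, "w"))  # (18, 6)
```

Under these assumptions a 1+1+1 cluster has exactly one possible layout, while a 2+2+1 cluster has 18, of which 6 keep both replicas in the same site. In those 6, a single site outage would take out both copies at once.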

Vaskomozz
Contributor

Hello @depping

Thanks for your reply. I really respect you and your very useful articles on Yellow Bricks.

As you mentioned, with 3 hosts it is impossible to make a wrong decision, since 3 hosts equal 3 FDs. If I'm not mistaken, when the cluster is configured as a stretched cluster the number of FDs is again 3, the same as with 3 hosts, so a wrong decision should be almost impossible.

In the meantime I found another article of yours on the same topic, https://www.yellow-bricks.com/2020/12/14/vms-which-are-not-stretched-in-a-stretched-cluster-which-po... , and according to the second part of that article, on using the "None - standard cluster" policy, if I understand correctly, the main concern is that those VMs will have no local resiliency, but they will have site resiliency.

A potential issue is that, in case of a failure at Site A, a VM in Site A will read data from objects stored in Site B, which can impact performance because the traffic goes over the ISL. If that is acceptable for those VMs, or if the two sites are close and directly connected (i.e. on the same campus), then even this shouldn't be an issue.

depping
Leadership

That is correct. But as mentioned, if you go higher than 1+1+1 there could be a situation where part of the VM resides in Site A while the other part resides in Site B. 

This policy was never created to be used in a stretched environment; that is why we have stretched cluster policies. Why not simply create and use the policy which should be used, the stretched policy? By using that you avoid any potential issues in the future.

Vaskomozz
Contributor

I agree: if it is higher than 1+1+1 it can be an issue, but with the current design I believe it is safe.

I also agree that the best option is to use the stretched cluster policies, but we are currently short on space and want to save some by using this policy. It achieves that, and at the same time we keep site resiliency, which is more important to us than the performance issue we might have.

Thanks again for the effort and the clarification on the topic.
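The space trade-off behind this choice is simple arithmetic. The sketch below assumes a hypothetical 100 GB VM and counts only full data copies, ignoring witness components, slack space, and any deduplication or compression. The thread does not say which policy the VMs used before, so the 4-copy case is only a reference point.

```python
# Number of full data copies kept under each protection scheme (RAID-1 only).
COPIES = {
    "RAID-1, FTT=1 (one mirror pair in total)": 2,
    "site mirroring plus RAID-1 FTT=1 inside each site": 4,
}

def raw_capacity_gb(vm_gb, copies):
    """Raw capacity consumed by vm_gb of VM data stored as `copies` full copies."""
    return vm_gb * copies

VM_GB = 100  # hypothetical VM size
for scheme, copies in COPIES.items():
    print(f"{scheme}: {raw_capacity_gb(VM_GB, copies)} GB raw")
```

Any policy that keeps a single mirror pair, whether both copies sit in one site or one copy sits in each site, consumes the same raw capacity in this model; the policies differ in which failures the two copies survive.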
