vSAN large VMs: are they tied to one host or spread across multiple hosts?

Hello, we are looking into replacing our vSphere environment as our hardware is reaching end of life. We had multiple vendors come in, and one of the options is some kind of hyper-converged system such as vSAN.

One concern is that we have about 4 large VMs with VMDKs between 10TB and 16TB.

How does vSAN place data on the hosts with storage? Does vSAN keep the entire 16TB VMDK on one host or disk group, or does it spread the VMDK's data across the hosts with storage?

I'm hoping it spreads it out. My concern is that if we only have, let's say, 17TB of disk in a host or disk group and vSAN can't spread out the VMDK data, then that one large VM would reside on a single host, leaving no storage for other VMs. That host might then only be able to run that one VM, which would be inefficient.

My guess is that this is not the case and that vSAN distributes the data across multiple disk groups on different hosts.

Hopefully that made sense. 


1 Reply
VMware Employee

Hello Mike,

vSAN can of course split a data replica (LSOM-Object) across the storage on multiple nodes, but its ability to do so depends on the number of nodes in the cluster, their available storage, and the Failure Tolerance Method (FTM) applied to the Objects (e.g. RAID1 or RAID5/6).

So, say you had a 5-node cluster with 10TB of capacity-tier storage per node and you placed a 15TB vmdk on it with an FTT=1, FTM=RAID1 Storage Policy (SP). It would be placed like so:

15TB replica (7.5TB on node1, 7.5TB on node2) + 15TB replica (7.5TB on node3, 7.5TB on node4) + 16MB witness component (on node5). As you can see, this wouldn't be possible in a 3-node cluster, as it would violate the SP rules to place data components (or a portion thereof) in the same Fault Domain (a node here) as the witness component or the other replica.
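The node count in this example can be checked with a rough sketch. The function name is hypothetical, and the model is deliberately simplified: it assumes empty nodes of equal capacity, FTT=1 with RAID1 only, one witness node, and it ignores slack space and the fact that vSAN splits large replicas into many smaller components.

```python
import math

def raid1_ftt1_min_nodes(vmdk_tb, node_capacity_tb):
    """Hypothetical helper: nodes needed for FTT=1, FTM=RAID1 in a simplified model."""
    # Each replica is striped over as many nodes as it needs to fit.
    nodes_per_replica = math.ceil(vmdk_tb / node_capacity_tb)
    # Two replicas on disjoint sets of nodes, plus one more node for the witness.
    return 2 * nodes_per_replica + 1

nodes_for_15tb = raid1_ftt1_min_nodes(15, 10)  # the 15TB vmdk on 10TB nodes
nodes_for_5tb = raid1_ftt1_min_nodes(5, 10)    # a vmdk that fits on a single node
print(nodes_for_15tb, nodes_for_5tb)
```

Under these assumptions the 15TB vmdk needs all 5 nodes, while a vmdk that fits on one node needs only the usual 3.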

In the same cluster with an FTT=1, FTM=RAID5 SP, this 15TB vmdk Object would consume ~19.95TB in total, split evenly over 4 nodes. Placement is therefore less complex, as replicas wouldn't need to be striped across nodes. Do bear in mind, though, that RAID5 is only available on All-Flash clusters and has lower performance than RAID1.
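The RAID5 figure is simple arithmetic, sketched below. This assumes a 3 data + 1 parity layout (consistent with the 4 nodes mentioned here); the function name is hypothetical. The exact 4/3 ratio gives 20TB, while rounding the overhead to 1.33x gives the ~19.95TB quoted.

```python
def raid5_ftt1_usage(vmdk_tb):
    """Hypothetical helper: raw usage for FTT=1, FTM=RAID5, assuming 3 data + 1 parity."""
    # One parity segment for every three data segments: 4/3 of the vmdk size.
    total_tb = vmdk_tb * 4 / 3
    # The four segments land on four different nodes in equal shares.
    per_node_tb = total_tb / 4
    return total_tb, per_node_tb

total_tb, per_node_tb = raid5_ftt1_usage(15)
print(f"total: {total_tb:.2f}TB, per node: {per_node_tb:.2f}TB")
```

Compare this with RAID1 at FTT=1, which stores two full copies (30TB raw for the same vmdk).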

One should also bear in mind whether it will be possible to rebuild Objects back to FTT=1 following a permanent node failure: it is not just a matter of how much free space you have, but where that free space is (and what already resides there).
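To illustrate with the 5-node RAID1 example: suppose node1 is lost permanently, taking a 7.5TB component with it. The sketch below is a simplified model with hypothetical names, not how vSAN actually decides placement. It assumes the lost component can only be rebuilt on nodes that do not hold the other replica or the witness, and that nothing else is moved around to make room.

```python
def can_rebuild(lost_tb, free_tb_by_node, excluded_nodes):
    """Hypothetical check: is there enough free space on nodes allowed to take the rebuild?"""
    eligible_free_tb = sum(
        free for node, free in free_tb_by_node.items() if node not in excluded_nodes
    )
    return eligible_free_tb >= lost_tb

# Free space after node1 fails (the 16MB witness on node5 is ignored as negligible).
free_after_failure = {"node2": 2.5, "node3": 2.5, "node4": 2.5, "node5": 10.0}
# node3/node4 hold the other replica, node5 holds the witness.
excluded = {"node3", "node4", "node5"}

total_free_tb = sum(free_after_failure.values())
rebuild_ok = can_rebuild(7.5, free_after_failure, excluded)
print(total_free_tb, rebuild_ok)
```

The cluster has 17.5TB free in total, well above the 7.5TB that was lost, yet in this model only node2's 2.5TB is usable for the rebuild, so the Object stays non-compliant.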

In general, I wouldn't advise having Objects that are vastly bigger than each node's storage capacity unless you have a bigger cluster (e.g. 8+ nodes).

