4 Replies Latest reply on Jul 13, 2018 12:20 PM by TheBobkin

    VM Fault Tolerance + vSAN

    Youpy Lurker



      I'm trying to design a solution that is redundant for both storage and VM resources.


      I want to create 4 VMs with 4 vCPUs each:

      - Each VM uses 100 GB of storage

      - 2 VMs will use Fault Tolerance

      - 2 VMs will use High Availability


      I want to use vSAN with RAID-1 and FTT=1.


      My conclusion is that the following setup is possible:

      - 3 hosts with 32 GB RAM each

      - Each host has 2 disks:

           - 1 SSD of 100 GB for the cache

           - 1 HDD of 1 TB for the capacity



      - 1 Gbit/s network for VMware OAM

      - 10 Gbit/s network for Fault Tolerance

      - 10 Gbit/s network for vSAN


      Could you please tell me if my understanding is correct?





        • 1. Re: VM Fault Tolerance + vSAN
          IRIX201110141 Expert

          Some thoughts...

          - For a 4 vCPU FT VM you need Enterprise Plus licensing; the Standard license only allows 2 vCPUs

          - The old legacy FT (introduced with vSphere 4.0) did not need a copy of the data (vDisks). With the current implementation you have to specify a second location, and FT creates a copy of the complete VM when you enable it. There is only one vSAN datastore per cluster, so I am not sure about the FT behaviour. If it works, you end up with 4 copies of your data, so keep this in mind

          - HA needs to be activated for FT VMs

          - During a planned maintenance period you will not have data redundancy with only 3 hosts, because there is no host left to evacuate the data to

          - Think about going AFA instead of Hybrid when you have such "small" capacity requirements. AFA is included in the vSAN Standard license



          Sorry for not giving a clear answer. I will test later whether creating an FT VM is possible; we have a couple of vSANs up and running.




          • 2. Re: VM Fault Tolerance + vSAN
            Youpy Lurker

            Thanks Joerg for the advice.


            Even if not optimal, I understand that this setup should work.

            I'll increase the disk size; since the data is copied 4 times, I would otherwise lack capacity.
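            The four-fold copy can be sketched with a small back-of-the-envelope calculation (illustrative only; the FT vDisk doubling and the FTT=1 mirroring are taken from the thread above, not from a VMware sizing tool):

```python
# Rough raw-capacity estimate for the proposed VMs on vSAN.
# Assumptions: SMP-FT keeps a full second copy of the vDisks for the
# Secondary VM, and RAID-1 with FTT=1 then stores two replicas of every
# object. Numbers are illustrative.

VDISK_GB = 100
FTT1_REPLICAS = 2  # RAID-1 / FTT=1 keeps two copies of each object

def raw_gb(vdisk_gb, ft_protected):
    vm_copies = 2 if ft_protected else 1  # FT Secondary has its own vDisks
    return vdisk_gb * vm_copies * FTT1_REPLICAS

total = 2 * raw_gb(VDISK_GB, True) + 2 * raw_gb(VDISK_GB, False)
print(total)  # 1200 GB raw consumed, against 3 x 1 TB of capacity devices
```

            This also ignores the free "slack space" vSAN recommends keeping on the datastore, which makes the margin even tighter.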


            I'll also have a look at AFA, as I'm not familiar with it (I'm very new to this).




            • 3. Re: VM Fault Tolerance + vSAN
              IRIX201110141 Expert

              OK, I can answer my own question. It is possible to place the vDisk copy (which is how the current FT implementation works) on the same vSAN datastore as the vDisks of the Primary VM.



              I successfully created an FT-protected VM on a vSAN cluster (vSphere 6.5 U2, latest) and stored both the Primary and Secondary VM on the one and only vSAN datastore of that cluster.


              AFA = All-Flash Array (SSD only)

              If you only plan one capacity disk per Disk-Group and only need a few TB in total, don't mess around with old HDDs; use SSDs for both the cache and the capacity device.




              • 4. Re: VM Fault Tolerance + vSAN
                TheBobkin Master
                vExpertVMware Employees

                Hello Tom,



                As you only require a very small number of VMs, might it be more economical to do a 2-Node Direct-Connect Set-up?

                Basically, 2 nodes with no need for a 10G switch, plus a small Witness Appliance (free license) running as a VM somewhere else (e.g. cloud/local/non-local cluster etc.).


                SMP-FT is supported on 2-node clusters.


                There are caveats for stretched clusters. Both the Primary and Secondary VM can reside on the same vsanDatastore, but do note that mixing datastore types is actually unsupported:

                "A mix of vSAN and other types of datastores is not supported for both Primary VMs and Secondary VM"



                "- 3 hosts with 32GB RAM"

                This is potentially going to be a bottleneck, as ESXi and vSAN require some memory overhead, with a base memory consumption per Disk-Group of a bit over 5 GB.


                How much RAM do these VMs require? (Include the extra RAM for the FT Secondary VMs.)
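                A rough per-host RAM budget can be sketched as follows (the ~5 GB per-Disk-Group figure comes from this reply; the ESXi base overhead number is an assumption for illustration only):

```python
# Hypothetical per-host RAM budget for 3 x 32 GB hosts running vSAN.
HOST_RAM_GB = 32
ESXI_BASE_GB = 4.0        # assumed hypervisor base overhead (illustrative)
VSAN_DISKGROUP_GB = 5.1   # "a bit over 5GB" per Disk-Group, one per host

vm_budget = HOST_RAM_GB - ESXI_BASE_GB - VSAN_DISKGROUP_GB
print(vm_budget)  # roughly 22.9 GB left per host for VMs; note that an FT
                  # Secondary VM consumes the same RAM as its Primary
```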


                Single-capacity-tier Disk-Groups with a single Disk-Group per host are fairly limited: if the one capacity-tier device fails, that host stops contributing storage and is not available for rebuilding components. Such a configuration will also be fairly limited from a performance perspective.