
This blog entry continues to get a lot of hits, so I thought I would keep it updated and reformat it a bit. VMware's Fault Tolerance is a great feature that has generated a lot of interest, and it is a new vSphere feature that will only continue to improve. That said, the list below is the current state of requirements and limitations for enabling FT virtual machines in vSphere. The majority of this information came from the vSphere Pre-requisites Checklist, the VMware Fault Tolerance Datasheet and the Availability Guide; other items were picked up in the forums or in the VMware knowledge base. KB article 1010601, "Understanding VMware Fault Tolerance," is a great resource to start with if you are new to this feature. KB 1022844 covers the changes to Fault Tolerance in vSphere 4.1.


Last updated: October 02, 2012




VMware FT is available in the following vSphere editions: Enterprise and Enterprise Plus
Note: vSphere Advanced Edition is no longer available in vSphere 5.


A host must be certified by the OEM as FT-capable. Refer to the current Hardware Compatibility List (HCL) for a list of FT-supported servers.


Ensure that HV (Hardware Virtualization) is enabled in the BIOS.


Ensure that FT protected virtual machines are on shared storage (FC, iSCSI or NFS).


Ensure that the primary and secondary ESX hosts and virtual machines are in an HA-enabled cluster.


Ensure that there is no requirement to use DRS for VMware FT protected virtual machines; in this release VMware FT cannot be used with VMware DRS (although manual VMotion is allowed). - In vSphere 4.1, FT is integrated with DRS, which means that DRS can now load balance both the primary and secondary Fault Tolerant virtual machines.


Ensure that the primary and secondary ESX/ESXi hosts are running the same build of VMware ESX/ESXi. Note: kb 1013637, published September 25, 2009, states that "When creating a cluster that will have fault tolerant virtual machines, the cluster should consist of all ESX hosts or all ESXi hosts and not a mix of ESX and ESXi hosts." - In vSphere 4.1, FT has a version associated with it, which means that the primary and secondary hosts no longer need to run at the same build number/patch level.


When you upgrade hosts that contain fault tolerant virtual machines, ensure that the Primary and Secondary VMs continue to run on hosts with the same ESX/ESXi version and patch level. - In vSphere 4.1, FT has a version associated with it, which means that the primary and secondary hosts no longer need to run at the same build number/patch level.


Ensure that no more than four VMware FT enabled virtual machine primaries or secondaries run on any single ESX/ESXi host (from the configuration maximums document).


Ensure that gigabit or faster NICs are used. (10 Gbit NICs can be used, and jumbo frames enabled, for better performance.) Each host must have a VMotion NIC and a Fault Tolerance logging NIC configured, and the VMotion and FT logging NICs must be on different subnets.


Ensure that host certificate checking is enabled (enabled by default) before you add the ESX/ESXi host to vCenter Server.


Ensure that IPv4 is used for the Fault Tolerance logging network.  (HA does support IPv6 for management networks.)


Ensure that there is no user requirement to use NPT/EPT (Nested Page Tables/Extended Page Tables) since VMware FT disables NPT/EPT on the ESX host.


VMware Fault Tolerance requires a dedicated Gigabit Ethernet network between the physical servers; 10 Gigabit Ethernet should be considered if VMware FT is enabled for many virtual machines on the same host.


There are no limits on how many virtual machines in a VMware DRS or VMware HA cluster can be enabled for VMware FT, but every machine with VMware FT enabled takes up twice as much capacity; this should be built into the configuration.


Overhead depends on the workload and can range from as little as 5% to as much as 20%.


If firewalls or other controls exist between ESX hosts, ports 8100 and 8200 (outgoing TCP; incoming and outgoing UDP) must be open.
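If the hosts sit behind a host-based firewall, the rules might look something like the following. This is an illustrative sketch only, assuming an iptables-style firewall with default INPUT/OUTPUT chains; adapt it to whatever firewall you actually run:

```shell
# Illustrative only: allow FT traffic on ports 8100 and 8200
# (outgoing TCP, plus incoming and outgoing UDP, per the requirement above).
iptables -A OUTPUT -p tcp -m multiport --dports 8100,8200 -j ACCEPT
iptables -A INPUT  -p udp -m multiport --dports 8100,8200 -j ACCEPT
iptables -A OUTPUT -p udp -m multiport --dports 8100,8200 -j ACCEPT
```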


Ensure that a resource pool containing fault tolerant virtual machines has excess memory above the memory size of the virtual machines. Fault tolerant virtual machines use their full memory reservation. Without this excess in the resource pool, there might not be any memory available to use as overhead memory.


To ensure redundancy and maximum Fault Tolerance protection, VMware recommends that you have a minimum of three hosts in the cluster. In a failover situation, this provides a host that can accommodate the new Secondary VM that is created.


Too much activity on a VMFS volume can lead to virtual machine failovers. Reduce the number of file system operations, or ensure that the fault tolerant virtual machine is on a VMFS volume that does not have an abundance of other virtual machines that are regularly being powered on, powered off, or migrated using VMotion.


When Fault Tolerance is turned on, vCenter Server unsets the virtual machine's memory limit and sets the memory reservation to the memory size of the virtual machine. While Fault Tolerance remains turned on, you cannot change the memory reservation, size, limit, or shares.


Disabling the virtual machine restart priority setting for a fault tolerant virtual machine causes the Turn Off Fault Tolerance operation to fail. In addition, fault tolerant virtual machines with the virtual machine restart priority setting disabled cannot be deleted.


FT requires that the hosts for the Primary and Secondary VMs use the same CPU model, family, and stepping.


Hosts running the Primary and Secondary VMs should operate at approximately the same processor frequencies, otherwise the Secondary VM might be restarted more frequently. Platform power management features which do not adjust based on workload (for example, power capping and enforced low frequency modes to save power) can cause processor frequencies to vary greatly.


You cannot back up an FT-enabled virtual machine using VCB, vStorage API for Data Protection, VMware Data Recovery or similar backup products that require the use of a virtual machine snapshot, as performed by ESX/ESXi. To back up a fault tolerant virtual machine in this manner, you must first disable FT, then re-enable FT after performing the backup. Storage array-based snapshots do not affect FT.


Apply the same instruction set extension configuration (enabled or disabled) to all hosts.


Ensure that the processors are supported. (Download VMware SiteSurvey.) For VMware FT to be supported, the servers that host the virtual machines must each use a supported processor from the same category as documented below.

Intel Xeon based on 45nm Core 2 Microarchitecture Category:

31xx Series

33xx Series

52xx Series (DP)

54xx Series

74xx Series

Intel Xeon based on Core i7 Microarchitecture Category:

Nehalem Series Group (any processor series here can be used):

34xx Series (Lynnfield)

35xx Series

55xx Series

65xx Series

75xx Series

Westmere Series Groups (each processor series must be used separately):

34xx Series (Clarkdale)

i3/i5 (Clarkdale)

36xx Series

56xx Series

AMD 3rd Generation Opteron Category:

13xx and 14xx Series

23xx and 24xx Series (DP)

41xx Series

61xx Series

83xx and 84xx Series (MP)

View full details about processor and other requirements in kb 1008027.







Virtual machines must be running on one of the supported guest operating systems. See VMware kb 1008027 for more information.


Mac OS X Server 10.6 is not supported.


The combination of the virtual machine's guest operating system and processor must be supported by Fault Tolerance (for example, 32-bit Solaris on AMD-based processors is not currently supported).


VMware FT requires virtual machines to have eager-zeroed thick disks. Thin or sparsely allocated disks will be converted to eager-zeroed thick when VMware FT is enabled, which requires additional storage space. The virtual machine must be powered off for this conversion.
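A thin disk can also be converted ahead of time with vmkfstools. This is a sketch only; the datastore path and disk names below are invented, so confirm the options against the vmkfstools documentation for your ESX version:

```shell
# Clone a thin disk to a new eager-zeroed thick disk (paths are examples).
# Run with the VM powered off, then repoint the VM at the new disk.
vmkfstools -i /vmfs/volumes/datastore1/myvm/myvm.vmdk \
           -d eagerzeroedthick \
           /vmfs/volumes/datastore1/myvm/myvm-ezt.vmdk
```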


Ensure that the datastore is not using a physical RDM (Raw Device Mapping). Virtual RDM is supported.


VMware recommends that you use a maximum of 16 virtual disks per fault tolerant virtual machine.


The virtual machine cannot have more than 64GB of RAM.


Ensure that there is no requirement to use Storage VMotion for VMware FT VMs, since Storage VMotion is not supported for VMware FT VMs.


Ensure that NPIV (N-Port ID Virtualization) is not used, since NPIV is not supported with VMware FT.


Ensure that the virtual machines are NOT using more than 1 vCPU. (SMP is not supported.)


Ensure that there is no user requirement to hot add or remove devices since hot plugging devices cannot be done with VMware FT.


Ensure that USB Passthrough is not used.


Ensure that there is no user requirement to use USB (USB must be disabled) and sound devices (must not be configured) since these are not supported for Record/Replay (and VMware FT.)


Ensure that there is no user requirement to have virtual machine snapshots since these are not supported for VMware FT. Delete snapshots from existing virtual machines before protecting with VMware FT. Note: Client agents may be required for backups.


Ensure that virtual machine hardware is upgraded to v7.


Ensure that the virtual machines do not use a paravirtualized guest OS. Note: On September 22, 2009, it was announced that support for guest OS paravirtualization using VMware VMI would be retired from new products in 2010-2011.


Fault Tolerance is not supported with Paravirtual SCSI adapters.


The vmxnet3 adapter is not supported with Fault Tolerance; see kb 1013757. - In vSphere 4.1, you can use vmxnet3 vNICs in FT-enabled virtual machines.


Some legacy network drivers are not supported. vmxnet2 is supported, but you might need to install VMware Tools to get the vmxnet2 driver instead of vlance in certain guest operating systems.


Ensure that MSCS clustering is removed from clustered virtual machines before protecting them with VMware FT.


VMs can't have any non-replayable devices (USB, sound, physical CD-ROM, physical floppy).


The virtual machine must not be a template or linked clone.


The virtual machine must not have VMware HA disabled.


VMDirectPath is not available for FT virtual machines.


VMCI stream socket connections are dropped when a virtual machine is put into Fault Tolerance (FT) mode. No new VMCI stream socket connections can be established while in FT mode.


The hot plug device feature is automatically disabled for fault tolerant virtual machines. To hot plug devices, you must momentarily turn off Fault Tolerance, perform the hot plug, and then turn on Fault Tolerance.


Extended Page Tables (EPT)/Rapid Virtualization Indexing (RVI) is automatically disabled for virtual machines with Fault Tolerance turned on.


Software virtualization with FT is unsupported.


FT virtual machines cannot be replicated with the vSphere Replication feature in SRM 5.


Dynamic Disk Mirroring use in the guest OS is not supported.




Virtual machines configured with Fault Tolerance might not be fully monitored by AppSpeed in the current GA version. In some cases AppSpeed generates empty monitoring data because of the passive virtual machine in the Fault Tolerance pair. - kb 1013896


Fault tolerant virtual machines that have a change tracking (CTK) resource listed in the virtual machine configuration will rapidly switch between ESX hosts when powered on. CTK must be disabled, or the CTK variables must be removed from the virtual machine configuration (.vmx) file. - kb 1013400


An Absolute Must Read: The Design and Evaluation of a Practical System for Fault-Tolerant Virtual Machines


If you know of any others, feel free to share.  As always, thanks for reading.

I have a customer that deployed NetApp NFS as the storage for their VI3 infrastructure. After the implementation, there was some general confusion about thin provisioning and how it works in the VMware VI3 environment. In researching these issues, here is what was found:


  1. Cloning thin provisioned disks will create thick disks.

  2. Move/Copy (SVMotion, cold migration w/move storage option) operations will convert thin disks to thick, including NFS volume to NFS volume operations.

  3. Disks created in vCenter and via VMware Converter are created "thin" by default.

  4. Running defrag utilities inside a Windows virtual machine will cause the associated thin disk(s) to grow to varying degrees.

  5. When a thin provisioned disk grows, a SCSI reservation takes place.

  6. If performance is the primary concern for a particular virtual machine, thin provisioned disks should not be used.


Bottom Line: Without additional work and/or operational procedures, cloning, Storage VMotion and even cold migrations will convert thin disks to thick disks. vSphere addresses these issues with native thin provisioning support, but in the meantime, check out Kent's blog for a great workaround for converting thick disks to thin.


With the operational limitations and realities of thin provisioned disks understood, there was also a need to determine true disk allocation and usage.


To discover what the totals are for all allocated VMDK files, run the following command from the /vmfs/volumes directory:

find . -name '.snapshot' -prune -o -name "*-flat.vmdk" -exec ls -lh {} \;

This command excludes the hidden NetApp .snapshot directory and returns only the "-flat.vmdk" files in the listing, showing each one with its provisioned size.

Adding the sizes up shows that 40 GB of space has been allocated.
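Rather than adding the sizes up by hand, the listing can be totaled in one pipeline. The sketch below is self-contained so it can be tried anywhere: it builds a throwaway directory with two sparse files standing in for thin provisioned disks (the names and sizes are invented), then totals the byte counts from "ls -l":

```shell
# Throwaway directory standing in for /vmfs/volumes; names and sizes are invented.
demo=$(mktemp -d)
truncate -s 10G "$demo/vm1-flat.vmdk"   # sparse file: provisioned but not written
truncate -s 30G "$demo/vm2-flat.vmdk"

# Same find as above, piped through awk to total column 5 (bytes) of "ls -l".
alloc_gb=$(find "$demo" -name '.snapshot' -prune -o -name "*-flat.vmdk" -print \
           | xargs ls -l | awk '{ total += $5 } END { printf "%.0f", total / 1024^3 }')
echo "$alloc_gb GB allocated"
```

With the sample sizes above this prints "40 GB allocated"; pointed at /vmfs/volumes instead of the demo directory, it totals the real flat vmdk files.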


The next step is to discover the total disk space actually used. To do this, run the following command from the /vmfs/volumes directory:

find . -name '.snapshot' -prune -o -name "*-flat.vmdk" -exec du -sh {} \;

This command excludes the hidden NetApp .snapshot directory and returns only the "-flat.vmdk" files in the listing, showing the actual on-disk usage of each.

Adding the sizes up shows that 7.2 GB is actually being used on disk. These numbers can be verified by viewing the free space value of the datastores in the VMware Infrastructure Client, NetApp FilerView or the NetApp System Manager application.


Dividing the combined value returned by the "du" command by the combined value returned by the "ls" command gives the percentage of allocated disk space actually in use. In the example above, this works out to 18%, or a savings of 82%. The customer was actually seeing a savings of 51% in their production environment, and that is just from thin provisioning. A-SIS (deduplication) will be implemented soon, and it will be interesting to see how the disk usage numbers change then.
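The whole calculation can be scripted end to end. As before, this is a self-contained sketch with invented names and sizes (a 100 MB thin disk with 25 MB actually written) standing in for the real numbers from /vmfs/volumes:

```shell
# Throwaway directory standing in for /vmfs/volumes; names and sizes are invented.
demo=$(mktemp -d)
truncate -s 100M "$demo/vm1-flat.vmdk"                        # 100 MB provisioned
dd if=/dev/zero of="$demo/vm1-flat.vmdk" \
   bs=1M count=25 conv=notrunc 2>/dev/null                    # 25 MB actually written
sync                                                          # flush so du sees the blocks

# Allocated KB, from the file length ("ls -l" column 5, converted to KB).
alloc_kb=$(find "$demo" -name '.snapshot' -prune -o -name "*-flat.vmdk" -print \
           | xargs ls -l | awk '{ t += $5 } END { print t / 1024 }')

# Used KB, from the actual blocks on disk, as in the "du" command above.
used_kb=$(find "$demo" -name '.snapshot' -prune -o -name "*-flat.vmdk" -exec du -sk {} \; \
          | awk '{ t += $1 } END { print t }')

# Used divided by allocated gives the percentage of provisioned space in use.
pct=$(awk -v a="$alloc_kb" -v u="$used_kb" 'BEGIN { printf "%.0f", u / a * 100 }')
echo "$pct% of allocated space in use"
```

With the sample disk above this works out to about 25% in use; the numbers from the text (7.2 GB used of 40 GB allocated) work out to 18% the same way.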


Thanks for reading!