VMware Cloud Community
tekhie999
Contributor

Question re FTT=0 for non-essential VMDKs

Hi all

I have a vSAN datastore on which I need to deploy some large SQL VMs. To minimise the space consumed by replicas of non-essential drives, such as the backup drive on the SQL server, I am considering creating a storage policy with FTT=0. I would apply this policy to the SQL backup drives (let's call it the F drive) and use FTT=1 for all other drives.

My question is about the impact on the availability of the F drive should a disk or host fail that holds some of the F drive's components.

If I have an FTT=0 policy applied to a VMDK (the F drive), and a piece of hardware holding some of the F drive's components fails, the VMDK will become unavailable.

If this situation occurred, would it still be possible to resync the components that were on the failed hardware so that they are rebuilt elsewhere in the cluster? The VMDK would not be continuously available as it would be with FTT=1, but after a brief period of unavailability while the components are rebuilt, it would come back, avoiding the need to create a new VMDK.

Is that assumption correct?

Thanks in advance

Chris

Accepted Solutions
GreatWhiteTec
VMware Employee

We don't recommend using FTT=0, for this and many other reasons. You won't be able to rebuild the object, since the disk/host holding it will be unavailable. You could rebuild the object if you had another copy available (FTT=1), but in this scenario the failed copy is the only copy, so there is nothing to rebuild from.

You are also assuming that the VM stays up and only loses the VMDK. If you lose the host where the VM is running (or reboot it, for that matter), HA will restart the VM, and since that one VMDK is unavailable, you will likely not be able to boot the VM unless you remove drive F.

We recommend using FTT=1 at a minimum. If you are worried about space for that object, you could use RAID-5 erasure coding for that VMDK if you have all-flash.

Hope this helps.
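
To put rough numbers on the space trade-off discussed above, here is a minimal sketch. It assumes the usual multipliers (one copy for FTT=0, two full replicas for RAID-1 with FTT=1, and a 3 data + 1 parity layout for RAID-5) and ignores witness and metadata overhead. The function name and the 900 GB example size are hypothetical, not part of any vSAN tooling.

```python
# Approximate raw-capacity multipliers per policy (assumed, see lead-in).
OVERHEAD = {
    ("RAID-1", 0): 1.0,    # single copy, no protection
    ("RAID-1", 1): 2.0,    # two full replicas; witness size ignored
    ("RAID-5", 1): 4 / 3,  # 3 data + 1 parity segments
}


def raw_capacity_gb(vmdk_gb: float, layout: str, ftt: int) -> float:
    """Return the approximate raw datastore space a VMDK consumes."""
    return vmdk_gb * OVERHEAD[(layout, ftt)]


if __name__ == "__main__":
    for layout, ftt in OVERHEAD:
        used = raw_capacity_gb(900, layout, ftt)
        print(f"{layout} FTT={ftt}: about {used:.0f} GB raw for a 900 GB VMDK")
```

Under these assumptions RAID-5 recovers a good part of the space that FTT=0 would have saved, while still tolerating one failure.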

tekhie999
Contributor

Hi

Thanks for the info, very useful!

Just out of interest: if a VM has FTT=1 and a host holding some of that VM's components fails, my VM will keep running, as 2 out of the 3 components are still available.

Should I be unlucky enough to have a physical disk fail in another host before the components on the failed host are rebuilt, and one of my remaining 2 components for the VM is on that failed disk, then either a) the VM will be unavailable (if the VM home namespace is affected), or b) a VMDK will become unavailable (if a VMDK is affected).

Would the VM or VMDK then recover and become available again once the components have been rebuilt, when either a) the failed host comes back online, or b) the failed disk is replaced?

Or is it the case that after a double failure a rebuild is required?

I'm just trying to understand the impact should I suffer a double failure.

Thanks

Chris

admin
Immortal

Hi

Host failure and disk failure are different scenarios.

Host failed

Components on the host are marked as absent, and the rebuild starts after 60 minutes. In other words, the vSAN cluster waits for 1 hour to see whether the host comes back online.

Disk failed

Once a disk is faulted, the components on it are marked as degraded, which means they are rebuilt from the good copy immediately. Another fault during this time does not usually happen; if it does, you are really unlucky.
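
The two cases above can be summarised in a small sketch. This only models the behaviour described in this reply; it is not a vSAN API. The names are hypothetical, and the 60-minute value is assumed to be left at the figure quoted above.

```python
from enum import Enum


class Failure(Enum):
    HOST = "host"
    DISK = "disk"


REPAIR_DELAY_MIN = 60  # wait quoted in the reply; assumed not to have been changed


def component_state(failure: Failure) -> str:
    """State assigned to components that sat on the failed hardware."""
    return "absent" if failure is Failure.HOST else "degraded"


def rebuild_start_delay_min(failure: Failure) -> int:
    """Minutes the cluster waits before it starts rebuilding."""
    return REPAIR_DELAY_MIN if failure is Failure.HOST else 0


if __name__ == "__main__":
    for f in Failure:
        print(f.value, component_state(f), rebuild_start_delay_min(f))
```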

For a double fault with FTT=1, there is no way to make the VM accessible yourself. If you are worried about this, use FTT=2 or 3.

If you face a double fault (when both data components are unavailable), you will need to ask GSS whether they can recover the data.
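
A simplified way to reason about the double-fault case is sketched below. It assumes an FTT=1 mirrored object with two data replicas and one witness, one vote each, and treats the object as accessible only while more than half of the votes and at least one full replica are reachable. The host names and the function are hypothetical and only illustrate the rule.

```python
def object_accessible(components, failed_locations):
    """Return True if the object still has quorum and a full data copy."""
    alive = [c for c in components if c["location"] not in failed_locations]
    has_quorum = len(alive) * 2 > len(components)  # strictly more than 50%
    has_data = any(c["kind"] == "replica" for c in alive)
    return has_quorum and has_data


# FTT=1 mirror: two replicas plus a witness, each on a different host.
vmdk = [
    {"kind": "replica", "location": "esx1"},
    {"kind": "replica", "location": "esx2"},
    {"kind": "witness", "location": "esx3"},
]

if __name__ == "__main__":
    print(object_accessible(vmdk, {"esx1"}))          # single failure
    print(object_accessible(vmdk, {"esx1", "esx2"}))  # both replicas lost
    print(object_accessible(vmdk, {"esx1", "esx3"}))  # replica and witness lost
```

With a single failure the object stays accessible; with any two of the three components gone it does not, which is the double-failure situation asked about in this thread.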
