PaulWestNet
Enthusiast

vSphere 6, no datastores shown when turning on FT

Hi guys,

I seem to be running into an issue while checking out vSphere 6 in my lab.

When I try to turn on FT for a VM via the web client, it opens a new window, then tells me there are no datastores available for the secondary VM, even though there is a fresh second host in the cluster just waiting to receive the secondary VM's data files.

I have HA turned on, but no shared storage (so HA complains about no heartbeat datastores), but FT is supposed to not need shared storage, so that shouldn't matter.

It feels like a bug in the web client. I can see both hosts' datastores in other places in the client, but not while trying to turn on FT.

Any ideas?

1 Solution

Accepted Solutions
MKguy
Virtuoso

PaulWestNet wrote:

When I try to turn on FT for a VM via the web client, it opens a new window, then tells me there are no datastores available for the secondary VM, even though there is a fresh second host in the cluster just waiting to receive the secondary VM's data files.

That's unfortunately not entirely true. FT in 6.0 still needs shared storage for the tiebreaker file, see:

http://www.wooditwork.com/2014/08/26/whats-new-vsphere-6-0-fault-tolerance/

You still need shared storage for the tiebreaker file but the .VMX and .VMDK files don’t even have to be on shared storage, they could be local disks.

-- http://alpacapowered.wordpress.com

vbrowncoat
VMware Employee

In addition to the tie breaker file, the FT config file and the .vmx for the primary also need to be on shared storage.

Keep in mind too that FT doesn't have a mechanism to move the secondary's files, so if they and/or the primary VM's VMDKs are placed on local storage, FT will be unable to restart a new secondary (and possibly primary) if the host with the local storage is unavailable (e.g. 3 host cluster (A, B, C) - Primary (incl. VMDK) on Host A, Secondary (incl. VMDK) on Host B - if host A fails, the secondary on host B will become primary, but a new secondary cannot be spun up until Host A comes back online).
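
To make the A/B/C scenario concrete, here is a minimal Python sketch of the restriction. It is a toy model, not anything FT actually runs: the `Datastore` class, the function name and the datastore names are all made up for illustration, and the only rule it encodes is the one described above (FT cannot move the files, so a new secondary needs a live host that can reach them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    name: str
    hosts: frozenset  # hosts that can mount this datastore


def hosts_for_new_secondary(alive_hosts, primary_host, files_ds):
    """Hosts that could start a new secondary in this toy model.

    A candidate must be alive, must not be the host already running the
    primary, and must reach the datastore holding the files the new
    secondary would use (FT cannot move those files).
    """
    return sorted(
        h for h in alive_hosts
        if h != primary_host and h in files_ds.hosts
    )


# Hypothetical datastores: one local to host A, one visible to all hosts.
local_a = Datastore("local-A", frozenset({"A"}))
shared = Datastore("shared-1", frozenset({"A", "B", "C"}))

# Host A has failed; the old secondary on B is now the primary.
alive = {"B", "C"}
print(hosts_for_new_secondary(alive, "B", local_a))  # [] (files stranded on A)
print(hosts_for_new_secondary(alive, "B", shared))   # ['C']
```

With the files on A's local disk there is no candidate until A returns; with the same files on a datastore all three hosts can see, host C qualifies.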

PaulWestNet
Enthusiast

MKguy wrote:

When I try to turn on FT for a VM via the web client, it opens a new window, then tells me there are no datastores available for the secondary VM, even though there is a fresh second host in the cluster just waiting to receive the secondary VM's data files.

That's unfortunately not entirely true. FT in 6.0 still needs shared storage for the tiebreaker file, see:

http://www.wooditwork.com/2014/08/26/whats-new-vsphere-6-0-fault-tolerance/

You still need shared storage for the tiebreaker file but the .VMX and .VMDK files don’t even have to be on shared storage, they could be local disks.

Hi, MKguy,

I had totally missed that part of the information, it seems. I did look at that page at one point, but I must have been skimming and missed it. Thank you for pointing it out - it explains everything.

It's unfortunate that shared storage is still needed - even if it's so minimal. I wonder what happens if the shared storage croaks while FT is active? If it kills FT, then there's little point to using separate datastores for the primary and secondary VMDKs and stuff, since you'll still need HA shared storage to ensure everything stays running. If you've got HA shared storage, you might as well put everything on it. I'll have to test and see what happens.

PaulWestNet
Enthusiast

gs_khalsa wrote:

In addition to the tie breaker file, the FT config file and the .vmx for the primary also need to be on shared storage.

Keep in mind too that FT doesn't have a mechanism to move the secondary's files, so if they and/or the primary VM's VMDKs are placed on local storage, FT will be unable to restart a new secondary (and possibly primary) if the host with the local storage is unavailable (e.g. 3 host cluster (A, B, C) - Primary (incl. VMDK) on Host A, Secondary (incl. VMDK) on Host B - if host A fails, the secondary on host B will become primary, but a new secondary cannot be spun up until Host A comes back online).

Hi, gs_khalsa.

I know the recommendation is to have 3 hosts to ensure there's always a primary and secondary available if a host fails. From reading, I got the impression that if the host with the primary went down, the secondary would become the new primary and a secondary would spin up on the 3rd host, and if the host with the secondary went down, a new secondary would spin up on the 3rd host - all using the same mechanism that was used when originally turning on FT - but I can't test that at the moment, since I only have 2 hosts to test with at this time.

It does make sense that it would work that way, though, given FT in vSphere 6 is almost a shared-nothing design now.

vbrowncoat
VMware Employee

Please make sure to read my earlier post in detail. The primary and secondary VMDKs can be stored on local datastores, but this will restrict where the FT VMs can run.

PaulWestNet
Enthusiast

Hi, gs_khalsa.

I understand. The question is what happens when a host goes down - whether vSphere is able to spin up a new secondary on local storage of a 3rd host. Logically, it would make sense for vSphere 6 to be able to do this, using the same mechanism it used to set up the secondary originally - it'd have to copy the primary's VMDKs to a new host and bring FT back online (or, if the primary is dead, then it'd have to make the secondary the new primary, then create a new secondary on a new host with local storage again). How automatic this is is unknown to me at this point, since I haven't tested it, nor have I seen it mentioned anywhere (though I skimmed a lot of information, so I could have missed it).

Have you tested this scenario with the new vSphere? I'd be very interested in hearing what happened. If I can get my hands on a third host, I'll test it myself - but first I need to get my 2-host FT going *grin*.

joergriether
Hot Shot

gs_khalsa wrote:

Please make sure to read my earlier post in detail. The primary and secondary VMDKs can be stored on local datastores, but this will restrict where the FT VMs can run.

Hi gs_khalsa,

Are you sure about that (I am speaking of the primary)? Because when I want to enable FT on a primary VM which is located on an ESXi host with local and shared storage, it says it can ONLY use the shared storage.

I was NOT able to create and run any FT 6.0 protected VM without having the primary on shared storage. Secondary VMDKs I could locate on local storage - that worked. But NEVER primary. So - please clarify - did I do something wrong?

Best regards,

Joerg

joergriether
Hot Shot

PaulWestNet wrote:

It's unfortunate that shared storage is still needed - even if it's so minimal. I wonder what happens if the shared storage croaks while FT is active?

I think we just found out in our test lab - the FT-protected VM will freeze when you kill the shared storage. That is sad, because there is still a single point of failure, even with FT 6.0: the shared storage.

Can anyone confirm what we saw?

Best regards,

Joerg

vbrowncoat
VMware Employee

The FT VM will likely freeze if the shared storage is lost. This is why it is important (as it is for all VMs) that your storage, and connectivity to it be reliable.

Only the primary vmx file, FT config file and tie-breaker file need to be on shared storage. This is to ensure that both hosts/VMs can access them. If they were allowed to be stored on local storage that would be the SPOF.

Keep in mind that if you store FT files on local storage, FT can't move those files. So if you have a 3 host cluster (A, B, C), primary running on A with VMDKs stored locally, secondary running on B with VMDKs stored locally, and either A or B goes down, FT will be unable to recreate a new secondary until the host (and its files) comes back online. Does that make sense?

joergriether
Hot Shot

Hi,

yes, makes sense, of course.

But you did not answer my initial question. In my test lab, vSphere did not allow me to place the primary VMDKs on local storage. But you are saying the primary's VMDKs can indeed be placed on local storage, and only the vmx, config and tiebreaker have to be on shared. I am saying what I found out is that all primary stuff HAS to be on shared storage. Only the secondary VMDKs can be placed on local storage. Can you confirm that?

Best regards,

Joerg

vbrowncoat
VMware Employee

To turn on FT with the primary VM's VMDKs on local storage, the primary VM's vmx file has to be on shared storage when FT is turned on. I've just tested and confirmed this in my lab.
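
The placement rule discussed in this thread (primary vmx, FT config file and tie-breaker file on shared storage, VMDKs allowed anywhere) can be written down as a small pre-flight check. This is only a sketch in Python: the role labels, datastore names and function are invented for illustration and do not correspond to any vSphere API.

```python
# File roles the thread says must sit on shared storage (labels are ours).
MUST_BE_SHARED = ("primary_vmx", "ft_config", "tiebreaker")


def ft_placement_problems(placement, shared_datastores):
    """Return a list of human-readable problems; an empty list means OK.

    `placement` maps a file role to the datastore name chosen for it.
    VMDK roles are deliberately not checked: per the thread they may
    live on local datastores.
    """
    problems = []
    for role in MUST_BE_SHARED:
        ds = placement.get(role)
        if ds is None:
            problems.append(f"{role}: no datastore chosen")
        elif ds not in shared_datastores:
            problems.append(f"{role}: {ds} is not shared storage")
    return problems


# Hypothetical layout: VMDKs local, vmx accidentally left on local disk.
placement = {
    "primary_vmx": "local-A",
    "ft_config": "shared-1",
    "tiebreaker": "shared-1",
    "primary_vmdk": "local-A",
    "secondary_vmdk": "local-B",
}
print(ft_placement_problems(placement, {"shared-1"}))
# ['primary_vmx: local-A is not shared storage']
```

Moving `primary_vmx` to `shared-1` in the dictionary makes the list come back empty, which matches the behavior described above: the VMDKs can stay local once the vmx is on shared storage.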

PaulWestNet
Enthusiast

joergriether wrote:

Hi,

yes, makes sense, of course.

But you did not answer my initial question. In my test lab, vSphere did not allow me to place the primary VMDKs on local storage. But you are saying the primary's VMDKs can indeed be placed on local storage, and only the vmx, config and tiebreaker have to be on shared. I am saying what I found out is that all primary stuff HAS to be on shared storage. Only the secondary VMDKs can be placed on local storage. Can you confirm that?

Best regards,

Joerg

I noticed the same thing. FT doesn't move the .vmx file from local storage to the shared storage it has been pointed to, so you must do it manually. Why it doesn't do it for you, who knows.

I haven't tested moving the .vmx file manually, since the need for shared storage negates any benefit of the new feature anyway, and adds new limitations to FT recovery if a host goes down. Maybe for performance, the new local storage stuff can be of benefit, but if you have a decent backbone to the storage, there's little to be gained by local storage anyway. I think most people were excited to have FT no longer need shared storage at all, since shared storage that has redundant switching and heads is one of the more expensive components of a vSphere roll out (there's no point to FT on shared storage that isn't redundant - you'd just be moving the single point of failure from the host to the storage). The vision of having two stand-alone hosts utilize FT on their own local storage was exciting. Sadly, it's not there yet. As far as I can see, the only useful new FT feature is the multi-CPU capabilities.

Bill_Oyler
Hot Shot

Hi guys,

I ran into the same "gotcha" with Fault Tolerance in 6.0.  I think the confusion is caused by the wording in the VMware white paper "What's New in the VMware vSphere 6.0 Platform" (http://www.vmware.com/files/pdf/vsphere/VMware-vSphere-Platform-Whats-New.pdf).  The section about Fault Tolerance says, "...also increases the options for storage by enabling the files of the primary and secondary virtual machines to be stored on shared as well as local storage."  This makes the reader think that shared storage is not required, when in fact it clearly is required (even if just for the tie-breaker and VMX files).  I suspect quite a few people will read that document and arrive at the same (erroneous) conclusion that they only need local storage to use FT.  This statement should be revised to say something like "......also increases the options for storage by enabling VMDK files of the primary and secondary virtual machines to be stored on local storage, while shared storage is still required for other essential FT-related files."

Also, I found another discrepancy in this white paper.  It says FT is "included with VMware vSphere Essentials Plus Kit and higher editions of vSphere".  However, the vSphere Availability guide says:

"FT and legacy FT are not supported in vSphere Essentials and vSphere Essentials Plus."

Also, the "Compare vSphere Editions" makes no reference to FT being included in Essentials Plus.

Does anyone know the correct answer?  Does Essentials Plus support any form of FT (legacy or new FT)?

Thanks,

Bill

Bill Oyler Systems Engineer

vbrowncoat
VMware Employee

Thank you for that feedback. I'll work to get the whitepaper updated.

The latest licensing information is as follows (you are correct that FT is not included in vSphere Essentials & Essentials Plus):

FT is licensed in the following products:

vSphere Standard & Enterprise - 2 vCPU VMs

vSphere Enterprise Plus - 4 vCPU VMs

vCloud Suites - 4 vCPU VMs

ROBO Std - 2 vCPUs

ROBO Adv - 4 vCPUs
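
For anyone scripting a sanity check before quoting a design, the list above fits in a small lookup table. A Python sketch follows; the edition strings and the function name are just labels chosen here, and the numbers are copied from this post as of vSphere 6.0, so verify against current licensing before relying on them.

```python
# Max vCPUs per FT-protected VM, by edition, as listed in this post.
FT_MAX_VCPUS = {
    "vSphere Standard": 2,
    "vSphere Enterprise": 2,
    "vSphere Enterprise Plus": 4,
    "vCloud Suite": 4,
    "ROBO Standard": 2,
    "ROBO Advanced": 4,
}


def ft_licensed(edition, vcpus):
    """True if the edition includes FT for a VM with this many vCPUs.

    Editions missing from the table (e.g. Essentials, Essentials Plus)
    are treated as not including FT at all.
    """
    limit = FT_MAX_VCPUS.get(edition, 0)
    return 1 <= vcpus <= limit


print(ft_licensed("vSphere Standard", 2))         # True
print(ft_licensed("vSphere Standard", 4))         # False
print(ft_licensed("vSphere Essentials Plus", 1))  # False
```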

Bill_Oyler
Hot Shot

Thanks!  I appreciate the updated info.

Bill Oyler Systems Engineer

mcelliers
Contributor

Hi, I am now sitting with a huge problem. Because I misread the Fault Tolerance options and thought I could use local storage, I have sold this solution to my client.

I gave them VMware Standard (4 CPU), 2 sockets for two servers.

I have also sold them 2 servers for the ESXi hosts, each with mirrored SSD drives (500GB) and also RAID 5 local storage (2TB).

I also have another server with vCenter on Windows 2012 Standard.

I only needed to run one VM in Fault Tolerance and really thought that I would be able to use local storage for this scenario.

My problem is that I need to get this to work or my client wants his money back.

The other problem is that if I introduce shared storage in the network, it defeats the purpose of using Fault Tolerance, because if the shared storage server fails, the whole Fault Tolerance setup is broken. So there is again a single point of failure.

PaulWestNet
Enthusiast

The quickest solution is to set up StarWind Virtual SAN in HA mode on 2 VMs, one on each host (make sure HA doesn't try to move the VMs onto the same host using policies). It's free, even in production mode. You will need to deploy two more Windows VMs to install StarWind Virtual SAN into, and you will need a NIC for replication. I assume you have at least one 10Gbps NIC on each host, since you're wanting to do FT (you DO have a 10Gbps NIC on each host, right? FT is completely worthless without it, even with only 1 VM using it. See other threads on the forum about people trying to use 1Gbps links for FT traffic and the speed issues it results in), so you could use that (FT won't suck up the entire pipe, and neither will the replication, so that should be ok).

In a failure scenario, if host 1 goes down and takes one StarWind Virtual SAN instance with it, and your FT master, that's ok - host 2 can still get to the data via the second StarWind Virtual SAN instance, and all is well. No single point of failure.

To get StarWind's Virtual SAN for 2 nodes on VMware, you have to email them. Once you get to their site, you'll be able to download a 1 node version, and there's reference to having to email them to get a 2 node license for free. I've done it, it's painless.

You only need to put the VM's basics on the HA storage StarWind provides, which will keep HA replication traffic to a minimum, thus not messing with FT's traffic (the need for shared storage in FT is only for a tiny file to keep things in sync, not the whole VM). Put the VM's VMDKs on your local storage and let FT deal with replicating it between the hosts as writes occur, rather than let the HA storage replicate that traffic (FT6 will replicate it either way, even if the VMDK is on shared storage that both hosts can see, from what I understand). That means your HA storage doesn't need to be very big. In fact, it can be exceedingly tiny, since very little is actually stored on it. That will reduce the load on the whole HA storage setup (the smaller it is, the less likely it'll have any problems, too).

My only final note is that RAID5 with 2TB drives is slow suicide. I assume this will be used for file shares, since it's slower storage. RAID5 on large drives is fine, until a drive fails. The rebuild process takes a fair bit of time with large drives, and if another drive is found bad during that rebuild, your array is toast. The larger the drives, the more likely you'll find a bad sector that wasn't used before (thus not detected previously) during a rebuild. Worse, when a drive fails, it's likely just at that age when all of them are near failure, since they're all wearing at the same rate. RAID6 would have been a better choice here.
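
To put a rough number on the rebuild risk, here is a back-of-the-envelope Python sketch. It assumes a 4-drive RAID5 of 2TB disks (the thread does not say how many drives are in the array) and an unrecoverable read error rate of 1 per 10^14 bits, which is a commonly quoted spec-sheet figure for desktop drives and not a measured value; real drives often do better, so treat the result as an illustration of how the risk scales with capacity.

```python
import math


def p_ure_during_rebuild(surviving_drives, drive_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a rebuild.

    A RAID5 rebuild must read every bit of every surviving drive, so the
    model is 1 - (1 - ure_per_bit) ** bits_read, computed with
    log1p/expm1 to stay numerically stable for tiny error rates.
    """
    bits_read = surviving_drives * drive_bytes * 8
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


TB = 10**12  # decimal terabyte, as drive vendors count

# Assumed array: 4 x 2TB in RAID5, one drive failed, 3 must be read in full.
p_2tb = p_ure_during_rebuild(3, 2 * TB)
p_4tb = p_ure_during_rebuild(3, 4 * TB)
print(f"2TB drives: {p_2tb:.2f}, 4TB drives: {p_4tb:.2f}")
```

Under those assumptions the 2TB case comes out at about 0.38, and doubling the drive size pushes it higher still, which is the effect described above. RAID6 survives one such error during a rebuild, which is why it is the safer choice here.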

All of this gets even worse if you've used desktop-class drives, rather than enterprise storage, since a failed drive will likely bork the array for 30-60 seconds while the desktop drive's firmware messes around trying to get the data - in the meantime, the controller takes the drive offline and things time out because the array is non-responsive for that time frame. Your host will stall during this delay, too, causing the VMs to freeze and possibly crash, if the delay is too long (Windows will only wait so long before it decides it lost its disk). Some controllers don't know what's happening and take all of the drives offline, leaving you in the dark as to which drive has an issue. Enterprise drives not only fail less, but when they do, they give up faster to avoid time outs and all the associated nastiness.

jlorelle
Contributor

mcelliers,

Did you get a satisfactory response to this? I was just getting ready to recommend a similar setup for a client when I learned you still need shared storage. Very frustrating.
