VMware Cloud Community
fcapizzo
Enthusiast

HA Configuration for primary and failover sites

I am currently working with a client that would like to use HA in a specific manner. Let's say that we have 5 ESX 3 servers at the datacenter and another 5 running at a DR site, and the sites have a fast connection between each other. Now let's say I add all 10 ESX hosts to a cluster with HA enabled. Is there any way to make rules for HA that define a primary set of servers to use for failover? For example, if a host (or even 2 hosts) goes down at the primary datacenter, we wouldn't want the VMs to start up at the DR location.

I know there's the das.defaultfailoverhost advanced option for HA, but I don't know if you can configure it for a group of servers.

6 Replies
surferdave
Enthusiast

HA only works within a datacenter, and only if ALL ESX servers can see the shared VMFS LUN. Unless you are running iSCSI or a Fibre Channel fabric across the WAN, HA wouldn't work at the DR location.

Eddy
Commander

No shared storage, no HA...

Go Virtual!
fcapizzo
Enthusiast

Yes, I realize that shared storage is a necessity for HA. I should have clarified that in my example: let's also assume that shared SAN storage is available to the ESX servers at both sites.

fcapizzo
Enthusiast

Anyone have any ideas on this? Personally I don't think that the current VI3 configuration allows for what I'm asking, but I just wanted to verify with everyone here. Also, if anyone is trying to do this kind of thing between 2 datacenters and has their own solution I'd like to hear what you've done.

frankdenneman
Expert

I was asked the same question by a client.

It's not exactly what you want, but maybe it will give you a few ideas.

I "solved" this by creating two clusters, one in each datacenter.

Host failover occurs only inside a cluster. This way, when host b in cluster 1 fails, its VM load is picked up by host a or host c in the same cluster.

Each HA cluster has a "local" SAN at its disposal. The LUNs presented to one cluster are replicated to the SAN at the other datacenter, and vice versa. Cluster 1 primarily uses SAN 1; cluster 2 primarily uses SAN 2.

When a SAN fails, the customer can issue a failover command on the SAN, after which all ESX hosts from both clusters access the one remaining SAN. When a whole site/datacenter fails, a failover is issued and the other cluster can start up the VMs that ran on the failed cluster.

It's not fully automatic, and that was deliberate: a manual step was a requirement for my customer. You don't want to accidentally fail over a complete SAN and ESX cluster load.

I left a lot of details out; PM me if you want some more info.
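For anyone who wants the policy spelled out, here is a rough Python sketch of the decision logic only. It does not talk to VirtualCenter or any SAN; the `Cluster` class, the function names, and the round-robin placement are all made up for illustration and are not how HA itself picks a host. The point it shows is the split between automatic restarts inside one cluster and a cross-site failover that refuses to run without an explicit operator confirmation.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical model of one HA cluster (one per datacenter)."""
    name: str
    hosts: dict[str, bool]                              # host name -> is it up?
    vms: dict[str, str] = field(default_factory=dict)   # VM name -> host name


def _spread(vm_names, hosts):
    """Assign VMs to hosts round-robin (an illustrative placement rule only)."""
    return {vm: hosts[i % len(hosts)] for i, vm in enumerate(vm_names)}


def restart_after_host_failure(cluster: Cluster, failed_host: str) -> dict[str, str]:
    """Host failure: restart the affected VMs on survivors in the SAME cluster."""
    cluster.hosts[failed_host] = False
    survivors = sorted(h for h, up in cluster.hosts.items() if up)
    if not survivors:
        raise RuntimeError(f"no hosts left in {cluster.name}: treat as a site failure")
    orphaned = sorted(vm for vm, h in cluster.vms.items() if h == failed_host)
    placements = _spread(orphaned, survivors)
    cluster.vms.update(placements)
    return placements


def site_failover(failed: Cluster, standby: Cluster, operator_confirmed: bool) -> dict[str, str]:
    """Site failure: start the VMs in the other cluster, but only on manual confirmation."""
    if not operator_confirmed:
        # Assumes the operator has already failed over the replicated LUNs by hand.
        raise PermissionError("site failover requires explicit operator confirmation")
    survivors = sorted(h for h, up in standby.hosts.items() if up)
    if not survivors:
        raise RuntimeError(f"no hosts available in {standby.name}")
    placements = _spread(sorted(failed.vms), survivors)
    standby.vms.update(placements)
    failed.vms.clear()
    return placements
```

With cluster 1 holding hosts a, b and c, a failure of host b only ever returns a or c as restart targets; hosts in cluster 2 become candidates only through `site_failover(..., operator_confirmed=True)`.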

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
fcapizzo
Enthusiast

Frank_D, I've done something similar to your setup at a previous client's site. It's always going to be a manual process to fail over a replicated LUN, and we had also documented the steps needed to have the ESX servers at the other site recognize the failed-over LUN and subsequently the VMs running on those VMFS volumes.

At this point I'm mostly certain that the idea I had proposed in the first post is simply not doable with the way HA works right now. So for the time being, failover methods with manual intervention are required.
