VMware Cloud Community
rbirri
Contributor

HA & SRM

Hello

Just a quick question about the value of SRM.

Let's say I have an HA-enabled cluster of 10 ESX hosts, 5 on site A and 5 on site B, with shared SAN storage.

Isn't HA enough for unplanned DR?

Why should I use SRM instead of HA (apart from the ability to test DR)?

Thanks


3 Replies
Jay_Judkowitz
Enthusiast

Hi,

The ability to do regular, non-disruptive tests and then audit the results over time is absolutely critical. This cannot be overstated.

You also get a much more complete recovery plan: what starts serially, what starts in parallel, when to pause and call someone, when to launch customized scripts if necessary, and so on. SRM also changes VM network settings if the same IP addresses are not in place at both sites.
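
To make the recovery-plan idea concrete, here is a minimal Python sketch of a plan as an ordered list of groups, where groups run one after another and the steps inside a group run together. It is purely illustrative: `Step`, `RecoveryPlan`, the VM names, and the script name are all made up and are not part of SRM or any VMware API.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    kind: str = "power_on"  # hypothetical kinds: "power_on", "pause", "script"


@dataclass
class RecoveryPlan:
    # Groups execute serially; steps inside one group execute in parallel.
    groups: list = field(default_factory=list)

    def add_group(self, *steps):
        self.groups.append(list(steps))
        return self  # allow chaining

    def describe(self):
        lines = []
        for number, group in enumerate(self.groups, start=1):
            mode = "parallel" if len(group) > 1 else "serial"
            names = ", ".join(f"{s.kind}:{s.name}" for s in group)
            lines.append(f"{number}. [{mode}] {names}")
        return lines


# Hypothetical plan: database first, a manual checkpoint, then the app tier, then a script.
plan = (
    RecoveryPlan()
    .add_group(Step("db01"))
    .add_group(Step("call DBA to confirm", kind="pause"))
    .add_group(Step("app01"), Step("app02"))
    .add_group(Step("update-dns.sh", kind="script"))
)

for line in plan.describe():
    print(line)
```

The point of the structure is only that ordering, pauses, and scripts are written down ahead of time, which is what a plain HA restart does not give you.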

SRM also gets you a QoS guarantee on recovered VMs. Since you will only have half the servers at the time of recovery, deciding how the VMs are balanced in new resource pools and which VMs are suspended to make way is critical for making sure the VMs are able to do real work upon recovery.

And lastly, but most importantly, the big difference is replicated vs. shared storage. If you are using storage that is shared between both sites but not replicated, you can handle single host failures very nicely, but when you lose a whole building (power, facilities, fire, etc.), you are dead in the water. SRM, with help from storage replication, makes both buildings completely self-sufficient.

One caveat here: if you happen to have active/active shared storage, so that the storage appears read-write at both sites, and if networking is stretched, this last point is moot for you. But I mention it since most storage vendors out there do not provide active/active replication, which makes for a very sharp distinction between SRM and HA.

Hope this helps.

-Jay

rbirri
Contributor

Thanks for your quick response.

Let's say I have an HA-enabled cluster of 10 ESX hosts, 5 on site A and 5 on site B, with shared SAN storage. The maximum number of host failures is set to 2.

If up to 2 of the 10 ESX hosts go down, HA is triggered to restart their VMs on the other ESX hosts.

If more than 2 hosts are down, HA restarts the VMs with the higher restart priority first.
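
The arithmetic behind that concern can be sketched in a few lines of Python. This is a simplified model of the behavior described above, not how HA admission control is actually implemented; `ha_outcome` and its return strings are hypothetical.

```python
def ha_outcome(total_hosts: int, tolerated_failures: int, failed_hosts: int) -> str:
    """Classify a failure against the configured host-failure tolerance (simplified model)."""
    if failed_hosts >= total_hosts:
        return "no hosts left"
    if failed_hosts <= tolerated_failures:
        return "all VMs restart"
    # More hosts failed than the cluster reserved capacity for.
    return "restart by priority"


# 10 hosts, 5 per site, tolerance of 2 host failures (numbers from this post).
single_host_crash = ha_outcome(10, 2, 1)
whole_site_lost = ha_outcome(10, 2, 5)
print(single_host_crash, "|", whole_site_lost)
```

Losing a whole site takes out 5 hosts, which is more than the tolerance of 2, so the cluster is already past its reserved failover capacity.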

Fact: SRM and HA are two different processes.

So, if site A burns, HA will start the higher-priority VMs on site B and SRM will trigger the DR plan at the same time.

That is bad, I think: it must be HA or SRM, but not both at once.

In that case, having a single cluster spread across 2 physical sites is flawed by design, right?

I should use 2 clusters, site A and site B, with HA to restart VMs after a local ESX crash within a cluster, and SRM for a building explosion.

Jay_Judkowitz
Enthusiast

That situation cannot arise: SRM requires each of the two sites to be managed by its own VC (VirtualCenter server), and an HA cluster does not stretch across two VCs.

Quick question for you: what storage are you using? I'm curious how you're getting active/active storage, so that if the building with the shared storage burns, HA can work immediately off another copy.
