Kevin-Graham
Enthusiast

Multiple storage on multiple sites

Hi all

I have two sites that a VMware cluster will reside across. I will have storage in both locations.

To keep ISL traffic to a minimum, I would like the VMs running on the hosts in Site A to stay there (DRS only amongst themselves) and use storage in Site A, and likewise the VMs running on the hosts in Site B to stay there and use storage in Site B.

I would still like HA to be allowed to move them, however.

Is there any way of doing this?

TIA

Kevin

1 Solution

Accepted Solutions
jhanekom
Virtuoso

I get what you're trying to do, and have been wondering about this myself.

The product itself does not allow you to group hosts within a cluster. The only "cheat" I can think of is to trick ESX into believing that the hosts aren't VMotion compatible.

Provided you have recent processors, how about enabling NX on the processors in site A and disabling it on the processors in site B? This way, DRS won't consider the servers in site B to be candidates when it needs to distribute load in site A. However, HA will still function, as it doesn't particularly care whether hosts have the NX bit set or not.
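
To make the reasoning concrete, here is a toy model of the idea in Python. It is only a sketch under stated assumptions: the `Host` class and both functions are hypothetical and do not call any VMware API. It assumes live migration requires a matching NX setting while an HA restart (a cold boot) does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    site: str
    nx_enabled: bool  # hypothetical flag: is the NX bit exposed on this host?
    alive: bool = True


def drs_candidates(current: Host, hosts: list[Host]) -> list[Host]:
    """Live-migration targets: other live hosts with the same NX setting."""
    return [
        h for h in hosts
        if h.alive and h != current and h.nx_enabled == current.nx_enabled
    ]


def ha_candidates(failed: Host, hosts: list[Host]) -> list[Host]:
    """Restart targets after a host failure: any other live host, NX ignored."""
    return [h for h in hosts if h.alive and h != failed]


hosts = [
    Host("a1", "A", nx_enabled=True),
    Host("a2", "A", nx_enabled=True),
    Host("b1", "B", nx_enabled=False),
    Host("b2", "B", nx_enabled=False),
]

print([h.name for h in drs_candidates(hosts[0], hosts)])  # ['a2']
print([h.name for h in ha_candidates(hosts[0], hosts)])   # ['a2', 'b1', 'b2']
```

In this model a VM on a1 is only ever balanced onto a2, but if a1 dies the VM may be restarted on any surviving host at either site.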

8 Replies
polysulfide
Expert

I don't see HA working well across sites. Without storage motion, you would need to be using iSCSI to do HA. iSCSI would allow you to run a VM at a different site, but I doubt performance would be great. Unless your VMs have a complicated multi-homed network configuration, you would need your two sites to be on the same subnet for HA to be seamless. I don't see that as being a reality either.

Here's what I would do:

Set up a separate cluster at each site.

Virtualize your Virtual Center Server so it can be replicated to your failover site.

Give each cluster its own local HA.

Develop a snapshot/replication strategy to get copies of your VMs to each site.

Document the steps it will take to make your backup site live. Your machines will end up with new IP addresses, etc., unless you have some sophisticated network failover.

HA between sites is doable but is beyond the scope of a forum.

Ask your vendor if they provide virtualization-based DR consultation. Be careful, everyone is trying to leverage into this market, and many of them have no experience. I can recommend Greenpages or MSII. Neither of them will be cheap, but they can help you get what you're after.

http://communities.vmware.com/blogs/polysulfide

VI From Concept to Implementation

Cloneranger
Hot Shot

The DRS stuff you asked about is simple enough.

The easy way to do it is to make your VMs on site A depend on a resource that only exists on the hosts at site A. DRS won't try to move them then.

Storage is the thing that normally decides this, i.e. if the VMFS volume your machines are stored on is only attached to hosts on site A, DRS cannot and therefore will not move machines to site B.

You can even do this with a vSwitch. Say all your VMs on site A have their NICs on 'Virtual Machine Network Site A'; the fact that this network does not exist on the hosts on site B will prevent the migration of machines there.

However, getting HA to work in this scenario is the tricky thing.

I believe that in order to get this going you would need to stretch your SAN over the two sites, so the LUNs would have to be available on both sites.

How is your storage set up?
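
A rough sketch of the resource check described in this post, in Python. Everything here is made up for illustration (the `can_run_on` function, the host names and the datastore and network labels); it just assumes a host is eligible only when it can see every datastore and network the VM uses.

```python
def can_run_on(vm_needs: set[str], host_resources: set[str]) -> bool:
    """A host is eligible only if it has every resource the VM depends on."""
    return vm_needs <= host_resources


# Hypothetical inventory: each host lists the datastores and networks it can see.
hosts = {
    "a1": {"vmfs-site-a", "Virtual Machine Network Site A"},
    "a2": {"vmfs-site-a", "Virtual Machine Network Site A"},
    "b1": {"vmfs-site-b", "Virtual Machine Network Site B"},
}
vm_needs = {"vmfs-site-a", "Virtual Machine Network Site A"}

eligible = sorted(n for n, res in hosts.items() if can_run_on(vm_needs, res))
print(eligible)  # ['a1', 'a2']

# Presenting the LUN and network to site B makes b1 eligible again. In this
# model the same check gates both DRS moves and HA restarts, which is why
# site-local resources block HA as well as DRS.
hosts["b1"] |= vm_needs
eligible_after = sorted(n for n, res in hosts.items() if can_run_on(vm_needs, res))
print(eligible_after)  # ['a1', 'a2', 'b1']
```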

Rodos
Expert

Kevin, interesting problem.

You talk about the storage at each site as if they are different. That is, some VMs point to storage in site A and others to storage in site B. If you are going to split the cluster across the two sites, you will need to run a metro-cluster (vendors call them different things), so that the storage is effectively the same at both sites.

Once you get the storage right, you are then trying to run a split DRS and a complete HA, with the two configured differently. As both are set at the cluster level, I can't see a way of doing that. HA could bring the systems up on any host that was alive. When the connection between the two sites is lost, is each one going to start the other's systems, giving you two running copies?
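
The split-site worry can be illustrated with a small Python simulation. This is a deliberately naive sketch: `restart_after_partition` is a hypothetical function, and it ignores any storage-level locking or HA isolation settings that might prevent a second copy from starting in practice.

```python
def restart_after_partition(placement: dict[str, str]) -> dict[str, list[str]]:
    """placement maps VM name -> the site it runs on.

    Returns VM name -> sites running a copy once the inter-site link drops
    and each site assumes the other one has failed.
    """
    sites = set(placement.values())
    running = {vm: [site] for vm, site in placement.items()}
    for observer in sorted(sites):
        for vm, home in placement.items():
            if home != observer:
                # The observer cannot reach the VM's home site, so it
                # restarts the VM locally while the original keeps running.
                running[vm].append(observer)
    return running


result = restart_after_partition({"vm1": "A", "vm2": "B"})
print(result)  # {'vm1': ['A', 'B'], 'vm2': ['B', 'A']}
```

Every VM ends up with a copy at both sites, which is the two-running-copies outcome raised above.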

Maybe what you are trying to do is run two clusters (each with its own VC), each of which provides DR for the other. The issue then is automating the activation or failover of one site to the other. There are technologies to do this, and VMware are bringing out SRM (Site Recovery Manager) for this very purpose.

Maybe provide some further details of your environment.

Consider awarding points if this is of use.

Rodos
Consider the use of the helpful or correct buttons to award points. Blog: http://rodos.haywood.org/
Rodos
Expert

HA between sites is doable but is beyond the scope of a forum.

I can't see why that would be beyond the scope of this forum. There are a lot of very experienced people here who have done a lot with DR and VMware. Yes, the devil is in the detail, but these are great things to be discussing. I would not rely only on the forum (a problem too common), but I would not discount it either.

Ask your vendor if they provide Virtualization-based DR consultation. Be careful, everyone is trying to leverage into this market. Many of them have no experience. I can recommend Greenpages or MSII. Neither of them will be cheap but they can help you get what you're after.

Correct, experienced consultancy is an effective way to implement these types of solutions (it keeps me in a living). Everyone is trying to leverage into lots of markets. As always, you need to choose well.

Kevin-Graham
Enthusiast

Hi all,

Thanks a million for the excellent replies.

At the moment jhanekom's little cheat idea of setting the NX bit looks like the option, but I should explain the setup better.

Site 1 - 3 x ESX servers (blades) and SAN storage

Site 2 - 3 x ESX servers (blades) and SAN storage

Site 1 & 2 will each have a VC server clustered using MSCS

Full fibre connectivity between both sites (Essentially 1 big storage area network)

When I talk about HA, I do not mean a full site failure; I mean, say, a blade or blade enclosure dying. There are other things in place to look after a storage failure.

The ideal solution would be 2 DRS clusters inside a HA cluster but obviously this is not available.

TIA

Kevin

Rodos
Expert

Great idea on using the processors, but it is a "cheat". You are going to have to be sure to always start the VM on the right server to get it onto the right set of servers. It's just all too messy. Not a key to reliability.

Rodos
Expert

So one VC running in a cluster across the two sites?

Are you going to run independent network links, or is one primary, with the second site getting its network access through the fiber link? The networking can get tricky too.

If for the HA failure you are not covering a site failure but rather a single blade, then you want the machine restarted on the same site. If it's an enclosure failure, it's really a full site failure, because you don't have anything left there to run them on.

As you say, you are trying to have two DRS clusters inside one HA cluster, which is not there.
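
For what it's worth, the behaviour being asked for (restart on the same site if anything there survives, otherwise fall over to the other site) is easy to state as a policy. The Python below is only a sketch of that wished-for policy, with a hypothetical `pick_restart_host` function and made-up host names. It is not something HA can be configured to do.

```python
from typing import Optional

Host = tuple[str, str, bool]  # (name, site, alive)


def pick_restart_host(vm_site: str, hosts: list[Host]) -> Optional[str]:
    """Prefer a surviving host at the VM's own site, else any surviving host."""
    survivors = [(name, site) for name, site, alive in hosts if alive]
    same_site = [name for name, site in survivors if site == vm_site]
    if same_site:
        return same_site[0]
    return survivors[0][0] if survivors else None


# A single blade fails at site A: the VM stays at site A.
blade_failure = [("a1", "A", False), ("a2", "A", True), ("b1", "B", True)]
print(pick_restart_host("A", blade_failure))  # a2

# The whole enclosure at site A fails: the VM has to go to site B.
enclosure_failure = [("a1", "A", False), ("a2", "A", False), ("b1", "B", True)]
print(pick_restart_host("A", enclosure_failure))  # b1
```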

I still think you should be looking into two active sites that are passive for each other, with two VCs.

Keep us posted, sounds like some fun.
