TristramCheer
Contributor

Looking for advice on SAN selection for HA/DR

Hi All,

I'm currently in the planning stages of a multi-site HA cluster for work. So far I've got the servers and the networking sorted, but I need a little help selecting the right SAN. The idea is to use FT across two DCs within the city, linked by 10Gbit Ethernet circuits with backup DMR. When a server, a SAN array, or an entire site goes down, the load and disk access are automatically transferred to the DR site's gear. The array doesn't need to be huge: we're looking for a 2-4TB array that provides continuous availability/network RAID.

I've had a quote from Dell for a pair of EqualLogic PS4100X arrays, which was right in the sweet spot for us, but after reading up more on the devices I'm now under the impression that the arrays can't do this outside of scheduled "replicas". I'm awaiting an e-mail back from Dell about that. The integration into vCenter was really nice.

The HP P4300 arrays are nice and seem to do what we want, but I can't get any info from HP directly about some of the smaller bundles, like the 7.2TB one (BK716A), which is actually 2 x P4300 arrays with 3.6TB of disk in each. Does anyone know if you can take that bundle and split the two arrays across two different locations?
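For a rough sense of whether that bundle would still cover the 2-4TB target once the two nodes mirror each other, here is a back-of-the-envelope sketch. It assumes a two-way synchronous mirror across the nodes and a hypothetical local RAID efficiency of 0.75; neither figure comes from HP, and the function name is only for illustration.

```python
def usable_tb(raw_per_node_tb, nodes, mirror_copies, local_raid_efficiency):
    """Estimate usable capacity for nodes that mirror each other.

    local_raid_efficiency is the fraction of raw disk left after the
    RAID level inside each node (an assumption, not a vendor figure).
    """
    total_raw = raw_per_node_tb * nodes
    after_local_raid = total_raw * local_raid_efficiency
    # Every block is stored mirror_copies times across the nodes.
    return after_local_raid / mirror_copies


# Two 3.6TB nodes, each block kept on both nodes, 75% local RAID efficiency.
estimate = usable_tb(3.6, nodes=2, mirror_copies=2, local_raid_efficiency=0.75)
print(f"Estimated usable capacity: {estimate:.1f} TB")
```

Under those assumptions the estimate comes out at roughly 2.7TB, so the mirrored pair would sit inside the 2-4TB range but without much headroom.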

So does anyone have any suggestions on a 2-4tb hardware SAN that can do replication around $15-20k USD?

Cheers

5 Replies
JPK871
Enthusiast

The idea is interesting, but I'm not aware of any designs that have used FT (I'm assuming you're alluding to VMware Fault Tolerance) spanning sites.  This will be an interesting challenge, to say the least.  With today's technology, a spanned cluster is still difficult to get right, never mind implementing FT on top of it.  You have a number of issues to overcome when you start down this path, the biggest obviously being storage and networking.  From the storage perspective you'll need an active-active set of storage arrays to help alleviate potential split-brain issues. From what I've read, EMC VPLEX is supposed to be ready to do this, but I'm not sure of the price point... and again, this is just getting a stretched cluster to work.  Then you've got the network considerations to account for, i.e. latencies that have to be overcome, and you'll typically be looking for layer 2 adjacency in this type of design as well.

There are a number of good articles out there that may help you work through your concepts.  I'd start by really digging into the technical limitations that exist today around stretched clustering, then move on to working out whether FT will even work with this type of design.

TristramCheer
Contributor

Hi JPK871,

I am talking about VMware FT. The sites, while in different locations, are maybe not quite what people around here are expecting. Things here in NZ tend to be smaller when entering the "enterprise" space. The sites are 5km apart and the connection between the two is set for HT. Each site has 2 switches, with each switch holding a connection to a switch at the other site over either fibre or licensed microwave. The links have a lot of capacity, so for most purposes I consider the two sites one location, network design wise.
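To put a number on the distance, here is a quick sketch of the propagation delay over 5km of fibre. It assumes light travels at roughly 200,000 km/s in fibre (about two thirds of c) and that the cable run is the same as the straight-line distance, which it rarely is. Switch, microwave and array latency are not included, so treat the result as a lower bound only.

```python
# Approximate signal speed in optical fibre (an assumption, roughly 2/3 of c).
SPEED_IN_FIBRE_KM_PER_S = 200_000


def round_trip_us(distance_km):
    """Return the round-trip propagation delay in microseconds."""
    one_way_s = distance_km / SPEED_IN_FIBRE_KM_PER_S
    return 2 * one_way_s * 1_000_000


print(f"5km round trip: about {round_trip_us(5):.0f} microseconds")
```

That works out to about 50 microseconds per round trip, and every synchronously replicated write has to wait for at least one of those before it can be acknowledged.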

I've pretty much ruled out EqualLogic at this point. Most things point to no sync-rep between arrays, so I'm left with the HP P4300 G2 systems. The VPLEX is overkill, and I shudder to think of the price of even the most basic system given it's a full rack, so I'll likely have a chat with HP next week and see if I can get a demo. The NetApp FAS2020 or 2040 look like they can also do what I want, but pricing is scarce and they're likely to be out of our budget too.

Any advice anyone has is greatly appreciated and any known design considerations for a ~5km spanned cluster would also be great! 

Cheers!

JPK871
Enthusiast

I'll be interested to see what you come up with.  Just to make sure I have it straight: you are looking to have 1 HA cluster that will span two sites (although they are close in proximity and network latency).  Within these sites you will have physical ESX(i) hosts attached to independent storage arrays that use some type of synchronous replication.  Then you want to enable FT on some critical VMs, with the Primary FT VM residing in site #1 connected to SAN #1 and the Secondary FT VM residing in site #2 connected to SAN #2.

If this is the case, isn't there an issue with the way FT uses the same VMDK for both VMs?  I understand that there will be replicated datastores, but I don't see a way that vCenter would be able to leverage the replicated VMDK.

I think this is a really interesting concept and want to learn from what you come up with, but I just think there are technical limitations today.  Please let me know your thoughts and I'll gladly bounce ideas back and forth.

AndreTheGiant
Immortal


So does anyone have any suggestions on a 2-4tb hardware SAN that can do replication around $15-20k USD?

The new MD36xx series has this option.

But you can't have FT between a production LUN and a replicated LUN... it doesn't work. Both FT VMs need to use the same LUN.

Have you considered other solutions, such as VM replication?

Andre | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Josh26
Virtuoso

Honestly this isn't as easy to do as you seem to be hoping.

At a minimum, it's an enterprise-class solution that you're trying to implement on a small-business budget.

Site-based DR is really what "replication and SRM" is all about. FT really is aimed at "my server failed and I'd like to fail over to the server sitting on top of it".

Edit: The P4300 can do live mirroring. Again, this is based on the idea that the SAN components are next to each other. It's not even supported to have the two units on different subnets.

The "site mirroring" functionality involves an arbitrator to do a failover. This is not a "live" solution that will support FT.
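As a rough illustration of why an arbitrator is involved at all, here is a minimal sketch of a three-vote majority check (two sites plus a witness). The function name and the voting scheme are my own assumptions for illustration, not HP's actual failover logic.

```python
def may_serve_io(site_up, peer_reachable, witness_reachable):
    """Return True if this site holds a majority of the three votes.

    Votes: this site, the peer site, and a third-party witness.
    A site keeps serving I/O only if it can count at least two votes,
    so two isolated sites can never both stay active (no split brain).
    """
    if not site_up:
        return False
    votes = 1 + int(peer_reachable) + int(witness_reachable)
    return votes >= 2


# Inter-site link is cut; only site A can still reach the witness.
site_a = may_serve_io(site_up=True, peer_reachable=False, witness_reachable=True)
site_b = may_serve_io(site_up=True, peer_reachable=False, witness_reachable=False)
print("Site A serves I/O:", site_a)
print("Site B serves I/O:", site_b)
```

In that scenario only site A keeps its volumes online. The decision takes place after a failure has been detected, which is a different thing from the continuous lockstep that FT relies on.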
