VMware Cloud Community
TomSCL
Contributor

Create a redundant system.

Hi,

We've got a server running VMware ESX. The server has a RAID card installed, and the datastore is on drives connected to that card. That's all well and good, except we have a single point of failure: if that machine dies or the RAID fails, it's game over. So I'm trying to work out how to create a redundant system.

The first thought is to have another system with a similar setup that somehow mirrors the current one, so that if the current one dies the other can (hopefully automatically) take its place. Is this a feature provided anywhere in the VMware suite (VMotion perhaps?), is there some other solution to achieve that, or is it just a bad idea?

The second idea seems somewhat better: get some sort of NAS storage involved, so that 2 servers are connected to a NAS device and if one server fails the other can take over. I assume this is perfectly possible? However, that would still leave us with 1 point of failure (the NAS), which I assume would be fixed by adding another "mirror" NAS device. That would leave us with a setup that keeps going even if any one server and any one NAS device failed. I assume such a setup must be possible? If so, how would we mirror the storage: would that be done by a VMware product, or would it be down to features in the NAS?
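
To make the single-point-of-failure reasoning concrete, here is a rough sketch in Python. It assumes component failures are independent and uses a made-up availability of 0.99 for each server and each NAS; the function names `series` and `parallel` are illustrative only and not part of any VMware product.

```python
def series(*availabilities):
    """Availability when ALL components must work (e.g. one server AND one NAS)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Availability when AT LEAST ONE component must work (a redundant pair)."""
    all_fail = 1.0
    for a in availabilities:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail


# Hypothetical figures: each server and each NAS is up 99% of the time.
SERVER = 0.99
NAS = 0.99

# One server plus one NAS: both must be up.
single = series(SERVER, NAS)

# Two servers and two mirrored NAS devices: one of each is enough.
redundant = series(parallel(SERVER, SERVER), parallel(NAS, NAS))

print(f"single server + single NAS: {single:.4f}")
print(f"two servers + two NAS:      {redundant:.6f}")
```

The numbers are only as good as the independence assumption: two boxes on the same power feed or the same switch can still fail together.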

I'm only just starting to come to terms with all this technology, so I'm sorry if I've got anything wrong or I'm asking silly questions, but any help would be really appreciated.

One final note is that cost is an issue so I can't really look at $20,000 solutions.

Many thanks,

Tom

10 Replies
rriva
Expert

Hi Tom,

IMHO the simplest and cheapest solution is to buy another server similar to the one you're using and use Vizioncore vReplicator to clone the VMs running on the first server to the second one.

In case of server failure, you only have to manually restart the VMs on the disaster-recovery server (the new one).

There are many other solutions, such as NAS mirroring, a mirrored iSCSI SAN, shared storage with DFS, or a DRS and HA cluster, but if you have a low budget, this is also a very fast setup.

Hope this helps

Bye

Riccardo Riva

VCP,RHCE,FCNSA

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful". Thank You!

RRiva | http://about.me/riccardoriva | http://www.riccardoriva.com
Erik_Zandboer
Expert

Hi,

Using shared storage is the way to go. If you want to be able to survive a NAS failure without too much trouble and cost, consider using two identical NAS devices and a software mirror within the guest operating system. You could do this for all VMs, or just the VMs that are very important to you.

I have tested this (with Windows Server 2003), and it works wonderfully well! Just unplug either of the two NAS devices: the VMs freeze for just a second, then resume with a popup saying that the mirror has been broken. Plug the NAS back in, resync the mirror disks, and you are set again for the next NAS failure.

You could also look at free stuff like XVS. In this setup you create a VM (appliance) on both ESX nodes, each with a large disk on local storage. Each appliance is an iSCSI node, with synchronous replication between them. Failure of an ESX host takes down one of the appliances; the other one notices and takes over the IP address of the failed appliance, so the storage continues to function. When the failed appliance comes back online, you can resync it, and replication (and failover) resumes. Not the best performer, but pretty resilient, and I think the cheapest you can get...
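
The failover behaviour described above can be sketched as a toy simulation in Python. This is not how XVS is implemented; the `Appliance` and `MirroredPair` classes, the block dictionary, and the service IP address are all hypothetical. The sketch only illustrates the idea of synchronous writes, takeover by the surviving node, and a later resync.

```python
class Appliance:
    """Hypothetical stand-in for one storage appliance VM."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}  # block number -> data


class MirroredPair:
    """Two appliances behind one service IP, mirrored synchronously."""

    def __init__(self, first, second, service_ip):
        self.nodes = [first, second]
        self.service_ip = service_ip  # address the ESX hosts talk to
        self.owner = first            # node currently answering on that address

    def failover(self):
        # The surviving node takes over the service IP.
        survivors = [n for n in self.nodes if n.online]
        if not survivors:
            raise RuntimeError("no storage appliance left online")
        self.owner = survivors[0]

    def write(self, block, data):
        if not self.owner.online:
            self.failover()
        # Synchronous replication: the write lands on every online node
        # before it is considered complete.
        for node in self.nodes:
            if node.online:
                node.blocks[block] = data

    def resync(self, node):
        # Bring a repaired node back and copy the current data to it.
        node.online = True
        node.blocks = dict(self.owner.blocks)


a, b = Appliance("appliance-a"), Appliance("appliance-b")
pair = MirroredPair(a, b, "10.0.0.50")  # made-up address

pair.write(0, "before failure")
a.online = False                 # the ESX host running appliance-a dies
pair.write(1, "after failure")   # storage keeps working via appliance-b
pair.resync(a)                   # appliance-a returns and catches up

print(pair.owner.name, a.blocks == b.blocks)
```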

Visit my blog at http://www.vmdamentals.com
TomSCL
Contributor

Ah, thank you very much, that seems very interesting. I'm looking at vReplicator now, but what I can't see is whether it maintains an up-to-date replica of the machine image or only replicates when requested. It would only be useful to us if it constantly syncs the replicated machine, so that it is up to date when it goes live.

Erik_Zandboer
Expert

Hi,

With vReplicator your replicas are always "behind" the production VMs. Even worse, every time you start replicating a VM, snapshots have to be taken and removed. Personally I am not a big fan of that, and it is not cheap either (vReplicator is licensed per VM).
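
How far "behind" a periodically replicated copy can be is simple arithmetic. The sketch below is a generic model of scheduled replication, not a description of vReplicator's internals; the function name and the hourly schedule are assumptions chosen for illustration.

```python
def worst_case_staleness_minutes(interval_minutes, pass_duration_minutes):
    """Upper bound on how far the replica can lag behind production.

    Assumes a replication pass starts every `interval_minutes`, takes
    `pass_duration_minutes` to finish, and the replica reflects the
    state of the VM at the moment the last completed pass started.
    """
    if interval_minutes <= 0 or pass_duration_minutes < 0:
        raise ValueError("interval must be positive and duration non-negative")
    # Worst case: the primary fails just before a pass finishes, so the
    # newest usable replica was captured one full interval plus one
    # pass duration ago.
    return interval_minutes + pass_duration_minutes


# Hypothetical schedule: replicate hourly, each pass takes 10 minutes.
print(worst_case_staleness_minutes(60, 10))
```

Under these assumptions, any data written in that window is lost when you start the replica, which is the trade-off against synchronous mirroring.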

The solutions using software mirrors or XVS make your environment resilient to failure without any form of data loss. I must admit it takes some courage to run your VMs from XVS. The software mirror is a lot less scary in my opinion, and you can actually decide which VMs should run on both sides and which VMs run single-sided. But remember, software mirrored VMs are the same as "shared storage", so forget about VMotion (if you have Enterprise ESX licenses). XVS does enable you to VMotion (if you have the required VMotion licenses and setup, of course).

Visit my blog at http://erikzandboer.wordpress.com

Visit my blog at http://www.vmdamentals.com
Texiwill
Leadership

Hello,

XVS or the HP LeftHand Networks VSA will do what you want. LeftHand is pricey but handles multiple volumes; XVS is free but handles only one volume.

For two nodes that is a good set of options.

Or you can use in-guest duplication, from guest to guest, with the guests running on different nodes. You do not need to limit yourself to what ESX can do; what the guest can do can also be used.


Best regards,
Edward L. Haletky
VMware Communities User Moderator
====
Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.
Blue Gears and SearchVMware Pro Blogs -- Top Virtualization Security Links -- Virtualization Security Round Table Podcast

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
azn2kew
Champion

As mentioned, you really need shared storage (iSCSI, NAS, FC) so that all VMs can be VMotioned across multiple hosts using the VMware ESX 3.5 features (HA, DRS, VMotion). You need at least two ESX hosts to make your VMs redundant this way. Once you have set up all VMs to run on shared storage, you can work on backup solutions such as Veeam, vRanger, esXpress, VISBU, VCB, etc. to back up your VMs regularly, depending on your policies.

If you need a DR site with storage/VMDK replication, then vReplicator or other products can do it for you. Depending on your storage vendor, that feature may already be built in. So your concerns can be addressed with the right solutions and budget, as described.

I like the Xtravirt VSA. It is aimed at SMBs with 2 ESX hosts, provides instant storage failover, and works well for a small shop like this. See www.xtravirt.com for details. Other cheap options are Openfiler, FreeNAS, Solarwinds iSCSI target, etc.

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant

TomSCL
Contributor

Thank you all for your replies. You've given me plenty to look into. I don't pretend to understand all the names and acronyms mentioned yet, but I'm sure I'll work it all out soon enough. Thanks again ;)

jsehlms
Contributor

Going with another identical host and using a SAN/NAS device would be ideal as you could also then do your upgrades/maintenance to one host during the day when everything is running in production.

I guess it comes down to what kind of disaster you are worried about. If it's the SAN controller going bad and losing a whole day of production (unfortunately I've been in that boat before), then getting a box with dual controllers will allow you to keep running while you replace the failed controller. If a drive goes, you have hot spares, or just spares on hand.

If the disaster you're worried about is an explosion/fire/etc in your datacenter, then your only option is a remote synch of all data.

How much downtime is too much in your organization? You can throw all kinds of money at it and make it a 24x7x365 operation if you want, but if you're able to tolerate downtime in the event of a physical location disaster, then maybe you don't need to throw that money into the remote synch.
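
One way to put a number on "how much downtime is too much" is to convert an uptime target into hours per year. This is a minimal sketch; the function name and the example targets are assumptions, not figures from this thread.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def allowed_downtime_hours(availability_percent):
    """Hours of downtime per year permitted by an uptime target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Hypothetical targets to compare against what the business can tolerate.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

If a day or two of downtime after a site disaster is acceptable, the target is well below the level where remote replication starts to pay for itself.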

Jason

Jason VCP# 29919
TomSCL
Contributor

I'm looking at the XVS stuff and it seems very, very interesting. Can I just check that I've understood it?

Am I right in thinking that you have 2 machines, each with its own local storage, and when you run XVS you treat the 2 local stores as 1 network storage device? Am I also right in thinking that they are synchronised, so that if one dies all the data is still there?

If all that is right then it sounds great. My second question: if a VM is running on one system in this setup and that system dies, am I right in presuming that I could quite easily and quickly get that VM running on the other ESX server without VMotion? VMotion looks great, but I don't think we can afford 2 £5,000 Infrastructure Enterprise licenses along with the new hardware right now (you can't buy VMotion on its own, can you?).

Sorry for the silly n00b questions and thanks again for all your help so far!

StevoIBM
Contributor

Hi Tom,

I would first look at a second-server solution that can be manually brought up if the primary server fails. Have the secondary server mirrored and ready to take over. It does not need to be an identical server, just something that can run the show until the primary server is repaired and restored.

Because of your budget, I don't think a redundant NAS solution with a second server would be your best bet. As a starting point for a DR model, a second mirrored server would be the most budget-conscious course of action.

Hope that helps, or at least gives you an idea of where to start.

Cheers,

Steven
