BjornJohansson
Enthusiast

How do you test SRM?

Hello all!

I would like to know how you all test DR plans with SRM. Of course, everyone runs their plans in test mode. But is that enough for your disaster recovery policy?

We are discussing a full failover of all Protection Groups. It sounds a bit dramatic, but we want to test the whole infrastructure at the DR site. Some business-critical systems will be very hard to get into the test bubble network. We want to make sure that we can actually work from our DR site if the poo hits the fan. We have an SLA of one hour for some systems.

How do you test your recovery plans? How do you manage failback if you are doing a planned failover?

Please post!

Cheers!

/Björn

3 Replies
jhiraldo
Contributor

Depending on the storage you have, you can create a volume or LUN with replicas of the VMs on which you want to test a real failover.

I do this every week for my company: I perform a test and a real failover. Make sure that the VMs on the LUN/volume don't share storage with other production VMs.

Create a recovery plan at your recovery site and execute it; the other plans can stay protected.

Smoggy
VMware Employee

The test bubble is basically a safety net, in my opinion. I guess its main use is to allow customers to quickly sanity check that:

- SRAs are working

- The recovery plan runs as expected

- VMs power on as expected

- There are no nasty surprises from any custom callout scripts you've embedded in your recovery plan.

Once things look good, I find most customers will then work with their network teams to create some isolated VLANs at the recovery site that can be mapped to the ESX clusters at that site as port groups (use obvious names so you can see which port groups are test and which are real!). The customer will then edit their existing recovery plans and change the test network entry (the one that aligns with the failover network) so that it points to the test VLAN. You need to work with the network teams to ensure your test VLAN is just that: test, isolated, and not able to route back to production.
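As a rough illustration of the naming advice above, a small script could flag recovery plan entries whose test network looks risky. This is only a sketch under assumptions: the `NetworkMapping` structure, the `TEST-` prefix and the port group names are all made up for the example and do not come from SRM or its API. It checks names only, so it cannot prove that a VLAN is really isolated; that still has to be confirmed with the network team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkMapping:
    """Hypothetical record of one recovery plan network entry."""
    protected: str  # port group at the protected site
    recovery: str   # port group used for a real failover
    test: str       # port group used when the plan runs in test mode


def find_unsafe_mappings(mappings, test_prefix="TEST-"):
    """Return a list of human-readable problems found in the mappings."""
    problems = []
    # Every port group that a real failover would use.
    recovery_names = {m.recovery for m in mappings}
    for m in mappings:
        # Naming convention check: test port groups should be obvious.
        if not m.test.startswith(test_prefix):
            problems.append(
                f"{m.protected}: test network '{m.test}' lacks prefix '{test_prefix}'"
            )
        # A test network must never double as a real failover network.
        if m.test in recovery_names:
            problems.append(
                f"{m.protected}: test network '{m.test}' is also a failover network"
            )
    return problems


# Assumed example data, not taken from any real environment.
mappings = [
    NetworkMapping("PROD-VLAN10", "DR-VLAN10", "TEST-VLAN910"),
    NetworkMapping("PROD-VLAN20", "DR-VLAN20", "DR-VLAN20"),
]

problems = find_unsafe_mappings(mappings)
for problem in problems:
    print(problem)
```

Here the second entry is reported twice, once for the missing prefix and once because its test network is the same port group a real failover would use.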

Here is an example from one of my lab setups:

BjornJohansson
Enthusiast

Thank you guys, I appreciate your input!

I plan to do a real failover for one LUN containing some test machines. The primary reason for this is to practice failback.

For testing recovery plans, I've been a supporter of using a VLAN to do some testing in "the bubble". Unfortunately, in our case most of the critical systems rely on third-party communication lines that cannot easily be redirected into the test VLAN. We don't want to create a new problem by testing.

/Björn
