VMware Cloud Community
iforbes
Hot Shot

Failback without bringing back changes

Hi. I know the standard workflow of a failover/failback:

- Failover to Recovery Site

- Reprotect (reverse replication to original Protected Site)

- Failback (to original Protected Site)

- Reprotect (reverse replication to get environment to original state)

Now, during the first reprotect, SRM reverses replication. That means any changes that have taken place at the recovery site will be written to the original Protected Site storage. My question is: is there a way to perform a real failover (not a test failover) and failback without bringing back changed data? The customer will be performing a bunch of testing while failed over, BUT they don't want to bring back those changes for fear that the original production data could be compromised.
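
To make the concern concrete, here is a toy Python model of the four steps above. It is only a sketch under loud assumptions: `SitePair`, `recover`, `write`, `reprotect` and `run_workflow` are made-up names, not SRM or PowerCLI calls, and each site's storage is reduced to a list of labels. It shows nothing more than the direction in which data moves at each step.

```python
from dataclasses import dataclass, field


@dataclass
class SitePair:
    # Hypothetical toy model; none of these names come from SRM itself.
    running_at: str = "protected"   # site where the VMs are powered on
    source: str = "protected"       # site whose storage replicates outward
    data: dict = field(default_factory=lambda: {
        "protected": ["baseline"],
        "recovery": ["baseline"],
    })

    def _other(self, site):
        return "recovery" if site == "protected" else "protected"

    def recover(self):
        """Failover or failback: workloads move to the replication target."""
        self.running_at = self._other(self.source)

    def write(self, change):
        """Record a change made at whichever site is running the VMs."""
        self.data[self.running_at].append(change)

    def reprotect(self):
        """Reverse replication so the running site becomes the source."""
        self.source = self.running_at
        target = self._other(self.source)
        # The target's copy is replaced with the source's current data.
        self.data[target] = list(self.data[self.source])


def run_workflow():
    pair = SitePair()
    pair.recover()                 # 1. failover to the recovery site
    pair.write("dr-test-change")   # testing done while failed over
    pair.reprotect()               # 2. reprotect: recovery -> protected
    pair.recover()                 # 3. failback to the protected site
    pair.reprotect()               # 4. reprotect: protected -> recovery
    return pair
```

In this model the test change reaches the Protected Site copy at step 2, before the failback even starts, which is exactly the behaviour the customer wants to avoid.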

I'm not sure how this would be accomplished. Once again, test failover is not an option here.

Ultimately, we'd need to have SRM back to its original state (ready for a Protected to Recovery site failover) after the failback is complete.

thanks

Accepted Solutions
vbrowncoat
Expert

This isn't possible with SRM.

Just curious, why wouldn't running a test of the SRM recovery plan work in this case? That would let them test the workloads at the recovery site and not copy changes back to the protected site.

iforbes
Hot Shot

Thanks. They have a stretched L2, which is great as the VMs will retain their IPs. We entertained doing end-to-end application testing during a test failover, but there are significant network architecture issues that prevent us from doing that. Also, the plan is to get real results from an actual DR failover. I've read of people facing the same issue and performing unsupported steps. I'm not sure I want to do anything unsupported that involves a lot of manual steps. I'll just let the customer know they cannot do a failover without reversing replication for the failback.

vbrowncoat
Expert

Even with stretched L2 they could have an isolated test network at the recovery site to use for testing. Being able to use the test functionality in SRM is really helpful.

I completely agree with and commend your decision to not go down the unsupported route.

ThompsG
Virtuoso

In addition to this, if they just want to "really" confirm that SRM is working, then make a DR recovery plan with a subset of Protection Groups, fail it over and fail it back. We do this when we are implementing new arrays to confirm that everything is working as expected. While running a test of the recovery plan is good, there is nothing like ensuring the array will handle the reversing of replication, etc.

Kind regards.

iforbes
Hot Shot

I hear you. From an architecture perspective we aren't able to create the applicable test networks at the recovery site, and bubble networks will not provide the accurate end-to-end application testing that is required. Believe me, I've been through all of those scenarios :). We're just not able (with this particular customer setup) to do the type of robust application testing we require with a test failover. The business also wants to get a real feel for what an actual DR event will entail and look like. There are obviously a myriad of differences between a test and a recovery, including shutting down production VMs, running certain scripts, the impact of replication being broken, the ability to successfully mount the LUNs, etc. The test failover is handy, but it is not what is needed for what this customer is looking to achieve.

iforbes
Hot Shot

That's the issue I'm having though :). The customer wants a real failover and then a return to the pre-failover state WITHOUT reversing storage replication. They are fearful that reverse replication might bring back corruption.

vbrowncoat
Expert

What kind of corruption are they worried about?

Maybe they could do a combination of the two? Run a planned migration of a subset of VMs to verify that it works the way they expect, and run SRM recovery plan tests to verify application functionality. Regarding that, SRM, VR and some SRAs support running tests with the sites disconnected. If they can power off or isolate their production environment, that might allow them to run a test with the VMs connected to the production network.

iforbes
Hot Shot

They'll be running tests during failover. They're a little paranoid anytime someone mentions their Protected Site storage will be overwritten with changed data (rightfully so). In an actual event this wouldn't be a consideration, but we're just testing (although it is a real test involving real data). The planned migration still uses actual real VMs, with data they fear could be corrupted on failback. Of course we could create fake VMs, but that defeats the purpose of testing actual real workloads.

Appreciate the feedback, but even running a test with the VMs connected to the production networks defeats the purpose, which is to perform an actual failover followed by a failback. Ultimately, it's that specific workflow using real data that is desired. I've communicated back to the business exactly what a failback will do, so there is no confusion. A decision is being made on how they want to proceed. It may just be more stringent backup procedures before the failover. Thanks very much for the feedback.

Sasidhar1234
Enthusiast

What are the challenges of removing the VMs from replication after the DR test and re-registering them at the production site?
