VMware Cloud Community
MattG
Expert

How does RecoverPoint handle SRM VM test and real failovers?

I'm trying to understand how EMC's RecoverPoint handles SRM VM failovers. Does it treat test and real failovers differently? For example, does it use SAN snapshots for test failovers and promote a LUN for real failovers, since an outage could last for an extended period? Also, where do these snapshots and LUNs run from: the target VNX, or some hybrid of VNX + RecoverPoint?

I have not been able to find any article on this architecture.

Thanks,

-MattG

-MattG If you find this information useful, please award points for "correct" or "helpful".
TimOudin
Hot Shot

Test and real failovers are treated very differently, but for the differences to make sense you'll have to understand the different image access modes offered by RecoverPoint (apologies if you already do...). You don't mention the version, so I'm going to grab some references from the latest doc (that I'm aware of...), the RecoverPoint 4.0 Administrator's Guide, Rev A02 (August '13). Image access modes are covered on pages 60-62 of the admin guide; I'd highly recommend taking a look. Even if your version of RecoverPoint is not the same, the concepts of image access are. Your VNX is not involved in this in any way aside from being a really good splitter (no VNX snapshots, etc.), though some of the features require the VNX splitter...but that's a whole different conversation.

RecoverPoint (version 3.2 and later) uses Logged Access mode (page 60) when a test of a Recovery Plan is initiated. Basically, RecoverPoint exposes a snapshot (a RecoverPoint snapshot, not a VNX one), or image, to the host in a read-only state. The selected image is rolled to copy storage; see the section on "Roll image in background" on page 212 of the doc for a clearer explanation. Note that I said read-only state above, which is a key bit here. RecoverPoint redirects incoming writes to a portion of the journal (the "image access log", 20% of the total journal capacity by default) for each Consistency Group involved in the test. This implies that how long you can effectively test with SRM, or how many writes the test can perform, is limited by the journal capacity of each Consistency Group. If your cluster's default behavior is to store VMkernel swap files with the VMs, you'll start chewing into your available capacity pretty quickly.
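To put rough numbers on that limit, here is a back-of-the-envelope sketch in Python. It only encodes the 20% default mentioned above; the function names, the 500 GB journal, and the 5 MB/s write rate are made-up illustration values, and it assumes a constant write rate, which a real test will not have.

```python
def image_access_log_gb(journal_gb: float, log_fraction: float = 0.20) -> float:
    """Journal capacity set aside for writes made during image access, in GB.

    log_fraction defaults to the 20% figure quoted above; adjust it if your
    consistency group is configured differently.
    """
    if not 0 < log_fraction <= 1:
        raise ValueError("log_fraction must be in (0, 1]")
    return journal_gb * log_fraction


def hours_until_log_full(journal_gb: float, write_mb_per_s: float,
                         log_fraction: float = 0.20) -> float:
    """Rough upper bound on test duration at a constant write rate (assumption)."""
    if write_mb_per_s <= 0:
        raise ValueError("write_mb_per_s must be positive")
    log_mb = image_access_log_gb(journal_gb, log_fraction) * 1024
    return log_mb / write_mb_per_s / 3600


if __name__ == "__main__":
    # Hypothetical consistency group: 500 GB journal, 5 MB/s of writes in test.
    print(f"log size: {image_access_log_gb(500):.0f} GB")
    print(f"time to fill: {hours_until_log_full(500, 5):.1f} h")
```

Run it per Consistency Group, since each one has its own journal and therefore its own limit.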

For actual failover the process is very different, and much simpler. The default behavior of SRM is to have RecoverPoint select the latest snapshot, immediately roll it to disk, and then enable read-write access for the hosts. All other snapshots in the journal are discarded. I haven't yet worked with the multiple point-in-time recovery feature in SRM 5.5.
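If it helps to see the two paths side by side, here is a toy Python model of what I described. To be clear, this is not a RecoverPoint or SRM API; the class, method, and snapshot names are all invented for illustration, and the "test writes are thrown away when the test ends" step is my assumption about cleanup rather than something quoted from the admin guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaCopy:
    """Toy stand-in for one consistency group's copy (hypothetical model)."""
    snapshots: List[str]            # oldest first, held in the journal
    log_capacity_mb: int            # size of the image access log
    log_used_mb: int = 0
    access: str = "none"            # "none", "test", or "read-write"
    image: Optional[str] = None

    def start_test(self, snapshot: str) -> None:
        """Test failover: expose a chosen image; journal history is kept."""
        if snapshot not in self.snapshots:
            raise ValueError(f"unknown snapshot: {snapshot}")
        self.image, self.access = snapshot, "test"

    def write_during_test(self, size_mb: int) -> None:
        """Writes made during a test consume the image access log."""
        if self.access != "test":
            raise RuntimeError("no test in progress")
        if self.log_used_mb + size_mb > self.log_capacity_mb:
            raise RuntimeError("image access log is full")
        self.log_used_mb += size_mb

    def end_test(self) -> None:
        """Assumption: ending the test discards the test writes."""
        self.image, self.access, self.log_used_mb = None, "none", 0

    def failover(self) -> None:
        """Real failover: latest image goes read-write, older ones are dropped."""
        latest = self.snapshots[-1]
        self.snapshots = [latest]
        self.image, self.access = latest, "read-write"


if __name__ == "__main__":
    copy = ReplicaCopy(snapshots=["09:00", "09:05", "09:10"], log_capacity_mb=100)
    copy.start_test("09:05")
    copy.write_during_test(60)
    copy.end_test()
    print(copy.snapshots)   # all three points in time are still there
    copy.failover()
    print(copy.snapshots, copy.access)
```

The point of the model is the asymmetry: a test is bounded by the log and leaves the journal alone, while a failover gives up the older points in time in exchange for a writable copy.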

Does that help?

Tim Oudin
MattG
Expert

Tim,

Thanks for the in depth explanation!

Questions:

  • You mention that the VNX is not involved in any way. Are the VMs running from LUNs presented by the RecoverPoint appliance, or does RP present the replicas on the VNX? I thought that RecoverPoint doesn't actually store the data locally but on the local SAN?
  • Don't actual failovers use the VNX replica managed by RecoverPoint?

Thanks,

-MattG

TimOudin
Hot Shot

MattG wrote:

  • You mention that the VNX is not involved in any way. Are the VMs running from LUNs presented by the RecoverPoint appliance, or does RP present the replicas on the VNX? I thought that RecoverPoint doesn't actually store the data locally but on the local SAN?
  • Don't actual failovers use the VNX replica managed by RecoverPoint?

It sounded much clearer in my head while writing, so let me try that part again. When I say the VNX isn't involved, I really mean that the *features* of the VNX (i.e. snapshots, clones, etc.) are not used by RecoverPoint. I was just trying to differentiate RecoverPoint snapshots from the "SAN snapshots" in your comment, that's all. The images presented to the ESXi hosts during testing come from the snapshots stored on the RecoverPoint journal LUN and from the LUN used as the replication target. You are absolutely correct on the second bullet point.

Tim Oudin
MattG
Expert

Isn't the Journal backed by a SAN like the VNX?  So in actuality the test is running from VNX storage?


I am trying to clarify this because I want to make sure that I get the desired performance when testing failovers with 300 VMs.

Thanks,

-MattG

TimOudin
Hot Shot

Correct, Matt: all block storage is provided by your VNX, and all I/O will ultimately be serviced by either the journal LUN or the source/copy data LUN. Apologies for any confusion! On performance, take a look at the blog post below. The author does a great job of outlining the fairly unique I/O behavior of RecoverPoint and its journal LUN, which is definitely something to consider when designing VNX pools. I personally still stick with standard FLARE LUNs for journals, and never provision journals from a pool hosting other workloads unless I have to.

EMC RecoverPoint Journal and Replica Volume – Performance Considerations | Dave Ring

Tim Oudin