VMware Cloud Community
ysfpatrick1
Contributor

SRM failback takes a very long time

I am using vSphere Replication to replicate a VM that has 100 GB of data.

In SRM, running the recovery plan took only about 4 minutes, and the VM ran at the DR site without any problem.

However, when I wanted to fail this VM back:
1. I clicked Reprotect, which took about 6 hours for the initial full sync from site B to site A.
2. Recovering the VM from site B back to site A then took 5 minutes.
3. Finally, the initial full sync from site A to site B took another 6 hours.

In total, it took about 12 hours to fail the VM back to its original site.

Is it normal for the initial sync to take this long? If I use storage array replication, will the initial sync take less time?

I still have some protected VMs with 800 GB data disks, and I think I would need a week to fail them back.
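
As a rough check of that estimate, the timings above can be extrapolated. This is only a sketch: it assumes sync time scales linearly with disk size and that both full syncs run at the same rate as the 100 GB case, which may not hold in practice. The function names are made up for illustration.

```python
# Rough extrapolation from the one data point in this post:
# a 100 GB VM took about 6 hours per full sync.
# Linear scaling with disk size is an assumption, not a measured fact.

OBSERVED_SIZE_GB = 100.0
OBSERVED_SYNC_HOURS = 6.0
SYNCS_PER_FAILBACK = 2  # reprotect B -> A, then reprotect A -> B


def effective_rate_gb_per_hour(size_gb: float, hours: float) -> float:
    """Observed throughput of one full sync, in GB per hour."""
    return size_gb / hours


def failback_sync_hours(size_gb: float, rate_gb_per_hour: float,
                        syncs: int = SYNCS_PER_FAILBACK) -> float:
    """Total hours spent in full syncs for one failback, assuming
    every sync runs at the same throughput."""
    return syncs * size_gb / rate_gb_per_hour


rate = effective_rate_gb_per_hour(OBSERVED_SIZE_GB, OBSERVED_SYNC_HOURS)
hours_800 = failback_sync_hours(800.0, rate)
print(f"Effective rate: {rate:.1f} GB/h")
print(f"Estimated sync time for 800 GB: {hours_800:.0f} h "
      f"({hours_800 / 24:.1f} days)")
```

Under those assumptions the two full syncs for an 800 GB VM come to about 96 hours, so roughly four days rather than a full week. The few minutes for the recovery step itself are ignored.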

4 Replies
mvalkanov
VMware Employee

Hi,

Improving the VR performance during SRM Reprotect is on the roadmap for a future release.

For current releases there have been optimizations in the checksum calculation during the initial full-sync phase - in 5.1 and additional ones in 5.5.

Which VR version and which ESXi version are you using? Some of the performance optimizations require a newer ESXi version.

Regards,

Martin

ysfpatrick1
Contributor

Hi Martin,

I am currently using ESXi 5.5.0 build 1746018, VR 5.5.1.0 build 1618023, and SRM 5.5.1.

Does that mean that if I use array-based replication, the full-sync phase will take less time?

Thanks

mvalkanov
VMware Employee

Hi,

About array-based replication - the time for the full-sync phase during reprotect will depend on the actual storage array.

For vSphere Replication, current releases are not optimized for large thin-provisioned disks. This has already been improved, but the improvement has not been released yet.

vNEX
Expert

Hi,

Have a look at the KB article below; it provides some information on how to calculate the required bandwidth for VR:

VMware KB: Calculating Bandwidth Requirements for vSphere Replication

Regarding your questions:

Is it normal for the initial sync to take this long? If I use storage array replication, will the initial sync take less time?

I still have some protected VMs with 800 GB data disks, and I think I would need a week to fail them back.

Keep in mind that VR works at the hypervisor layer: data is transferred over the host network via NFC, using the management network or another dedicated network. That is a much higher layer than the storage subsystem, which generally uses a Fibre Channel SAN for data transport and can provide synchronous replication, in contrast to VR's minimum RPO of 15 minutes. Although the VR engineering team is continuously improving replication speed, array-based replication is still the much quicker and more robust solution these days.

VR replication time depends on the network bandwidth dedicated to it, so have a look at the article above and monitor that link: how much of it is saturated by other traffic, what the latency between the sites is, whether any packets are being dropped, and so on.

The same applies to the source and destination storage at both sites during the reprotect and full-sync operations.
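
To get a feel for how link bandwidth translates into full-sync time, a back-of-the-envelope calculation like the one below can help. It is not the formula from the KB article, just plain size-over-bandwidth arithmetic; the function names and the `usable_fraction` parameter are my own, and it ignores checksum calculation, protocol overhead, latency and storage speed, so treat the result as a lower bound.

```python
def full_sync_hours(size_gb: float, link_mbps: float,
                    usable_fraction: float = 1.0) -> float:
    """Hours needed to move size_gb (decimal gigabytes) over a link of
    link_mbps megabits per second, of which usable_fraction is actually
    available to replication traffic."""
    if link_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("link_mbps must be > 0 and usable_fraction in (0, 1]")
    bits_to_send = size_gb * 8e9
    seconds = bits_to_send / (link_mbps * 1e6 * usable_fraction)
    return seconds / 3600


def required_mbps(size_gb: float, target_hours: float) -> float:
    """Sustained throughput in Mbit/s needed to move size_gb in target_hours."""
    if target_hours <= 0:
        raise ValueError("target_hours must be > 0")
    return size_gb * 8e9 / (target_hours * 3600) / 1e6


# 100 GB in 6 hours, as reported in the question:
print(f"{required_mbps(100, 6):.1f} Mbit/s sustained")
# 800 GB over a 100 Mbit/s link with half of it free for replication:
print(f"{full_sync_hours(800, 100, 0.5):.1f} hours")
```

The 100 GB in 6 hours reported above works out to roughly 37 Mbit/s sustained. If the link between the sites is much faster than that, the bottleneck is probably somewhere other than raw bandwidth.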

Check the VM events for any RPO violations (such as network connectivity issues, low bandwidth, or datastore access problems), or explore the host vmkernel.log and the VR appliance logs for any warnings or errors. See:

VMware KB: Collecting the VMware vSphere Replication logs
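
Once the logs are collected, a small script can pull out the suspicious lines so you do not have to read everything. This is a generic sketch: the keyword list is an assumption, real log formats differ between products and versions, and the function names are hypothetical.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Assumed keywords only; adjust them to what your logs actually contain.
PATTERN = re.compile(r"warning|error|failed|timeout", re.IGNORECASE)


def find_suspect_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for every line matching PATTERN."""
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            yield number, line.rstrip("\n")


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan one log file and return all matching lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return list(find_suspect_lines(handle))
```

For example, calling `scan_file` on a local copy of vmkernel.log returns the line numbers and text of every match, which you can then compare against the time window of the slow full sync.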

In addition, you can try using replication seeds to reduce the time and bandwidth needed for the initial full-sync phase:

Replicate Virtual Machines By Using Replication Seeds
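
The idea behind a seed is that a copy of the disk already exists at the target site, so only the blocks that differ have to cross the network. The sketch below illustrates that general idea with per-block SHA-256 digests. It is not VR's actual algorithm; the block size, the hash function and the function names are all assumptions chosen to keep the example small.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4  # bytes; unrealistically tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split data into fixed-size blocks and hash each one."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def blocks_to_transfer(source: bytes, seed: bytes,
                       block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indexes of source blocks that are missing from the seed
    or whose digest differs from the seed's block at the same index."""
    src = block_digests(source, block_size)
    dst = block_digests(seed, block_size)
    return [i for i, digest in enumerate(src)
            if i >= len(dst) or digest != dst[i]]


source_disk = b"AAAABBBBCCCCDDDD"
seed_disk = b"AAAABBBBXXXXDDDD"  # older copy: only the third block differs
changed = blocks_to_transfer(source_disk, seed_disk)
print(changed)  # prints [2]
```

Only one of the four blocks has to be sent here. Note that both sides still have to read and checksum every block, which is why the full-sync phase takes time even when little data is actually transferred.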
