VMware Cloud Community
Le_Dude
Contributor

SRM and Asynchronous replication

Hello,

We are currently designing a new VI3.5 environment spread over two datacenters about 60 km apart. The production site holds about 13 TB that needs to be kept in sync with the DRP site.

Implementing full synchronous data replication would mean adding at least a 4 Gigabit FC line, which would increase the budget enormously.

We are looking at combining SRM with asynchronous replication. Since we have only worked with synchronous mirroring before, we were wondering how the VMs will behave in an async environment. Does anybody have a similar setup? How does data consistency hold up when a failover occurs?

2 Replies
bladeraptor
VMware Employee

Hi

I am writing this as an EMC employee.

We and several other vendors offer Asynchronous solutions for remote replication.

These range from FCIP and iSCSI asynchronous solutions on our CLARiiON / Symmetrix products to asynchronous replication of NFS and iSCSI via our Celerra IP storage platform.

The question around Asynchronous is not so much about the impact on the VMware Guest OSs but the impact on your Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO).

Asynchronous replication typically provides a recovery point objective (RPO) of a minute or more. For the Celerra and CLARiiON solutions we would typically suggest planning for the recovery site to be at least 10-15 minutes behind the production site. This reflects the use of snapshot technology to capture the changed blocks, which are then pushed across to the recovery site.

So you will need to understand to what point in your daily production cycle you need to be able to recover.
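To make the RPO reasoning concrete, here is a minimal sketch in Python. It assumes a simplified model in which each snapshot delta is fully transferred before the next cycle starts and protocol overhead is ignored; the function names and the example figures (5 GB changed per cycle, a 1000 Mbit/s link) are hypothetical and not taken from any EMC product. In that model the recovery site can lag production by up to one replication interval plus the time the delta spends in flight.

```python
def transfer_minutes(delta_gb: float, link_mbps: float) -> float:
    """Minutes needed to push one snapshot delta, ignoring protocol overhead."""
    bits = delta_gb * 1e9 * 8            # decimal gigabytes to bits
    return bits / (link_mbps * 1e6) / 60


def worst_case_rpo_minutes(interval_min: float, delta_gb: float,
                           link_mbps: float) -> float:
    """Replication interval plus the time the delta is in flight.

    If the production site is lost just before a transfer completes, the
    recovery site still holds the previous snapshot, which is this old.
    """
    return interval_min + transfer_minutes(delta_gb, link_mbps)


# Assumed figures for illustration only.
rpo = worst_case_rpo_minutes(interval_min=15, delta_gb=5, link_mbps=1000)
print(f"worst-case RPO: {rpo:.1f} minutes")
```

If the computed figure is larger than the RPO the business can accept, either the interval has to shrink or the link has to grow.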

In terms of VMware Guest OS recoverability, the chosen asynchronous solution should ideally offer write order consistency and, in VMware SRM terms, integrate with SRM through a VMware-approved Storage Replication Adapter (SRA).

EMC offers SRAs for all our replication products: asynchronous, synchronous and journaled.

In terms of application state, SRM assumes that the primary production site has been wiped off the face of the earth, so it will look to recover from the latest copy of the production data on the recovery site. This is where the RPO and write order consistency come in. These two elements influence your ability to recover the environment and restart the VMware Guest OSs, which will restart from a state that is at best virtual machine consistent and more likely crash consistent, that is, the state a physical machine recovers from after you pull its power supply.

Under the SRM framework this should be exactly the same scenario for a synchronous solution, aside from the difference in RPO.
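As a rough illustration of why write order consistency matters, the Python sketch below models a hypothetical application that writes a data block and then a commit record for each transaction. It is a toy model, not how any particular array implements replication. A replica that holds any in-order prefix of the write stream looks like a machine that lost power and can be recovered; a replica that holds an arbitrary subset of the writes may contain commit records whose data never arrived.

```python
from itertools import combinations

# Hypothetical write stream: each transaction writes its data block first,
# then a commit record that refers to it.
WRITES = [
    ("data", 1), ("commit", 1),
    ("data", 2), ("commit", 2),
    ("data", 3), ("commit", 3),
]


def is_recoverable(applied):
    """True if every commit record on the replica has its data block."""
    data = {txn for kind, txn in applied if kind == "data"}
    commits = {txn for kind, txn in applied if kind == "commit"}
    return commits <= data


# Write-order-consistent replica: always some prefix of the stream.
prefixes = [WRITES[:n] for n in range(len(WRITES) + 1)]
ordered_ok = all(is_recoverable(p) for p in prefixes)

# Replica without ordering guarantees: any three of the six writes.
unordered_bad = [list(s) for s in combinations(WRITES, 3)
                 if not is_recoverable(s)]

print("every in-order prefix recoverable:", ordered_ok)
print("unrecoverable out-of-order replicas:", len(unordered_bad))
```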

Regards

Alex Tanner

dergin
Contributor

Hi,

I am currently installing SRM across two sites more than 400 km apart in a production environment. The initial setup consists of 20 VMs with 2 TB of data, and we have a 100 meg link between the two sites. Replication is asynchronous and runs twice a day. Apart from the first replication between the two sites, which took a very long time, follow-up replications have had no impact on performance. As previously suggested, you should adjust the replication frequency according to your RPO.

The VMs are failing over in a crash-consistent state, and I have not seen any data integrity problems after a few tests.
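For a feel of the numbers, here is a back-of-the-envelope sketch in Python. It assumes the link is 100 Mbit/s, decimal units, and a link fully dedicated to replication; the helper names and the `efficiency` parameter are hypothetical, and real throughput will be lower once protocol overhead and other traffic are counted.

```python
def transfer_hours(data_tb: float, link_mbps: float,
                   efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb terabytes over the link."""
    bits = data_tb * 1e12 * 8
    return bits / (link_mbps * 1e6 * efficiency) / 3600


def max_delta_gb(window_hours: float, link_mbps: float,
                 efficiency: float = 1.0) -> float:
    """Largest change set (in GB) that fits in one replication window."""
    bits = link_mbps * 1e6 * efficiency * window_hours * 3600
    return bits / 8 / 1e9


initial_sync = transfer_hours(2, 100)       # full 2 TB seed copy
twice_daily_limit = max_delta_gb(12, 100)   # one 12-hour cycle

print(f"initial sync: about {initial_sync:.0f} hours")
print(f"max change per 12 h cycle: about {twice_daily_limit:.0f} GB")
```

Under these assumptions the first full copy takes close to two days, which matches the experience of the initial replication being slow, while each later cycle only has to carry the blocks that changed.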

Hope this helps.
