VMware Cloud Community
dekkar
Contributor

Only allowing replication at certain times

Hi all..... I've just set up replication between our two offices.

I kind of understand this RPO business, and have set our VMs to 24hr for now, as our DMS servers have problems if replication kicks off during business hours.

I have noticed the DMS servers are having these replication issues during the day. My question is: is there a way to specify a time or window for replication to run? It only takes around 30 mins to run, but it seems to 'do things' whenever it wants....

Or even specify a time when replication can't run?

Thanks,

Nathan

1 Solution

Accepted Solutions
Smoggy
VMware Employee

As has been said, as a rule the impact on the source VM when replication is configured is usually negligible. There are small host overheads to absorb during replication (CPU/memory and of course network), but again these will typically be negligible. At the extreme end of the scale, if you had a large number of high-change-rate VMs all performing full sync tasks at the same time on the same host, you would probably notice that.

We really need more details on why you think configuring replication is the cause of your issue. Is it coinciding with snapshot-based backups, for example? It would also be useful to know which vCenter/ESXi and vSphere Replication versions you have deployed.

9 Replies
vilinski
VMware Employee

Hi Nathan,

Can you elaborate on the replication issues the DMS servers are having? Unfortunately you have no control over when "things" will be done.

dekkar
Contributor

Hi guys, thanks for the reply.

The problems we have with the DMS is that its integration with our case management drops in and out when snapshots etc are made/merged while it is being heavily used.

For example, let's say 100 cases a day are created; the DMS should automatically create a 'workspace' for each case. When replication is enabled, half of these don't get created, so the user sits there waiting for it until they get fed up and ask IT why it wasn't created....

I do currently have Veeam and we were testing this as a replication solution. But the snapshotting (or more so the merging of the snapshot) was causing problems with the environment.

After a fair bit of research, it was discovered that the DMS (Interwoven Worksite) has documented problems with this, and it isn't something we can overcome. It was advised that block level replication was the only alternative.

I was under the impression that the vSphere Replication was block level, but maybe we need to look at block level at the SAN level........

I upgraded our environment to the latest 5.5 around a month ago.

Thanks,

Nathan

Smoggy
VMware Employee

Hi Nathan, it's not completely clear from your last reply, so forgive me for asking a pedantic question. You say you are testing Veeam and are seeing issues related to snapshots. A couple of points I wanted to check:

- When you see the issue with vSphere Replication, is Veeam still running and configured for that same DMS VM?

- When you configure replication, are you ticking the box to include VSS quiescing?

vSphere Replication (VR) does not use snapshots in the common sense when protecting VMs. While the VMs are in use, VR uses an internal vmkernel API in our hypervisor to track changed blocks in a bitmap, which is backed on disk in case of a host crash (you see these as .psf persistent state files). The contents of this bitmap are simply pointers to the blocks that have changed since the last transfer. A record in the bitmap does not contain the data itself, only a pointer, so this is not a snapshot.

At transfer time we simply use the block/pointer list from the bitmap to figure out which blocks we need to send. That set of blocks is referred to as a lightweight delta (LWD), but creating it is also not a snapshot, and there is no VM stun or suchlike. With the default replication options, the only time we need to create a redo log during a transfer is the rare case where a VM tries to modify one of the blocks in our list before we have had a chance to transfer it across the WAN. In that case we simply read the original data from the block first and copy it to the redo log, so that the contents of the whole transfer are write-order consistent.
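
To make the mechanism described above a bit more concrete, here is a minimal Python sketch of changed-block tracking with a copy-on-write redo log. It is a toy model only and not how the vmkernel actually implements it: the class `TrackedDisk` and all of its method names are invented for illustration, and plain Python sets and dicts stand in for the on-disk bitmap and the redo log.

```python
class TrackedDisk:
    """Toy disk that records which blocks changed since the last transfer."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.dirty = set()     # stands in for the bitmap: block numbers only, no data
        self.pending = set()   # blocks of the current transfer that are not sent yet
        self.redo_log = {}     # original contents of pending blocks overwritten mid-transfer

    def write(self, block_no, data):
        # If the block belongs to an in-flight transfer and has not been sent,
        # keep its original contents so the transfer stays point-in-time consistent.
        if block_no in self.pending and block_no not in self.redo_log:
            self.redo_log[block_no] = self.blocks[block_no]
        self.blocks[block_no] = data
        self.dirty.add(block_no)

    def begin_transfer(self):
        # Freeze the list of changed blocks (the "lightweight delta") and
        # start tracking new writes in a fresh, empty set.
        self.pending = self.dirty
        self.dirty = set()
        self.redo_log = {}
        return sorted(self.pending)

    def send_block(self, block_no):
        # Prefer the preserved original if the VM overwrote the block after
        # the transfer began; otherwise read the live block.
        data = self.redo_log.pop(block_no, self.blocks[block_no])
        self.pending.discard(block_no)
        return data


disk = TrackedDisk(8)
disk.write(2, b"A")
disk.write(5, b"B")
delta = disk.begin_transfer()            # list of changed block numbers
disk.write(5, b"C")                      # VM modifies block 5 before it is sent
sent = {n: disk.send_block(n) for n in delta}
```

In this toy run the transfer still ships the original contents of block 5, while the newer write is simply recorded as a changed block for the next cycle. No full snapshot of the disk is taken at any point.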

If you have elected to invoke VSS as part of your VR configuration for that VM, then a VM-level snapshot will also be triggered at transfer time, as VSS uses these to quiesce the VM. If the specific VM being replicated has an exceptionally high unique block change rate, the time it takes Microsoft VSS to quiesce the VM is not something we can control; that is a function of VSS. Without the VSS option, all VR transfers are crash consistent only. However, most operating systems and application stacks can easily recover from a crash-consistent disk state in conjunction with application and file system logging.

So a couple of quick checks would be:

- Make sure you are not testing VR and Veeam on the same VM at the same time.

- Check whether you ticked the VSS checkbox during VR configuration for that VM. If yes, run a further test with it unchecked as a comparison. If this changes the behaviour, it would be good for us to get a call raised so we can look at the logs and see if there is something odd going on with VSS and VR.

AndThen
Contributor

Congrats on the upgrade to 5.5. We're planning to do the same thing: upgrade from 4.1 to 5.5, then do SRM with SAN replication. Good to know about vSphere Replication. Our DMS is Worldox.

dekkar
Contributor

Hi Smoggy, thanks for the details..... Good to get a confirmation that the replication is happening at that low level.

I still do have a Veeam replication task that runs at 3am (to an ESXi server on the LAN with DAS). I was presuming that it wouldn't interfere, but I can disable this.

I did tick the VSS option, thinking that this was the norm. At the moment I have created the seed for replication. I am going to our remote site next week, where I will finish this process, so we can have some real world replication going across our 100Mb link.

At the moment, I have stopped vSphere replication from happening. Now waiting for the seeded replication to take place.

Once this is all in place, I will continue with your suggestions and update the thread.

In regards to VSS, I don't believe that the VM is doing too many transactions. Worksite has 3 VM servers, and the only one I am testing at the moment is the "document store" server. So there is no SQL or any processing happening on this server. I am not the Worksite administrator however, so I'm not a pro at it.

Another question: our hosts are 5 years old (due for upgrade) and our SAN is also at that level. Could performance constraints on the hardware be causing these problems (rather than transactions within the software)?

There is no contention with CPU or RAM at all, so I'm thinking not. The SAN is running around 25 VMs, including SQL and Exchange servers.

Thanks,

Nathan

dekkar
Contributor

Ran replication over the weekend..... Currently I have it paused during business hours, and will unpause it tonight to allow replication.

Being able to pause / un-pause on a schedule would be handy!
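
As a rough illustration of the scheduling logic such a feature would need, here is a small Python sketch that decides whether replication should be paused or resumed at a given time. It deliberately calls no VMware API: the pause and unpause themselves are still done by hand as described above. The function names and the 19:00 to 06:00 window are assumptions made up for this example.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` is inside [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 19:00 to 06:00.
    return now >= start or now < end


def desired_state(now, start=time(19, 0), end=time(6, 0)):
    """Return "resume" inside the allowed replication window, otherwise "pause"."""
    return "resume" if in_window(now, start, end) else "pause"


late_evening = desired_state(time(23, 0))
midday = desired_state(time(12, 0))
```

A scheduled task could evaluate `desired_state` at each run and act only when the answer changes, so replication stays out of business hours without anyone having to remember to toggle it.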

Once we are confident that this isn't causing problems, I'll leave the replication task unpaused and see how it goes.

dekkar
Contributor

Hi All.... so far so good...

Most nights the replication job takes around 2-10 mins, with a sync size of around 1.5-3 GB.....
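
As a sanity check on those figures, here is a back-of-the-envelope calculation in Python for the 100Mb link mentioned earlier in the thread. It assumes "100Mb" means 100 megabits per second and that GB means decimal gigabytes, and it ignores protocol overhead and any compression. The function name and the `efficiency` parameter are made up for this sketch.

```python
def transfer_minutes(size_gb, link_mbps, efficiency=1.0):
    """Estimate minutes needed to move size_gb gigabytes over a link_mbps link."""
    bits = size_gb * 8 * 1000**3                       # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1000**2 * efficiency)
    return seconds / 60


small_sync = transfer_minutes(1.5, 100)        # 1.5 GB at full line rate
large_sync = transfer_minutes(3, 100)          # 3 GB at full line rate
large_sync_half = transfer_minutes(3, 100, efficiency=0.5)
```

At full line rate this gives roughly 2 minutes for 1.5 GB and 4 minutes for 3 GB, and about 8 minutes for 3 GB if only half the link is usable, which is in line with the 2-10 minute range reported.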

Haven't had any problems since the original full sync, but I do try and keep the sync happening at night time just to be on the safe side.

I have had no complaints in regards to Worksite.... So it's looking like a success....! Next thing is to test the replica!

Thanks for the help with this.

blabarbera
Enthusiast

I do agree that it would be helpful to have a way to specify a replication schedule. We are running vSphere Replication and also using Veeam. I don't care for crash-consistent backups, so I almost always use quiesced snapshots.

The ad hoc replication schedule causes havoc if it happens to coincide with a Veeam job.
