VMware Cloud Community
victorkh
Enthusiast

VDP backup replication is too slow.

I have deployed a second VDP appliance (v6.1) at a DR site and want to use it as a replication target. Replication runs on the weekend, when no one else is using the 1 Gb link between the two sites. I asked our network engineer to monitor the traffic between the two appliances. He told me we have enough bandwidth, but the appliance is not pushing much data through the link; he mentioned a transfer rate of something like 1Mb/s. The replication job starts, then gets stuck at 92% on the source site and 98% on the destination site, and the task fails if it runs for more than 24 hours.

Has anyone had the same issue? Is there a limit set on the appliance, or a configuration file we can modify, to make it send more than 1Mb/s?

10 Replies
SG1234
Enthusiast

Is Network I/O Control enabled? Have you checked the graphs on the VM side?

victorkh
Enthusiast

Thank you, SG1234, for the response. Network I/O Control is enabled and the shares are set, but it only takes effect when there is network contention, which I don't think is the case, especially since we replicate on the weekend when there is no other network activity. The appliance is configured with 4 vCPUs and 8 GB of memory. Which graph on the VM side do you want me to check?

vdp4life
Enthusiast

So getting stuck at 92% is a common thing. The only real way to tell the progress is to use mccli.

My advice would be to just narrow down your replication to 1 VM and a handful of backups. Let that finish and add the rest.

Also, are you replicating over a VPN tunnel? I'm wondering why you are only seeing 1 Mbit.

RyanJMN
Enthusiast

VDP comes with iperf already installed.  I recommend running an iperf test between appliances to confirm you have the expected bandwidth.  Are you replicating during the maintenance window?  Have you run a disk performance test?
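
To illustrate what an iperf-style test measures, here is a minimal Python sketch that pushes data over a single TCP stream for a fixed time and reports megabits per second. It is not iperf and not part of VDP; the function names are hypothetical, and as written it only exercises loopback on one machine. A cross-site check would need the receiving half running on the other appliance.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call


def _sink(server_sock, result):
    """Accept one connection and count every byte received until EOF."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["bytes"] = total


def measure_throughput_mbps(host="127.0.0.1", seconds=1.0):
    """Send zeros over one TCP stream for `seconds`; return megabits/s.

    Sender and receiver both run in this process, so this measures
    loopback only (an assumption made to keep the sketch self-contained).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * CHUNK
    start = time.monotonic()
    with socket.create_connection((host, port)) as client:
        while time.monotonic() - start < seconds:
            client.sendall(payload)
    receiver.join()  # wait until the receiver has drained the stream
    elapsed = time.monotonic() - start
    server.close()

    return result["bytes"] * 8 / elapsed / 1_000_000


if __name__ == "__main__":
    print(f"{measure_throughput_mbps():.0f} Mb/s over loopback")
```

If a test like this (or iperf itself) shows the link can carry far more than the replication is using, the bottleneck is more likely in the appliance or its storage than in the network.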

vdp4life
Enthusiast

Were you able to figure it out?

A friend of mine had horrible performance, but that was due to a firewall IPS slowing down the traffic.

gbone8106
Contributor

I am having this issue too. We have a 10gb pipe between the two sites, but replications still take 10-20 hours, the same amount of time they took when the pipe was 1gb. Any ideas?

RyanJMN
Enthusiast

What storage is VDP running on? Have you run a VDP disk performance test? If so, did it pass?

gbone8106
Contributor

Each of our appliances is attached to a Data Domain (DD4200) at its site, with 10gb network cards connected to a 10gb switch and the 10gb network pipe.

RyanJMN
Enthusiast

I have an environment with a DD670 and 10 gig with Avamar that replicates around 40TB in 6 hours, with a lot of mtree replication happening at the same time. How much data are you replicating? I would start by confirming you're running DDOS 5.5 or newer and don't have any replication bandwidth throttles configured. DDOS 5.5 introduced synthetic-aware managed file replication, which will significantly speed up replication.

victorkh
Enthusiast

Still having the same issue. We are not replicating over a VPN; it is a point-to-point connection, and we don't use an IPS on it. The appliances are on SAN storage connected through Fibre Channel, and the storage performance test returns a pass result, so storage is not the issue. I opened a ticket with VMware support, who told me it is the blades' NIC driver version. That doesn't make sense, since VMs using the same uplinks send more data than I am trying to replicate, in less time. Upgrading the drivers and firmware on the NICs didn't help either. I'm not sure what is keeping the appliance from pushing enough data through the connection. We are not using Data Domain here. The total amount of data I am trying to replicate is less than a TB, or 1.5 TB at most, so I don't know why it takes 20 hours, or sometimes 24 hours, before the task is killed.

As for the maintenance window: of course replication will overlap with it at some point if it doesn't finish in 20 hours.
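
For a sense of scale, the figures quoted in this thread can be turned into expected transfer times. The sketch below is plain arithmetic, not a VDP tool; it assumes decimal units (1 TB = 10^12 bytes, 1 Mb = 10^6 bits) and ignores protocol overhead and deduplication, and the function names are hypothetical.

```python
def transfer_hours(size_tb, rate_mbps):
    """Hours needed to move size_tb terabytes at rate_mbps megabits/s.

    Assumes decimal units and no protocol overhead or deduplication.
    """
    bits = size_tb * 1e12 * 8
    return bits / (rate_mbps * 1e6) / 3600


def required_mbps(size_tb, hours):
    """Average megabits/s needed to move size_tb terabytes in `hours`."""
    return size_tb * 1e12 * 8 / (hours * 3600) / 1e6


if __name__ == "__main__":
    for rate in (1, 100, 1000):
        print(f"1.5 TB at {rate:>4} Mb/s: {transfer_hours(1.5, rate):8.1f} h")
    print(f"1.5 TB in 20 h needs {required_mbps(1.5, 20):.0f} Mb/s on average")
```

At the full 1 Gb/s line rate, 1.5 TB would take roughly 3.3 hours, while finishing it in 20 hours needs only about 167 Mb/s on average. At a sustained 1 Mb/s the same amount would take thousands of hours, so a job that fails at the 24-hour mark is consistent with the appliance sending far below what the link allows.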
